Bridging CNN and Unmatched Visual Tasks
Yan Shuicheng
Associate Professor and Ph.D. Advisor, National University of Singapore
Abstract: In this talk, we introduce two recent works that apply deep learning to tasks for which deep learning is not a natural fit.
1. In the first work, we propose a flexible deep CNN infrastructure called Hypotheses-CNN-Pooling (HCP). An arbitrary number of object segment hypotheses are taken as inputs, a shared CNN is connected to each hypothesis, and the CNN outputs from the different hypotheses are aggregated with max pooling to produce the final multi-label predictions. Experimental results on the Pascal VOC2007 and VOC2012 multi-label image datasets demonstrate the superiority of the proposed HCP infrastructure over other state-of-the-art methods. In particular, on the VOC2012 dataset the mAP reaches 84.2% with HCP alone and 90.3% after fusion with our complementary result in [47], which is based on hand-crafted features; this outperforms the state of the art by a margin of more than 7%.
2. In the second work, the human parsing task, namely decomposing a human image into semantic fashion/body regions, is formulated as an Active Template Regression (ATR) problem. The normalized mask of each item is expressed as a linear combination of learned mask templates and then morphed into a more precise mask using the active shape parameters, which include the position, scale and visibility of each semantic region. More specifically, the structure outputs are predicted by two separate networks. For a new image, the structure outputs of the two networks are fused to generate the probability of each semantic label for each pixel, and super-pixel smoothing is finally used to refine the human parsing result. Comprehensive evaluations on a new large dataset demonstrate the significant superiority of the ATR framework over other state-of-the-art methods for human parsing.
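The aggregation step of HCP described in item 1 can be illustrated with a minimal sketch. It assumes each hypothesis has already been scored by the shared network; the names `hcp_predict` and `toy_cnn` are hypothetical, and `toy_cnn` is only a stand-in for a real CNN. It is not the authors' implementation.

```python
from typing import Callable, List, Sequence


def hcp_predict(
    hypotheses: Sequence[Sequence[float]],
    shared_cnn: Callable[[Sequence[float]], List[float]],
) -> List[float]:
    """Score every hypothesis with the same model, then max-pool per label."""
    if not hypotheses:
        raise ValueError("at least one hypothesis is required")
    # The same (shared) model is applied to each hypothesis.
    per_hypothesis = [shared_cnn(h) for h in hypotheses]
    num_labels = len(per_hypothesis[0])
    # Cross-hypothesis max pooling: one image-level score per label.
    return [max(scores[k] for scores in per_hypothesis) for k in range(num_labels)]


def toy_cnn(hypothesis: Sequence[float]) -> List[float]:
    """Hypothetical stand-in: treats the hypothesis as precomputed label scores."""
    return list(hypothesis)


# Three hypotheses, three labels; values are made up for illustration.
hypotheses = [
    [0.9, 0.1, 0.0],
    [0.2, 0.8, 0.1],
    [0.1, 0.2, 0.05],
]
image_scores = hcp_predict(hypotheses, toy_cnn)
print(image_scores)  # [0.9, 0.8, 0.1]
```

Because the pooling takes a maximum over hypotheses, the number of hypotheses can vary per image without changing the size of the output.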
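The mask construction in item 2 can be sketched in the same spirit. The example below combines mask templates linearly and then places the result on a canvas using a position, an integer scale and a visibility flag. The function names, the tiny 2x2 templates and the nearest-neighbour scaling are assumptions made for illustration; the abstract does not specify how the morphing is carried out.

```python
from typing import List, Sequence

Mask = List[List[float]]


def combine_templates(templates: Sequence[Mask], coeffs: Sequence[float]) -> Mask:
    """Normalized mask as a linear combination of same-sized templates."""
    rows, cols = len(templates[0]), len(templates[0][0])
    return [
        [sum(c * t[r][k] for c, t in zip(coeffs, templates)) for k in range(cols)]
        for r in range(rows)
    ]


def place_mask(mask: Mask, height: int, width: int,
               top: int, left: int, scale: int, visible: bool) -> Mask:
    """Paste the mask onto a zero canvas (assumed nearest-neighbour upscaling)."""
    canvas = [[0.0] * width for _ in range(height)]
    if not visible:
        return canvas  # an invisible region contributes nothing
    for r in range(len(mask) * scale):
        for c in range(len(mask[0]) * scale):
            y, x = top + r, left + c
            if 0 <= y < height and 0 <= x < width:
                canvas[y][x] = mask[r // scale][c // scale]
    return canvas


# Two made-up 2x2 templates and coefficients.
templates = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]
mask = combine_templates(templates, [0.5, 1.0])
canvas = place_mask(mask, height=4, width=4, top=0, left=0, scale=2, visible=True)
hidden = place_mask(mask, height=4, width=4, top=0, left=0, scale=2, visible=False)
```

Here `mask` is `[[0.5, 0.0], [0.0, 1.0]]`, and `canvas` holds the same pattern enlarged to 4x4.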
Bio: Dr. Yan Shuicheng is currently an Associate Professor at the Department of Electrical and Computer Engineering at National University of Singapore, and the founding lead of the Learning and Vision Research Group (http://www.lv-nus.org). Dr. Yan's research areas include machine learning, computer vision and multimedia, and he has authored/co-authored hundreds of technical papers over a wide range of research topics, with more than 13,000 Google Scholar citations and an H-index of 50. He has been serving as an associate editor of IEEE TKDE, TCSVT and ACM Transactions on Intelligent Systems and Technology (ACM TIST). He received the Best Paper Awards from ACM MM'13 (Best Paper and Best Student Paper), ACM MM'12 (Best Demo), PCM'11, ACM MM'10, ICME'10 and ICIMCS'09, the runner-up prize of ILSVRC'13, the winner prize of the ILSVRC'14 detection task, the winner prizes of the classification task in PASCAL VOC 2010-2012, the winner prize of the segmentation task in PASCAL VOC 2012, the honorable mention prize of the detection task in PASCAL VOC'10, the 2010 TCSVT Best Associate Editor (BAE) Award, the 2010 Young Faculty Research Award, the 2011 Singapore Young Scientist Award, and the 2012 NUS Young Researcher Award.
Learning with Parallel Vector Field
He Xiaofei
Professor and Ph.D. Advisor, Zhejiang University; recipient of the National Science Fund for Distinguished Young Scholars
Abstract: In this talk, I will introduce our recent work on manifold learning from the perspective of vector fields. Unlike graph-based techniques, which try to preserve distances, our approach tries to find a constant vector field on the manifold and then reconstructs the embedding function from the obtained vector field. When we restrict the vector field to be a gradient field, our approach is equivalent to finding a Killing vector field on the manifold. Our analysis of Killing fields on Euclidean space shows that, when the manifold is locally isometric to a connected subset of Euclidean space, we can always recover the manifold isometrically. I will also present some experimental results on both synthetic and real data sets.
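The second half of that pipeline, reconstructing a function from a given vector field, can be sketched on a flat patch. This sketch assumes the field is already known at each sample point and integrates it along neighbour links starting from point 0. The function name, the neighbour structure and the breadth-first integration are illustrative assumptions, not the method presented in the talk, which obtains the field itself by learning.

```python
from collections import deque
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def reconstruct_embedding(points: Sequence[Point],
                          field: Sequence[Point],
                          neighbors: Dict[int, List[int]]) -> List[float]:
    """Integrate the field outward from point 0 so that f changes by
    (average field on the edge) . (x_j - x_i) along each neighbour link.
    Assumes the neighbour graph is connected."""
    f: Dict[int, float] = {0: 0.0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in neighbors[i]:
            if j in f:
                continue
            step = [b - a for a, b in zip(points[i], points[j])]
            avg = [0.5 * (u + w) for u, w in zip(field[i], field[j])]
            f[j] = f[i] + sum(p * q for p, q in zip(avg, step))
            queue.append(j)
    return [f[i] for i in range(len(points))]


# Four points in the plane with a constant field pointing along the x-axis.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
field = [(1.0, 0.0)] * 4
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
coords = reconstruct_embedding(points, field, neighbors)
print(coords)  # [0.0, 1.0, 1.0, 2.0]
```

With a constant field on a flat patch, the recovered function is simply the coordinate along the field direction, so differences in `coords` equal displacements along that direction.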
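Bio: Dr. He Xiaofei is a Professor and Ph.D. advisor at Zhejiang University, a recipient of the National Science Fund for Distinguished Young Scholars, and an IEEE Senior Member. He received his bachelor's degree in computer science from Zhejiang University in 2000 and his Ph.D. in computer science from the University of Chicago in 2005, after which he joined Yahoo! as a research scientist. In 2007 he joined Zhejiang University as a Professor through a talent recruitment program. He won the top prize in the 1999 international collegiate Mathematical Contest in Modeling. In recent years his research has focused on artificial intelligence, Internet data mining and computer vision. His papers have been cited by others more than 8,000 times, and two of his representative papers have each been cited by others more than a thousand times. He serves or has served on the editorial boards of seven international SCI-indexed journals, including IEEE TKDE, IEEE TCYB and CVIU, and has served nearly 30 times as general chair, vice chair or program committee member of international conferences. He received the Best Paper Award at AAAI 2012, a top international conference on artificial intelligence, and a Best Paper nomination at ACM Multimedia 2010, a top international conference in the multimedia field.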