GAMES Webinar 2022 – Session 244 (Visual Generation) | Bo Zhang (Microsoft Research Asia), Yaohui Wang (Shanghai AI Laboratory)

[GAMES Webinar 2022, Session 244] (Vision Track – Visual Generation)



Talk title: Towards High-fidelity Generative Modeling: From 2D Image Generation to 3D Character Rendering


Deep neural networks can not only help us understand the visual world but can also be creative. We have seen rapid advances in deep generative networks, which can now produce astoundingly realistic output, and we are reaching a time when deep generative modeling is undergoing radical paradigm shifts. In this talk, I will showcase recent research trends by presenting a series of works from our group. I will first show how an architectural change, adopting a transformer backbone, yields a state-of-the-art generative adversarial network (GAN). Then I will introduce the use of the emerging diffusion models for a variety of image synthesis tasks. Finally, I will briefly cover recent progress on neural character rendering at Microsoft, especially 3D avatar generation, and how these techniques improve the immersive experience in various scenarios.


Bo Zhang is currently a senior researcher at Microsoft Research Asia (MSRA). Prior to that, he received his Ph.D. degree from the Department of Electronic and Computer Engineering at the Hong Kong University of Science and Technology (HKUST) in 2019. His research interests include deep generative models, computational photography, face modeling, and photo-realistic avatar generation. For more information, please visit his webpage at:




Talk title: Learning to Animate Images via Latent Space Navigation


Thanks to the remarkable progress of deep generative models, animating images has become increasingly efficient, and the associated results increasingly realistic. Current animation approaches commonly exploit structure representations extracted from driving videos. In this talk, I will introduce a novel structure-free approach, the Latent Image Animator (LIA), a self-supervised autoencoder that avoids the need for structure representations. LIA is streamlined to animate images by linear navigation in the latent space: motion in the generated video is constructed by linearly displacing codes in the latent space. To this end, we simultaneously learn a set of orthogonal motion directions and use their linear combination to represent any displacement in the latent space.
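The core idea of the abstract, representing motion as a linear combination of learned orthogonal directions added to a source latent code, can be sketched in a few lines of NumPy. This is a minimal illustration, not LIA's actual implementation: the dimensions, the `animate_code` helper, and the use of a QR decomposition to stand in for the learned orthogonal motion dictionary are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, num_directions = 512, 20

# Hypothetical stand-in for LIA's learned motion dictionary: a set of
# mutually orthogonal directions in the latent space. In the real model
# these are learned; here we fabricate them via a QR decomposition.
directions, _ = np.linalg.qr(rng.standard_normal((latent_dim, num_directions)))

def animate_code(z_source, magnitudes):
    """Displace a source latent code by a linear combination of the
    orthogonal motion directions (one scalar magnitude per direction)."""
    displacement = directions @ magnitudes   # shape: (latent_dim,)
    return z_source + displacement           # navigated latent code

z_src = rng.standard_normal(latent_dim)          # source image's latent code
a = 0.1 * rng.standard_normal(num_directions)    # per-direction magnitudes
z_driven = animate_code(z_src, a)                # code for one animated frame
```

In the actual method the magnitudes vary over time (driven by the driving video), so a sequence of such displaced codes, decoded frame by frame, yields the animated video.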


Yaohui Wang is a Research Scientist at Shanghai AI Laboratory. He received his Ph.D. degree from Inria Sophia Antipolis in 2021, advised by Dr. François Brémond and Dr. Antitza Dantcheva. His research focuses on deep generative models for video synthesis and image animation. He has published papers at top computer vision and machine learning venues such as CVPR, ECCV, and ICLR, and was nominated for the 2022 Inria-UCA PhD Thesis Award.



Ran Yi, Ph.D., is an Assistant Professor in the Department of Computer Science and Engineering at Shanghai Jiao Tong University. She received her B.Eng. degree from Tsinghua University in 2016 and her Ph.D. degree from Tsinghua University in 2021, and was selected for the Yangfan Special Program of the Shanghai "Science and Technology Innovation Action Plan" Rising-Star Program (2022). Her research covers computer graphics, computer vision, and computational geometry. In the past five years she has published more than 30 papers in international journals and conferences including IEEE TPAMI, ACM TOG, SIGGRAPH, CVPR, ICCV, TVCG, and AAAI, 18 of them in CCF-A journals and conferences. Her academic honors include the 2021 Shi Qingyun Award for Women Scientists (Young Elite Group) of the China Society of Image and Graphics, the Rising-Star Academic Award of the CCF Computer Vision Technical Committee, the first prize in the paper competition of the 16th Conference on Image and Graphics Technologies and Applications (IGTA 2021), the Outstanding Doctoral Dissertation Award of the Beijing Society of Image and Graphics, the Outstanding Doctoral Dissertation Award of Tsinghua University, and a Microsoft Research Fellowship nomination. She serves on the Intelligent Graphics and the Animation and Digital Entertainment technical committees of the China Society of Image and Graphics, and as a reviewer for top international journals and conferences such as TPAMI, IJCV, TIP, CVPR, ICCV, NeurIPS, ICLR, and AAAI.

Yichao Yan is an Assistant Professor at the Institute of Artificial Intelligence, Shanghai Jiao Tong University. He received his B.Eng. and Ph.D. degrees from the Department of Electronic Engineering at Shanghai Jiao Tong University and a master's degree from École Centrale de Lyon, and previously worked as a research scientist at the Inception Institute of Artificial Intelligence (UAE). His research focuses on computer vision, computer graphics, and their applications in virtual reality, the metaverse, and digital media. He has published more than 20 papers in high-level international venues including TPAMI, CVPR, and ACM MM, with over 1,000 citations on Google Scholar. He has served on the program committees of top AI conferences such as AAAI and IJCAI, and is a long-term reviewer for more than ten top international conferences and journals including TPAMI, IJCV, CVPR, and ICCV. He was selected for the Shanghai Overseas High-Level Talent Program and received the 2020 Outstanding Doctoral Dissertation Award of the China Society of Image and Graphics.


The "User Guide" section of the GAMES homepage contains information on "How to watch the GAMES Webinar live stream?" and "How to join the GAMES WeChat group?".
