GAMES Webinar 2020 – Session 168 (Vision and Imaging Track) | Yiyi Liao (MPI for Intelligent Systems and University of Tübingen), Xuan Luo (罗璇) (University of Washington)

by yuanqing · December 21, 2020

[GAMES Webinar 2020, Session 168] (Vision and Imaging Track)

Speaker 1: Yiyi Liao (MPI for Intelligent Systems and University of Tübingen)

Time: Thursday, December 24, 2020, 10:00-10:45 AM (Beijing time)

Title: 3D Controllable Image Synthesis

Abstract:

In recent years, Generative Adversarial Networks have achieved impressive results in photorealistic image synthesis. This progress nurtures hopes that one day the classical rendering pipeline can be replaced by efficient models that are learned directly from images. However, current image synthesis models operate in the 2D domain, where disentangling 3D properties such as camera viewpoint or object pose is challenging. Furthermore, they lack an interpretable and controllable representation. In this talk, I’ll present our recent progress towards 3D-aware image synthesis. Our methods are built on the key hypothesis that the image generation process should be modeled in 3D space, as the physical world surrounding us is intrinsically three-dimensional. We explore different 3D representations in the image generation process, including meshes, point clouds and continuous radiance fields. We show that our methods allow for 3D-aware image synthesis while learning from unposed 2D images only. In addition, I’ll introduce KITTI-360, a recently released large-scale dataset with comprehensive 2D & 3D annotations, which we hope will foster research in 3D-aware image synthesis as well as other important research areas.

Speaker bio:

Yiyi Liao is a postdoctoral researcher in the Autonomous Vision Group at the University of Tübingen and Max Planck Institute for Intelligent Systems, working with Prof. Andreas Geiger. She received her Ph.D. from Zhejiang University in 2018 and her B.S. degree from Xi’an Jiaotong University in 2013. Her research interests include 3D scene understanding, reconstruction and 3D-aware generative models.

Speaker homepage: https://yiyiliao.github.io

Speaker 2: Xuan Luo (罗璇) (University of Washington)

Time: Thursday, December 24, 2020, 10:45-11:30 AM (Beijing time)

Title: Consistent Video Depth Estimation

Abstract:

We present an algorithm for reconstructing dense, geometrically consistent depth for all pixels in a monocular video. We leverage a conventional structure-from-motion reconstruction to establish geometric constraints on pixels in the video. Unlike the ad-hoc priors in classical reconstruction, we use a learning-based prior, i.e., a convolutional neural network trained for single-image depth estimation. At test time, we fine-tune this network to satisfy the geometric constraints of a particular input video, while retaining its ability to synthesize plausible depth details in parts of the video that are less constrained. We show through quantitative validation that our method achieves higher accuracy and a higher degree of geometric consistency than previous monocular reconstruction methods. Visually, our results appear more stable. Our algorithm is able to handle challenging hand-held captured input videos with a moderate degree of dynamic motion. The improved quality of the reconstruction enables several applications, such as scene reconstruction and advanced video-based visual effects.

Speaker bio:

I am a PhD student in the UW Reality Lab at University of Washington CSE, working with Steven Seitz, Jason Lawrence and Ricardo Martin Brualla. I am interested in combining virtual/augmented reality with computer vision and graphics to create interesting surreal experiences. My current research focuses on 3D vision and image synthesis. During my PhD, I’ve been fortunate enough to spend great summers at Google, Disney Research Zurich and Facebook. Prior to UW, I received my B.S. from Shanghai Jiao Tong University, working with Hongtao Lu, and visited the National University of Singapore, working with Shuicheng Yan.

Speaker homepage: http://roxanneluo.github.io/

Host bio:

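Zhaopeng Cui (崔兆鹏) is a “Hundred Talents Program” researcher and doctoral supervisor at the State Key Laboratory of CAD&CG, College of Computer Science, Zhejiang University. Cui received a Ph.D. from Simon Fraser University, Canada, in 2017, and from 2017 to 2020 was a senior researcher in the Computer Vision and Geometry Group (CVG) at ETH Zurich, Switzerland. Cui's research area is 3D computer vision, focusing on 3D perception and understanding from visual information, including 3D reconstruction, structure from motion, multi-view stereo, 3D scene understanding, simultaneous localization and mapping, and video and image editing. In recent years, Cui has published more than 20 papers in top journals and conferences in computer vision, robotics, computer graphics and machine learning (CVPR, ICCV, SIGGRAPH, NeurIPS, ICRA), and was an ICRA 2020 Best Paper Finalist in Robot Vision.

Host homepage: http://www.cad.zju.edu.cn/home/zhpcui/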

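The “Tutorials” section of the GAMES homepage explains how to watch GAMES Webinar live streams and how to join the GAMES WeChat group.
The “Resources” section of the GAMES homepage hosts videos, slides and other materials from past live talks.
Live stream link: http://webinar.games-cn.org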