GAMES Webinar 2019 – Session 121 (Frontiers in 3D Vision) | Weikai Chen (Tencent America), Charles R. Qi (Waymo LLC)
Talk title: Differentiable Rendering for Mesh and Implicit Field
Rendering bridges the gap between 2D vision and 3D scenes by simulating the physical process of image formation. By inverting such a renderer, one can formulate a learning approach that infers 3D information solely from 2D images. However, the standard rendering approaches for mesh and implicit field are not differentiable due to the discrete operations involved. In this talk, I will introduce our latest differentiable rendering techniques for mesh and implicit field, which were accepted to ICCV’19 and NeurIPS’19 respectively. I will show how the rendering process for both representations can be implemented using differentiable functions. This talk will also demonstrate that the proposed approaches can accomplish a couple of challenging tasks, such as unsupervised single-view 3D reconstruction and pose estimation of occluded rigid and non-rigid objects, which were not possible using conventional technologies.
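To give a flavor of the idea, the non-differentiable step in standard mesh rasterization is the hard inside/outside test per pixel. One common way to make it differentiable, in the spirit of the talk (a toy sketch, not the paper's actual implementation), is to replace the binary test with a sigmoid of the signed distance to the triangle, so pixel coverage varies smoothly with vertex positions:

```python
import numpy as np

def point_segment_dist(p, a, b):
    # Euclidean distance from point p to the segment a-b
    ab, ap = b - a, p - a
    t = np.clip(np.dot(ap, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(ap - t * ab)

def signed_distance(p, tri):
    # Distance of p to the 2D triangle boundary: >0 inside, <0 outside
    a, b, c = tri
    cross = lambda u, v: u[0] * v[1] - u[1] * v[0]
    s = [cross(b - a, p - a), cross(c - b, p - b), cross(a - c, p - c)]
    inside = all(x >= 0 for x in s) or all(x <= 0 for x in s)
    d = min(point_segment_dist(p, a, b),
            point_segment_dist(p, b, c),
            point_segment_dist(p, c, a))
    return d if inside else -d

def soft_coverage(p, tri, sigma=0.05):
    # Sigmoid of the signed distance: a smooth, differentiable stand-in
    # for the hard rasterization test; sigma controls edge sharpness
    return 1.0 / (1.0 + np.exp(-signed_distance(p, tri) / sigma))
```

A pixel deep inside the triangle gets coverage near 1, one far outside near 0, and pixels near an edge receive intermediate values whose gradients flow back to the vertex positions.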
Weikai Chen is currently a Senior Research Scientist at Tencent America. Before that, he was a postdoc and then a researcher at the Vision and Graphics Lab (VGL) at USC ICT, working with Prof. Hao Li. He received his Ph.D. from the Department of Computer Science, The University of Hong Kong, under the supervision of Prof. Wenping Wang. His research lies at the interplay of computer graphics, computer vision, and deep learning. In particular, his current research focuses on image-based 3D reasoning, including 3D reconstruction of humans (face/hair/body), general objects and scenes, and differentiable rendering. He has published over 15 papers on computer graphics and computer vision, most of them in top-tier venues such as SIGGRAPH/SIGGRAPH Asia/CVPR/ICCV/ECCV/NeurIPS/UIST.
Talk title: Deep Hough Voting for 3D Object Detection in Point Clouds
Current 3D object detection methods are heavily influenced by 2D detectors. In order to leverage architectures in 2D detectors, they often convert 3D point clouds to regular grids (i.e., to voxel grids or to bird’s-eye-view images), or rely on detection in 2D images to propose 3D boxes. Few works have attempted to directly detect objects in point clouds. In this work, we return to first principles to construct a 3D detection pipeline for point cloud data that is as generic as possible. However, due to the sparse nature of the data — samples from 2D manifolds in 3D space — we face a major challenge when directly predicting bounding box parameters from scene points: a 3D object centroid can be far from any surface point and thus hard to regress accurately in one step. To address this challenge, we propose VoteNet, an end-to-end 3D object detection network based on a synergy of deep point set networks and Hough voting. Our model achieves state-of-the-art 3D detection on two large datasets of real 3D scans, ScanNet and SUN RGB-D, with a simple design, compact model size, and high efficiency. Remarkably, VoteNet outperforms previous methods by using purely geometric information without relying on color images.
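The voting intuition above can be sketched in a few lines. In this toy version (assumptions: the per-point centroid offsets are given rather than predicted by a deep point-set network as in VoteNet, and clustering is a naive radius grouping instead of the learned proposal module):

```python
import numpy as np

def vote_and_cluster(points, offsets, radius=0.3):
    """Toy Hough-voting step for 3D detection.

    Each surface point casts a vote toward its object's centroid by
    adding a (here: given) offset; votes from the same object land
    close together even though the surface points themselves may be
    far from the centroid. Nearby votes are then grouped into
    centroid proposals.
    """
    votes = points + offsets                  # each point votes for a centroid
    proposals = []
    unused = np.ones(len(votes), dtype=bool)
    while unused.any():
        seed = votes[unused][0]               # pick an unclaimed vote as seed
        group = (np.linalg.norm(votes - seed, axis=1) < radius) & unused
        proposals.append(votes[group].mean(axis=0))  # aggregate the cluster
        unused &= ~group
    return np.array(proposals)
```

The key property illustrated here is exactly the one the abstract motivates: surface points that are far from the object centroid still produce votes that concentrate near it, making the centroid easy to recover by aggregation.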
Charles Qi is currently a senior research scientist at Waymo LLC (previously Google’s self-driving car team). Before that he was a postdoctoral researcher at Facebook AI Research (FAIR). He received his Ph.D. from Stanford University in 2018 and his B.Eng. from Tsinghua University in 2013. His research focuses on deep learning, computer vision, and 3D, with well-known publications in CVPR, ICCV, SIGGRAPH Asia, and NIPS (more than 4,000 citations according to Google Scholar). Some of the 3D deep learning and 3D object detection models he developed have been widely adopted in both academia and industry. He is also an advocate for reproducible research — all of his projects are open-sourced and have received over 5,000 stars on GitHub. More information can be found on his homepage.
The “Tutorials” section of the GAMES homepage contains information on “How to watch the GAMES Webinar live stream?” and “How to join the GAMES WeChat group?”.