GAMES Webinar 2023 – Session 274 (How AIGC Shines into the 3D World) | Jun Gao (University of Toronto/NVIDIA), Tengfei Wang (Hong Kong University of Science and Technology), Ruoshi Liu (Columbia University), Zhen Liu (University of Montreal/Mila/Max Planck Institute)
【GAMES Webinar 2023, Session 274】(Vision track: How AIGC Shines into the 3D World, Talk + Panel format)
Talk title: Machine Learning for 3D Content Creation
Jun Gao is a PhD student at the University of Toronto, advised by Prof. Sanja Fidler, and a Research Scientist at the NVIDIA Toronto AI Lab. His research focuses on the intersection of 3D computer vision and computer graphics, particularly on developing machine learning tools that facilitate large-scale 3D content creation and drive real-world applications. His work has been presented at conferences such as NeurIPS, CVPR, ICCV, ECCV, ICLR, and SIGGRAPH, and many of his contributions have shipped in products, including NVIDIA Picasso, GANVerse3D, Neural DriveSim, and the Toronto Annotation Suite. He will serve as an Area Chair at NeurIPS 2023.
Talk title: RODIN: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion
Deep generative models have revolutionized 2D visual design; however, developing high-quality 3D generative models remains a challenge. In this talk, we will present RODIN, a 3D diffusion model that generates subjects represented as neural radiance fields. RODIN efficiently generates 360-degree, freely viewable 3D avatars and supports multi-modal inputs, such as images and text, to produce personalized results. This approach can improve the efficiency of the traditional digital-avatar modeling process and has the potential to extend to general 3D object generation.
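To make the idea of "diffusion over a radiance field" concrete, here is a minimal PyTorch sketch of one denoising training step over a tri-plane feature representation, which is the kind of NeRF parametrization RODIN builds on. The grid sizes, channel counts, noise schedule, and the `TriPlaneDenoiser` architecture are all illustrative assumptions, not the paper's actual code (RODIN's 3D-aware roll-out convolution is considerably more elaborate):

```python
# A hedged sketch of diffusion over tri-plane NeRF features (not RODIN's real code).
import torch
import torch.nn as nn

class TriPlaneDenoiser(nn.Module):
    """Predicts the noise added to a stack of three axis-aligned feature planes."""
    def __init__(self, channels=32):
        super().__init__()
        # The 3 planes (XY, XZ, YZ) are stacked along the channel axis so a
        # plain 2D CNN can denoise them jointly.
        self.net = nn.Sequential(
            nn.Conv2d(3 * channels, 128, 3, padding=1), nn.SiLU(),
            nn.Conv2d(128, 128, 3, padding=1), nn.SiLU(),
            nn.Conv2d(128, 3 * channels, 3, padding=1),
        )

    def forward(self, x, t):
        # A real model would also embed the timestep t; omitted for brevity.
        return self.net(x)

# One DDPM-style training step on tri-plane features (batch of 4, 32 channels, 64x64).
planes = torch.randn(4, 3 * 32, 64, 64)                      # "clean" tri-plane features
t = torch.randint(0, 1000, (4,))                              # random diffusion timesteps
alpha_bar = torch.cos(t.float() / 1000 * torch.pi / 2) ** 2   # toy cosine noise schedule
noise = torch.randn_like(planes)
noisy = (alpha_bar.sqrt().view(-1, 1, 1, 1) * planes
         + (1 - alpha_bar).sqrt().view(-1, 1, 1, 1) * noise)

model = TriPlaneDenoiser()
loss = nn.functional.mse_loss(model(noisy, t), noise)         # predict the added noise
loss.backward()
```

At sampling time, the denoised planes would be decoded by a small MLP into density and color for volume rendering; that decoder is omitted here to keep the sketch short.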
Tengfei Wang is a PhD student at the Hong Kong University of Science and Technology, supervised by Prof. Qifeng Chen. His research focuses on generative modeling and 3D rendering, particularly 3D generative models. Several of his works have been published at computer vision venues such as CVPR and ICCV as Highlight or Oral presentations.
Talk title: Zero-1-to-3: Zero-shot One Image to 3D Object
We introduce Zero-1-to-3, a framework for changing the camera viewpoint of an object given just a single RGB image. To perform novel view synthesis in this under-constrained setting, we capitalize on the geometric priors that large-scale diffusion models learn about natural images. Our conditional diffusion model uses a synthetic dataset to learn control over the relative camera viewpoint, which allows new images of the same object to be generated under a specified camera transformation. Even though it is trained on a synthetic dataset, our model retains strong zero-shot generalization to out-of-distribution datasets as well as in-the-wild images, including impressionist paintings. Our viewpoint-conditioned diffusion approach can further be used for 3D reconstruction from a single image. Qualitative and quantitative experiments show that our method significantly outperforms state-of-the-art single-view 3D reconstruction and novel view synthesis models by leveraging Internet-scale pre-training.
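The core mechanism is feeding the denoiser an embedding of the input image together with the relative camera transform. The sketch below illustrates that conditioning pattern in PyTorch; the embedding sizes, the MLP `Denoiser`, and the flattened latent are illustrative assumptions (the actual system fine-tunes a large latent diffusion model rather than training a small network from scratch):

```python
# A hedged sketch of viewpoint-conditioned denoising (not the actual Zero-1-to-3 code).
import torch
import torch.nn as nn

def relative_pose(theta1, phi1, r1, theta2, phi2, r2):
    """Relative spherical camera offset (polar, azimuth, radius).
    The azimuth difference is encoded with sin/cos to avoid the 2*pi wrap-around."""
    d_phi = phi2 - phi1
    return torch.stack([theta2 - theta1, torch.sin(d_phi), torch.cos(d_phi), r2 - r1], -1)

class Denoiser(nn.Module):
    def __init__(self, img_embed_dim=768, pose_dim=4, latent_dim=4096):
        super().__init__()
        # Conditioning vector = image embedding (e.g. from CLIP) + relative pose.
        self.cond_proj = nn.Linear(img_embed_dim + pose_dim, latent_dim)
        self.net = nn.Sequential(nn.Linear(2 * latent_dim, latent_dim), nn.SiLU(),
                                 nn.Linear(latent_dim, latent_dim))

    def forward(self, noisy_latent, img_embed, pose):
        cond = self.cond_proj(torch.cat([img_embed, pose], -1))
        return self.net(torch.cat([noisy_latent, cond], -1))

# Predict noise for a target view 30 degrees of azimuth away from the input view.
img_embed = torch.randn(1, 768)                     # stand-in for a CLIP image embedding
pose = relative_pose(torch.tensor([0.]), torch.tensor([0.]), torch.tensor([1.5]),
                     torch.tensor([0.]), torch.tensor([0.5236]), torch.tensor([1.5]))
noisy_latent = torch.randn(1, 4096)                 # flattened noisy image latent
eps_hat = Denoiser()(noisy_latent, img_embed, pose)
```

Because the pose enters only as a small conditioning vector, the same trained model can be queried for arbitrary viewpoints, which is what makes the single-image 3D reconstruction use case possible.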
Ruoshi Liu is a second-year PhD student at Columbia University, advised by Carl Vondrick. He has broad interests in computer vision and deep learning, including video representation learning, 3D reconstruction, differentiable rendering, and, recently, large-scale generative models. He has worked in various industry and academic labs, including Snap Research, Sony R&D, CERN, and MRSEC. He loves movies, hiking, and cats.
Talk title: MeshDiffusion: Score-based Generative 3D Mesh Modeling
Our visual world is made of numerous and diverse 3D shapes, and creating a large-scale 3D virtual world requires an efficient way to synthesize them. Among possible representations for generation, 3D meshes are favored because they are well optimized for efficient, controllable rendering and editing in modern graphics pipelines. It is tempting to generate high-quality meshes with diffusion models, which have recently proven powerful for 2D image and video generation, but such models cannot directly handle topology-varying meshes. In this talk, I will share our efforts in building the first diffusion model that directly generates 3D meshes. Specifically, our method, dubbed MeshDiffusion, performs unconditional and conditional generation of topology-varying 3D meshes with sharp geometric details by leveraging a structured parametrization of meshes. Our work sheds light on how to apply diffusion models to general 3D representations.
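The "structured parametrization" is the key trick: instead of diffusing over an irregular vertex/face list, the shape is stored on a regular grid whose vertices carry a signed distance value plus a small deformation, and a mesh is extracted afterwards with (deformable) marching tetrahedra. The grid size, channel layout, and `GridDenoiser` in this PyTorch sketch are illustrative assumptions, not MeshDiffusion's actual implementation:

```python
# A hedged sketch of diffusion over an (SDF + deformation) grid (not the paper's code).
import torch
import torch.nn as nn

class GridDenoiser(nn.Module):
    """3D CNN that predicts the noise added to the (SDF + deformation) grid."""
    def __init__(self):
        super().__init__()
        # 4 channels per grid vertex: 1 signed distance + 3 deformation offsets.
        self.net = nn.Sequential(
            nn.Conv3d(4, 64, 3, padding=1), nn.SiLU(),
            nn.Conv3d(64, 64, 3, padding=1), nn.SiLU(),
            nn.Conv3d(64, 4, 3, padding=1),
        )

    def forward(self, x, t):
        return self.net(x)  # a real model would also condition on the timestep t

# Encode shapes as 32^3 grids, then take one DDPM-style training step on them.
grid = torch.randn(2, 4, 32, 32, 32)          # batch of 2 "clean" shape grids
t = torch.randint(0, 1000, (2,))
alpha_bar = 1.0 - t.float() / 1000            # toy linear noise schedule
noise = torch.randn_like(grid)
noisy = (alpha_bar.sqrt().view(-1, 1, 1, 1, 1) * grid
         + (1 - alpha_bar).sqrt().view(-1, 1, 1, 1, 1) * noise)
loss = nn.functional.mse_loss(GridDenoiser()(noisy, t), noise)
# At sampling time, the denoised grid is converted to an explicit triangle mesh,
# so varying topology falls out of the representation rather than the generator.
```

Because every sample lives on the same fixed-size grid, a standard image-style diffusion model applies unchanged, which is how the approach sidesteps the irregular connectivity of raw meshes.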
Zhen Liu is a PhD candidate at Mila and the University of Montreal, advised by Yoshua Bengio and Liam Paull. His research interests include novel representations and probabilistic modeling methods for 3D reconstruction and generation, as well as other general domains. He has published papers at top venues including NeurIPS, ICLR, ICML, and CVPR. He is currently a visiting student at the Max Planck Institute for Intelligent Systems, working with Bernhard Schölkopf and Michael J. Black.
The "Tutorials" section of the GAMES homepage includes information on "How to watch the GAMES Webinar live stream?" and "How to join the GAMES WeChat group?".