The 3rd Workshop on

AI for 3D Content Creation

08:00–12:30, Room 315, October 20th (Half Day) @ ICCV 2025

Honolulu, Hawai'i

Want to present your work at this workshop? Check this link

Developing algorithms capable of generating realistic, high-quality 3D content at scale has been a long-standing problem in Computer Vision and Graphics. We anticipate that generative models that can reliably synthesize meaningful 3D content will completely revolutionize the workflow of artists and content creators, and will also enable new levels of creativity through "generative art". Although there has recently been considerable success in generating photorealistic images, the quality and generality of 3D generative models have lagged behind their 2D counterparts. Additionally, efficiently controlling what is generated and scaling these approaches to complex scenes with several static and dynamic objects remain open challenges.

In this workshop, we seek to bring together researchers working on generative models for 3D humans, objects, and scenes to discuss the latest advances and the next steps toward developing generative pipelines capable of producing fully controllable 3D environments with multiple humans interacting with each other or with objects in the scene. In summary, the topics to be covered include:

  • Representations: What is the most appropriate representation for generating high-quality 3D assets? What is the best representation for enabling intuitive control over the generated objects? How can we effectively represent interactions between humans and objects?
  • Modeling: How can we build a foundation model capable of generating diverse, high-quality, and photorealistic humans/objects/scenes? How can we ensure expressiveness that faithfully captures the subtle details and nuances corresponding to the semantics of diverse input conditions? How can we maintain robustness under varied real-world conditions (e.g., lighting, challenging poses)?
  • Interaction: How can we construct powerful models that reliably generate humans/objects performing plausible, real-life motions with complex interactions? How far are we from a world model that would allow us to manipulate both the appearance of scene elements and their spatial composition? How can we incorporate common-sense knowledge about 3D objects and scenes, such as part structures and object arrangements, from large foundation models or physics simulators to enable training with less data?
  • Applications: Are there new fields that could benefit from generated 3D content, such as embodied AI, construction, and agriculture? How can we leverage 2D priors to enable photorealistic 3D content creation? How can we build tools that meet designers' real needs and make their workflows more efficient (e.g., interactive editing, asset rigging)?

News
  • October 16, 2025: Workshop schedule released.
  • June 30, 2025: Workshop website launched, with a tentative list of invited speakers announced.
Schedule

The workshop will take place on October 20th in Room 315 of the Hawai'i Convention Center.

08:00 – 08:10 Opening Remarks
08:10 – 08:50 Keynote: Hao Su
Title: Breaking it Down and Building it Up: Parts, Articulations, and Compositional 3D
08:50 – 09:30 Keynote: Andrea Vedaldi
Title: Compositional 3D content creation
09:30 – 10:10 Keynote: Varun Jampani
Title: Crafting Video Diffusion: Precise Controls and Rich Outputs
10:10 – 11:00 Invited Poster Session & Coffee Break
11:00 – 11:40 Keynote: Philipp Henzler
Title: Generative 3D Content Creation
11:40 – 12:20 Keynote: Angela Dai
Title: Generating 3D in a Large-Data World
12:20 – 12:30 Closing Remarks
Speakers
Hao Su
UCSD

Hao Su is an Associate Professor of Computer Science at the University of California, San Diego (UCSD), and also serves as the Founder and Chief Technology Officer of Hillbot, an intelligent robotics startup. At UCSD, he is the Director of the Embodied Intelligence Laboratory, a founding member of the Halıcıoğlu Data Science Institute, and a member of the Center for Visual Computing and the Contextual Robotics Institute. His research focuses on developing algorithms to simulate, understand, and interact with the physical world. His interests span computer vision, machine learning, computer graphics, and robotics, with extensive publications and teaching experience in these fields.

Andrea Vedaldi
Oxford

Andrea Vedaldi is Professor of Computer Vision and Machine Learning at the University of Oxford, where he has co-led the Visual Geometry Group since 2012. He is also a research scientist and technical lead at Meta. Andrea is a Fellow of the Royal Academy of Engineering and a Royal Society Faraday Discovery Fellow. He researches generative AI in computer vision, applied to the generation of 3D content from text and images and to image understanding. He is the author of more than 240 peer-reviewed publications in computer vision and machine learning. He is the recipient of the IEEE Thomas Huang Memorial Prize, the IEEE Mark Everingham Prize, a Test of Time Award from the ACM, and two best paper awards from the Conference on Computer Vision and Pattern Recognition. He has received ERC Starting and Consolidator Grants and is a co-investigator on two EPSRC Programme Grants.

Varun Jampani
Arcade AI

Varun Jampani is Chief AI Officer at Arcade.AI. Previously, he was VP of Research at Stability AI and held researcher positions at Google and NVIDIA. He works in machine learning and computer vision, and his main research interests include image, video, and 3D generation. He obtained his PhD with highest honors from the Max Planck Institute for Intelligent Systems (MPI) and the University of Tübingen in Germany. He obtained his BTech and MS from the International Institute of Information Technology, Hyderabad (IIIT-H), India, where he was a gold medalist. He actively contributes to the research community and regularly serves as an area chair and reviewer for major computer vision and machine learning conferences. His work has received the Best Paper Honorable Mention award at CVPR'18 and the Best Student Paper Honorable Mention award at CVPR'23.

Philipp Henzler
Google

Philipp Henzler is a Research Scientist at Google working on controllable video models (Veo) and generative 3D AI. He received his PhD from University College London, supervised by Tobias Ritschel and Niloy J. Mitra; his thesis received the 2024 Eurographics PhD Thesis Award. He received his BSc and MSc from Ulm University.

Angela Dai
TUM

Angela Dai is an Associate Professor at the Technical University of Munich, where she leads the 3D AI Lab. Her research focuses on modeling and semantically understanding the real-world 3D scenes around us. She received her PhD in computer science from Stanford in 2018, advised by Pat Hanrahan, and her BSE in computer science from Princeton in 2013. Her research has been recognized with an ECVA Young Researcher Award, an ERC Starting Grant, the Eurographics Young Researcher Award, the German Pattern Recognition Award, a Google Research Scholar Award, and an ACM SIGGRAPH Outstanding Doctoral Dissertation Honorable Mention.

Organizers
  • Hezhen Hu, UT Austin
  • Georgios Pavlakos, UT Austin
  • Despoina Paschalidou, NVIDIA
  • Nikos Kolotouros, Google
  • Davis Rempe, NVIDIA
  • Angel Xuan Chang, Simon Fraser University
  • Kai Wang, Amazon
  • Amlan Kar, University of Toronto and NVIDIA
  • Kaichun Mo, NVIDIA Research
  • Daniel Ritchie, Brown University
  • Leonidas Guibas, Stanford