We introduce ROGR, a novel approach that reconstructs a relightable 3D model of an object captured from multiple views, driven by a generative relighting model that simulates the effects of placing the object under novel environment illuminations. Our method samples the appearance of the object under multiple lighting environments, creating a dataset that is used to train a lighting-conditioned Neural Radiance Field (NeRF) that outputs the object's appearance under any input environmental lighting. The lighting-conditioned NeRF uses a novel dual-branch architecture to encode the general lighting effects and specularities separately.
The optimized lighting-conditioned NeRF enables efficient feed-forward relighting under arbitrary environment maps without requiring per-illumination optimization or light transport simulation. We evaluate our approach on the established TensoIR and Stanford-ORB datasets, where it improves upon the state-of-the-art on most metrics, and showcase our approach on real-world object captures.
Our multi-view relighting diffusion model takes as input N posed images captured under a consistent but unknown illumination, each represented by its camera raymap and source pixel values, together with one environment map per image that has been rotated to the camera pose. The diffusion model generates images of the same object from the same poses, but lit by the input environment map. To generate our multi-illumination dataset, we repeat this relighting process M times with M different environment maps.
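The data-generation loop described above can be sketched as follows. This is a minimal illustration rather than the actual pipeline: `build_multi_illumination_dataset`, `rotate_env_fn`, and `relight_fn` are hypothetical names, and the two stand-in functions at the bottom are simple placeholders for the environment-map rotation and the relighting diffusion model.

```python
import numpy as np

def build_multi_illumination_dataset(images, poses, env_maps, rotate_env_fn, relight_fn):
    """Relight N posed images under each of M environment maps.

    rotate_env_fn and relight_fn are hypothetical callables standing in for
    the environment-map rotation and the multi-view relighting model.
    """
    samples = []
    for env_id, env_map in enumerate(env_maps):
        # One environment map per image, rotated into that image's camera frame.
        per_view_envs = [rotate_env_fn(env_map, pose) for pose in poses]
        # A single multi-view pass relights all N views under this environment.
        relit_views = relight_fn(images, poses, per_view_envs)
        for view_id, image in enumerate(relit_views):
            samples.append({"view": view_id, "env": env_id, "image": image})
    return samples

# Placeholder stand-ins (assumptions, not the real components).
def identity_rotate(env_map, pose):
    """Pretend rotation: returns the environment map unchanged."""
    return env_map

def mean_tint_relight(images, poses, per_view_envs):
    """Pretend relighting: scales each image by its environment's mean value."""
    return [img * env.mean() for img, env in zip(images, per_view_envs)]
```

With N input views and M environment maps, the loop yields N × M relit images, each tagged with its view and environment index so that a lighting-conditioned model can later be trained on (view, environment, image) triples.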
We use a combination of two lighting conditioning signals to train the NeRF on our generated multi-illumination dataset. The general lighting encoding f_general is used for encoding the full environment map in a single embedding and is obtained using a transformer encoder applied to the entire sphere of incident radiance. The specular encoding f_specular is composed of the environment map value, as well as prefiltered versions of the environment map, queried at the reflection direction ω_r, which is the direction of the camera ray reflected about the surface normal vector. Combining these two conditioning signals provides the NeRF with all the information necessary for relighting diffuse materials as well as shiny ones that exhibit strong reflections.
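As a rough illustration of the specular branch, the sketch below computes the reflection direction ω_r = d − 2(d·n)n for a camera-ray direction d and surface normal n, then looks up the environment map and its prefiltered versions at ω_r. Several things here are assumptions made for the example and are not taken from the method itself: the function names, the z-up equirectangular layout, nearest-neighbor lookup, and representing the prefiltered maps as a plain list of arrays.

```python
import numpy as np

def reflect(view_dir, normal):
    """Reflect a camera-ray direction about a surface normal: w_r = d - 2 (d . n) n."""
    d = view_dir / np.linalg.norm(view_dir)
    n = normal / np.linalg.norm(normal)
    return d - 2.0 * np.dot(d, n) * n

def lookup_equirect(env_map, direction):
    """Nearest-neighbor lookup in an (H, W, 3) equirectangular map.

    Assumed convention: z is up, row 0 is the +z pole, column 0 is the +x axis.
    """
    h, w, _ = env_map.shape
    x, y, z = direction / np.linalg.norm(direction)
    polar = np.arccos(np.clip(z, -1.0, 1.0))      # 0 at +z, pi at -z
    azimuth = np.arctan2(y, x) % (2.0 * np.pi)    # wrapped to [0, 2*pi)
    row = min(int(polar / np.pi * h), h - 1)
    col = min(int(azimuth / (2.0 * np.pi) * w), w - 1)
    return env_map[row, col]

def specular_encoding(env_levels, view_dir, normal):
    """Concatenate environment-map values queried at the reflection direction.

    env_levels is a hypothetical list: the original map followed by
    progressively prefiltered (blurred) versions of it.
    """
    w_r = reflect(view_dir, normal)
    return np.concatenate([lookup_equirect(level, w_r) for level in env_levels])
```

Querying several prefiltered levels at the same direction gives the network access to both sharp and blurred reflections, which is one plausible way to cover a range of surface roughness values with a fixed-size feature.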
[Figure panels: Neural Gaffer 1-view diffusion | Our 16-view diffusion | Ground truth]
[Figure panels: General and Specular Conditioning | Only Specular Conditioning | Only General Conditioning]
Check out the following works which also introduce a relighting diffusion model.
DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation (single image relighting with radiance cues)
Neural Gaffer: Relighting Any Object via Diffusion (single image relighting, needs to be re-optimized for novel lighting)
A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis (single image relighting, needs to be re-optimized for novel lighting)
IllumiNeRF: 3D Relighting Without Inverse Rendering (single image relighting with radiance cues, needs to be re-optimized for novel lighting)
RelitLRM: Generative Relightable Radiance for Large Reconstruction Models (directly generates a relightable NeRF from sparse views and the target illumination, but does not guarantee consistent geometry across environment maps)
DiffusionRenderer: Neural Inverse and Forward Rendering with Video Diffusion Models (video diffusion model for relighting, but lacks 3D consistency and has slow inference speed)
We would like to thank Xiaoming Zhao, Rundi Wu, Songyou Peng, Ruiqi Gao, Ben Poole, Aleksander Holynski, Jason Zhang, Jonathan T. Barron, Stan Szymanowicz, Hadi Alzayer, Alex Trevithick, and Jiahui Lei for their valuable contributions. We also extend our gratitude to Shlomi Fruchter, Kevin Murphy, Mohammad Babaeizadeh, Han Zhang, and Amir Hertz for training the base text-to-image latent diffusion model.
@inproceedings{tang2025rogr,
        author    = {Jiapeng Tang and Matthew Levine and Dor Verbin and Stephan J. Garbin and Matthias Niessner and Ricardo Martin-Brualla and Pratul P. Srinivasan and Philipp Henzler},
        title     = {{ROGR: Relightable 3D Objects using Generative Relighting}},
        booktitle = {Advances in Neural Information Processing Systems},
        year      = {2025},
    }