Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences
This repo contains the official implementation of TC-Light, a one-shot model for manipulating the illumination of highly dynamic videos, such as those with motion-rich actions and frequent switches between foreground and background objects. It is distinguished by:
- 🔥 Outstanding Temporal Consistency in Highly Dynamic Scenarios.
- 🔥 Superior Computational Efficiency that Enables Long Video Processing (can process 300 frames at 1280x720 resolution on a 40GB A100).
These features make it particularly suitable for sim2real and real2real augmentation for embodied agents, and for preparing video pairs to train stronger video relighting models. Star ⭐ us if you like it!
TC-Light overview. Given the source video and text prompt p, the model tokenizes the input latents in the xy plane and the yt plane separately. The predicted noises are combined for denoising. The output then undergoes two-stage optimization: the first stage aligns exposure by optimizing an appearance embedding; the second stage aligns detailed texture and illumination by optimizing the Unique Video Tensor, a compressed version of the video. Please refer to the paper for more details.
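As a rough illustration of the dual-plane idea (not the actual implementation), the video latent can be reshaped into spatial (xy) tokens and temporal-slice (yt) tokens, with the two noise predictions averaged. The tensor shapes and the `predict_noise` stub below are assumptions for illustration only:

```python
import numpy as np

def predict_noise(tokens: np.ndarray) -> np.ndarray:
    """Stand-in for a diffusion model's noise prediction (hypothetical stub)."""
    return np.zeros_like(tokens)

# Video latent: (T frames, H, W, C channels)
latents = np.random.randn(8, 16, 16, 4)
T, H, W, C = latents.shape

# xy-plane tokens: each frame becomes a set of spatial tokens, shape (T, H*W, C)
xy_tokens = latents.reshape(T, H * W, C)

# yt-plane tokens: slicing along width gives slices that mix time and height,
# shape (W, T*H, C)
yt_tokens = latents.transpose(2, 0, 1, 3).reshape(W, T * H, C)

# Predict noise on each plane, map both back to (T, H, W, C), and combine
noise_xy = predict_noise(xy_tokens).reshape(T, H, W, C)
noise_yt = predict_noise(yt_tokens).reshape(W, T, H, C).transpose(1, 2, 0, 3)
combined_noise = 0.5 * (noise_xy + noise_yt)
```

The yt-plane pass lets each token attend across time, which is what drives the temporal consistency, while the xy-plane pass preserves per-frame spatial detail.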
Install the required environment as follows:
Then download required model weights to ./models from the following links:
- Hugging Face: https://huggingface.co/TeslaYang123/TC-Light
- Baidu Netdisk: https://pan.baidu.com/s/1L-mk6Ilzd2o7KLAc7-gIHQ?pwd=rj99
As a quick start, you can use:
By default, it will relight the first 30 frames at a resolution of 960x720. The default negative prompt is adopted from Cosmos-Transfer1, which makes the edited illumination as realistic as possible. On the first run for a specific video, it will estimate optical flow and save it under the path to your video.
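The flow-caching behavior can be sketched as follows; the sidecar `.npy` naming and the `estimate_flow` stub are illustrative assumptions, not the repo's actual layout:

```python
import os
import numpy as np

def estimate_flow(frames: np.ndarray) -> np.ndarray:
    """Stand-in for an optical-flow model (hypothetical stub)."""
    T, H, W, _ = frames.shape
    return np.zeros((T - 1, H, W, 2), dtype=np.float32)

def load_or_compute_flow(video_path: str, frames: np.ndarray) -> np.ndarray:
    """Reuse cached flow stored next to the video if present; otherwise compute and save it."""
    cache_path = os.path.splitext(video_path)[0] + "_flow.npy"
    if os.path.exists(cache_path):
        return np.load(cache_path)
    flow = estimate_flow(frames)
    np.save(cache_path, flow)  # first run: generate and cache
    return flow
```

On subsequent runs the cached file is loaded directly, skipping the most expensive preprocessing step.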
For fine-grained control, you can customize a .yaml config file and run:
You can start from configs/tclight_custom.yaml, which documents the most frequently used parameters with detailed explanations.
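A hypothetical config along these lines (the key names are illustrative, not the repo's actual schema; consult configs/tclight_custom.yaml for the real options):

```yaml
# Illustrative only -- see configs/tclight_custom.yaml for the real schema
video_path: path/to/input.mp4
prompt: "sunset over the sea, warm golden light"
negative_prompt: ""        # empty -> defaults to the Cosmos-Transfer1 negative prompt
num_frames: 30             # number of frames to relight
width: 960
height: 720
```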
Examples

For evaluation, you can simply use:
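The evaluation command itself is repo-specific; as background, a typical temporal-consistency metric behind such evaluations is flow-warped error between consecutive frames. A minimal sketch with illustrative names (nearest-neighbor warping for simplicity):

```python
import numpy as np

def warp_backward(frame: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Warp `frame` (H, W, C) toward the previous frame using backward flow (H, W, 2)."""
    H, W, _ = frame.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Nearest-neighbor sampling keeps the sketch short
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)
    return frame[src_y, src_x]

def warping_error(frames: np.ndarray, flows: np.ndarray) -> float:
    """Mean squared error between each frame and its flow-warped successor."""
    errs = [np.mean((warp_backward(frames[t + 1], flows[t]) - frames[t]) ** 2)
            for t in range(len(flows))]
    return float(np.mean(errs))
```

Lower warping error means the relit video changes smoothly along the motion field, i.e. better temporal consistency.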
- Works better on videos with resolution above 512x512, the minimum resolution used to train IC-Light. Higher resolution helps the consistency of image intrinsic properties.
- Works relatively better on realistic scenes than on synthetic scenes, in terms of both temporal consistency and physical plausibility.
- Struggles to drastically change the illumination of night scenes or hard shadows, as in IC-Light.
- Release the arXiv and the project page.
- Release the code base.
- Release the dataset.
If you find this repository useful for your research, please use the following BibTeX entry for citation.
This repo benefits from IC-Light, VidToMe, Slicedit, RAVE, and Cosmos. Thanks for their great work! The repo is still under development; we are open to pull requests and discussions!