
Generative Color Constancy via Diffusing a Color Checker

arXiv 2025

1National Yang Ming Chiao Tung University, 2MediaTek

Abstract

Color constancy methods often struggle to generalize across different camera sensors due to varying spectral sensitivities. We present GCC, which leverages diffusion models to inpaint color checkers into images for illumination estimation. Our key innovations include (1) a single-step deterministic inference approach that inpaints color checkers reflecting scene illumination, (2) a Laplacian decomposition technique that preserves checker structure while allowing illumination-dependent color adaptation, and (3) a mask-based data augmentation strategy for handling imprecise color checker annotations. GCC demonstrates superior robustness in cross-camera scenarios, achieving state-of-the-art worst-25% error rates of 5.22° and 4.32° in bi-directional evaluations. These results highlight our method's stability and generalization capability across different camera characteristics without requiring sensor-specific training, making it a versatile solution for real-world applications.

Approach

Training


Starting from pretrained stable-diffusion-2-inpainting, we enable color checker generation through end-to-end fine-tuning. Given a ground truth color checker image and its mask, we apply color jittering in the masked region. The input image latent passes through Laplacian composition before being concatenated with the masked image latent and the resized mask for the SD Inpainting U-Net. The model is trained with an L2 loss between the inpainted output and ground truth image at a fixed timestep T.
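The color jittering applied to the masked checker region can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the per-channel-gain form of the jitter, and the `strength` parameter are assumptions for the example.

```python
import numpy as np

def jitter_checker_region(image, mask, rng, strength=0.2):
    """Randomly jitter colors inside the masked (color checker) region,
    leaving the rest of the image untouched.

    image: float array in [0, 1], shape (H, W, 3)
    mask:  binary array, shape (H, W), 1 inside the checker region
    """
    # Random per-channel gain in [1 - strength, 1 + strength].
    gains = 1.0 + rng.uniform(-strength, strength, size=3)
    jittered = np.clip(image * gains, 0.0, 1.0)
    m = mask[..., None].astype(image.dtype)
    # Composite: jittered colors inside the mask, original outside.
    return jittered * m + image * (1.0 - m)

rng = np.random.default_rng(0)
img = np.full((4, 4, 3), 0.5)
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1
out = jitter_checker_region(img, mask, rng)
```

Jittering only the masked region forces the model to learn the checker's colors from scene context rather than copying them from the input.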

Inference


A neutral color checker is pasted onto the input image, which is then encoded into the latent space. The input latent is processed through Laplacian composition before being concatenated with the masked image latent and the resized mask. The modified U-Net generates an inpainted result at fixed timestep T. After inverse gamma correction, we sample the color checker patches to obtain the final RGB illumination value. We highlight the steps and components that differ from the training pipeline.
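The final readout step can be sketched as below: undo gamma, average achromatic patch samples, and normalize to a unit-length illuminant direction. The function name, the box-based patch coordinates, and the fixed `gamma=2.2` are assumptions for this example; the paper's exact sampling may differ.

```python
import numpy as np

def estimate_illuminant(inpainted, patch_coords, gamma=2.2):
    """Read the scene illuminant from the inpainted color checker.

    inpainted:    sRGB-like float image in [0, 1], shape (H, W, 3)
    patch_coords: list of (y0, y1, x0, x1) boxes over achromatic patches
    gamma:        display gamma assumed for inverse correction
    """
    # Inverse gamma correction: map back to linear intensities.
    linear = np.power(np.clip(inpainted, 0.0, 1.0), gamma)
    # Average within each achromatic patch, then across patches.
    samples = [linear[y0:y1, x0:x1].reshape(-1, 3).mean(axis=0)
               for (y0, y1, x0, x1) in patch_coords]
    rgb = np.mean(samples, axis=0)
    # Normalize: only the chromaticity of the illuminant matters.
    return rgb / np.linalg.norm(rgb)

# Toy example: one uniform "patch" rendered under a reddish illuminant.
img = np.zeros((8, 8, 3))
img[2:6, 2:6] = [0.8, 0.6, 0.6]
est = estimate_illuminant(img, [(2, 6, 2, 6)])
```

Averaging over the achromatic patches makes the estimate robust to per-pixel generation noise in the inpainted checker.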

Color checker misalignment issue


Analysis of color checker alignment strategies. (a) Direct inpainting on masked regions leads to poor color checker structure: without any guidance on the desired checker layout, the model generates contours that do not meet our expectations. (b) Using a homography transform to overlay a template suffers from pixel-level misalignment due to imprecise bounding box annotations. (c) Our mask color jittering approach overcomes corner point annotation limitations by allowing the model to generate geometrically consistent color checker structures while accurately reflecting scene illumination.

Laplacian Decomposition


We introduce Laplacian decomposition to address a key challenge in color checker generation: maintaining structural consistency while adapting to scene illumination. This technique extracts high-frequency components from the input image, preserving the color checker's patch layout while suppressing the original color patterns from our pre-pasted neutral color checker. This approach prevents the model from simply reconstructing the original colors, instead encouraging it to harmonize the generated color checker with the scene's illumination.
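The high-frequency extraction can be illustrated as image minus a blurred copy: patch edges survive while flat colored regions are suppressed. This is a simplified stand-in using a box blur; the paper's actual Laplacian decomposition (filter type and levels) may differ, and the function name and `ksize` are assumptions.

```python
import numpy as np

def high_frequency(image, ksize=5):
    """Extract the high-frequency component as image - blur(image).

    image: float array, shape (H, W) or (H, W, C)
    ksize: side length of the box-blur window (odd)
    """
    pad = ksize // 2
    # Reflect-pad spatially, then average over a ksize x ksize window.
    padded = np.pad(image,
                    [(pad, pad), (pad, pad)] + [(0, 0)] * (image.ndim - 2),
                    mode="reflect")
    blurred = np.zeros_like(image, dtype=float)
    for dy in range(ksize):
        for dx in range(ksize):
            blurred += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    blurred /= ksize * ksize
    # Edges (e.g. patch borders) remain; flat regions go to ~0.
    return image - blurred

# A vertical step edge: response is nonzero only near the edge.
img = np.zeros((10, 10))
img[:, 5:] = 1.0
hf = high_frequency(img)
```

Applied to the pre-pasted neutral checker, this keeps the patch grid layout while discarding the neutral patch colors, so the model must fill in colors consistent with the scene illumination.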

Qualitative results

Our method achieves consistent color correction performance across various in-the-wild scenes, demonstrating its practical robustness.

Visualization Results

This video demonstrates the robustness of our method across various color checker positions under a single-illuminant scenario.


Citation