Shadows are not all the same. Cast shadows describe spatial relationships between objects and surfaces, while attached shadows reveal local shape and orientation directly on object surfaces. Most existing methods still treat shadows as a single category or overlook attached shadows, which limits both physical understanding and downstream use. Our goal is to explicitly separate these two shadow types because they provide complementary information for geometry reasoning and shadow removal.
Shadows encode rich information about scene geometry and illumination, yet existing methods either predict a unified shadow mask or overlook attached shadows entirely. We propose a framework for jointly detecting cast and attached shadows through explicit physical modeling of light direction and surface geometry. Our approach builds a closed feedback loop between shadow detection and light estimation: updated light estimates, combined with surface normals, produce partial attached-shadow maps that guide detection, while improved shadow predictions in turn sharpen light estimation. Experiments demonstrate that our physically grounded, iterative formulation outperforms prior methods, reducing BER on attached shadows by at least 33% while maintaining strong performance on full and cast shadow detection.
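The geometric core of the feedback loop is that a surface point lies in attached shadow when it faces away from the light. A minimal sketch of this test, assuming unit surface normals and a unit light direction pointing toward the source (the paper's exact formulation and thresholds may differ):

```python
import numpy as np

def attached_shadow_map(normals, light_dir):
    """Partial attached-shadow map from surface geometry.

    A point is in attached shadow when n . l <= 0, i.e. the surface
    faces away from the light (a standard Lambertian shading test).

    normals:   (H, W, 3) array of unit surface normals.
    light_dir: (3,) unit vector pointing *toward* the light source.
    """
    n_dot_l = np.einsum("hwc,c->hw", normals, light_dir)
    return (n_dot_l <= 0.0).astype(np.float32)

# Toy example: a flat patch facing +z, lit from the -z side,
# so every pixel is in attached shadow.
normals = np.zeros((2, 2, 3))
normals[..., 2] = 1.0
mask = attached_shadow_map(normals, np.array([0.0, 0.0, -1.0]))
```

In the iterative framework described above, such a geometry-derived map would act as a prior guiding the detector, while the detector's cast/attached predictions refine the light estimate.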
Quantitative comparison with state-of-the-art methods on our dataset. We report BER↓ and F1↑ for full, cast, and attached shadows. Methods fine-tuned on our training set are marked with †. Best results are in bold.
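BER (balanced error rate, lower is better) is the standard shadow-detection metric used in the table. It averages the error rates on shadow and non-shadow pixels so that class imbalance does not dominate. A small illustrative implementation (returned here as a fraction; papers typically report BER scaled by 100):

```python
import numpy as np

def ber(pred, gt):
    """Balanced Error Rate for binary shadow masks.

    pred, gt: boolean arrays of the same shape.
    Returns 0.5 * (miss rate on shadow pixels + false-alarm rate
    on non-shadow pixels).
    """
    tp = np.logical_and(pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    pos_err = fn / max(tp + fn, 1)  # fraction of shadow pixels missed
    neg_err = fp / max(tn + fp, 1)  # fraction of non-shadow pixels flagged
    return 0.5 * (pos_err + neg_err)
```

For the table, BER is evaluated separately against the full, cast-only, and attached-only ground-truth masks.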
Qualitative comparison of our method with retrained BDRAR, FSDNet, FDRNet, and SILT on our dataset. Predicted masks are overlaid on the input image (green: attached, red: cast). Our physics-grounded iterative framework produces more accurate attached-shadow boundaries.
We further test our model on video shadow datasets to examine cross-dataset generalization beyond our image-based benchmark. On SBU-TimeLapse, which mainly contains static scenes, and ViSha, which includes more dynamic scenes and motion, our method still produces meaningful cast and attached shadow separation without dataset-specific retraining.
To support training and evaluation, we introduce the first dataset curated for cast and attached shadow detection, with 1,458 images collected from three benchmarks. Each image includes a normal map, light direction, and cast, attached, and undefined shadow annotations.
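For illustration, the three-way shadow annotation could be consumed as follows. This is a hypothetical sketch: the integer label encoding (`NON_SHADOW`, `CAST`, `ATTACHED`, `UNDEFINED`) is an assumption, not the dataset's documented format, and should be adjusted to the released annotation files.

```python
import numpy as np

# Hypothetical per-pixel label encoding (not the official format).
NON_SHADOW, CAST, ATTACHED, UNDEFINED = 0, 1, 2, 255

def split_shadow_labels(label_map):
    """Split a per-pixel label map into cast and attached masks,
    plus a validity mask that excludes 'undefined' shadow pixels
    (e.g. from supervision or evaluation)."""
    cast = label_map == CAST
    attached = label_map == ATTACHED
    valid = label_map != UNDEFINED
    return cast, attached, valid

labels = np.array([[CAST, ATTACHED], [NON_SHADOW, UNDEFINED]])
cast, attached, valid = split_shadow_labels(labels)
```

Keeping undefined pixels out of the loss avoids penalizing the model on regions where even human annotators could not decide between the two shadow types.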
@misc{hu2026castattachedshadowdetection,
  title={Cast and Attached Shadow Detection via Iterative Light and Geometry Reasoning},
  author={Shilin Hu and Jingyi Xu and Sagnik Das and Dimitris Samaras and Hieu Le},
  year={2026},
  eprint={2512.06179},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.06179},
}