SCOTCH and SODA: A Transformer Video Shadow Detection Framework

Lihao Liu1
Jean Prost2
Lei Zhu3,4
Nicolas Papadakis2
Pietro Liò1
Carola-Bibiane Schönlieb1
Angelica I Aviles-Rivero1

1University of Cambridge
2Université de Bordeaux
3HKUST (GZ)
4HKUST




Abstract

Shadows in videos are difficult to detect because of the large shadow deformation between frames. In this work, we argue that accounting for the shadow deformation is essential when designing a video shadow detection method. To this end, we introduce the shadow deformation attention trajectory (SODA), a new type of video self-attention module, specially designed to handle the large shadow deformations in videos. Moreover, we present a shadow contrastive learning mechanism (SCOTCH), which aims at guiding the network to learn a high-level representation of shadows, unified across different videos. We demonstrate empirically the effectiveness of our two contributions in an ablation study. Furthermore, we show that SCOTCH and SODA significantly outperform existing techniques for video shadow detection.
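To make the idea of a shadow representation shared across videos more concrete, the following is a minimal sketch of a generic supervised contrastive loss. It is not the SCOTCH loss from the paper, whose exact formulation is not given on this page. The function names, the temperature value, the toy feature vectors, and the shadow/non-shadow labelling are all illustrative assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss (an assumption, not the paper's loss).

    features: list of feature vectors, e.g. one per image region
    labels:   1 for shadow, 0 for non-shadow (hypothetical labelling)
    """
    n = len(features)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Scaled similarities between the anchor and every other sample.
        logits = {j: cosine(features[i], features[j]) / temperature
                  for j in range(n) if j != i}
        # Numerically stable log of the softmax denominator.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits.values()))
        # Average negative log-probability assigned to the positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0

# Toy 2-D features: in `clustered`, samples with the same label are close;
# in `mixed`, the two shadow samples point in different directions.
clustered = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
mixed = [[1.0, 0.1], [0.1, 1.0], [0.9, 0.2], [0.2, 0.9]]
labels = [1, 1, 0, 0]

loss_clustered = supervised_contrastive_loss(clustered, labels)
loss_mixed = supervised_contrastive_loss(mixed, labels)
```

In this toy setting the loss is lower when shadow features sit close together and away from non-shadow features, which is the behaviour a contrastive objective of this kind rewards.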



Overview of Network Architecture




Deformation Attention Trajectory




Video Shadow Detection Results




Paper and Code

Lihao Liu, Jean Prost, Lei Zhu, Nicolas Papadakis, Pietro Liò,
Carola-Bibiane Schönlieb, and Angelica I Aviles-Rivero.


SCOTCH and SODA: A Transformer Video Shadow Detection Framework.

Computer Vision and Pattern Recognition (CVPR), 2023.

[arxiv] [bibtex] [code]



Acknowledgments

LL gratefully acknowledges the financial support from a GSK scholarship and a Girton College Graduate Research Fellowship at the University of Cambridge. AIAR acknowledges support from CMIH and CCIMI, University of Cambridge. CBS acknowledges support from the Philip Leverhulme Prize, the Royal Society Wolfson Fellowship, the EPSRC advanced career fellowship EP/V029428/1, EPSRC grants EP/S026045/1, EP/T003553/1, EP/N014588/1, EP/T017961/1, the Wellcome Innovator Awards 215733/Z/19/Z and 221633/Z/20/Z, the European Union Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 777826 NoMADS, the Cantab Capital Institute for the Mathematics of Information and the Alan Turing Institute. JP and NP acknowledge support from the EU Horizon 2020 research and innovation programme NoMADS (Marie Skłodowska-Curie grant agreement No. 777826).