BiVM: Accurate Binarized Neural Network for Efficient Video Matting
Date
2025-10
Publication Type
Journal Article
ETH Bibliography
yes
Abstract
Deep neural networks for real-time video matting suffer from significant computational limitations on edge devices, hindering their adoption in widespread applications such as online conferences and short-form video production. Binarization has emerged as one of the most common compression approaches, offering compact 1-bit parameters and efficient bitwise operations. However, binarized video matting networks face accuracy and efficiency limitations due to their degenerated encoder and redundant decoder. A theoretical analysis based on the information bottleneck principle shows that these limitations are mainly caused by the degradation of prediction-relevant information in the intermediate features and by redundant computation in prediction-irrelevant areas. We present BiVM, an accurate and resource-efficient Binarized neural network for Video Matting. First, we present a series of binarized computation structures with elastic shortcuts and evolvable topologies, enabling the constructed encoder backbone to extract high-quality representations from input videos for accurate prediction. Second, we sparsify the intermediate features of the binarized decoder by masking homogeneous parts, allowing the decoder to focus on representations with diverse details while reducing the computational burden for efficient inference. Furthermore, we construct a localized binarization-aware mimicking framework with an information-guided strategy, prompting the matting-related representations in full-precision counterparts to be accurately and fully utilized. Comprehensive experiments show that the proposed BiVM surpasses alternative binarized video matting networks, including state-of-the-art (SOTA) binarization methods, by a substantial margin. Moreover, BiVM achieves significant savings of 14.3x and 21.6x in computation and storage costs, respectively. We also evaluate BiVM on ARM CPU hardware.
Publication status
published
Volume
47 (10)
Pages / Article No.
9250 - 9265
Publisher
IEEE
Subject
Video matting; binarization; model compression; low-bit quantization
Organisational unit
01225 - D-ITET Zentr. f. projektbasiertes Lernen / D-ITET Center for Project-Based Learning
Funding
219943 - Neuromorphic Attention Models for Event Data (NAMED) (SNF)
