The GitHub repository for the paper *A Real-Time Video Portrait Matting Method Based on Large-Kernel Convolution*. LKM is a video portrait matting network that achieves high-precision, real-time matting by combining large-kernel convolutions, wavelet transforms, and re-parameterization. Large-kernel convolutions strengthen the encoding and decoding of image features; the wavelet transform passes image information to the decoder, supplying it with additional useful detail; and re-parameterization fuses parallel large-kernel convolution branches into a single branch, cutting the computational burden and speeding up inference. On an NVIDIA RTX 3090 GPU the model reaches 103 fps on HD video and 94 fps on 4K video, enabling real-time video matting.
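The deployment speedup comes from structural re-parameterization: during training a block runs a large-kernel convolution in parallel with smaller-kernel branches, and before inference the branches are algebraically folded into one convolution. Below is a minimal PyTorch sketch of that fusion arithmetic; the function names and the two-branch layout are illustrative assumptions, not LKM's actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d):
    """Fold a BatchNorm into the preceding conv's weight and bias."""
    std = (bn.running_var + bn.eps).sqrt()
    scale = bn.weight / std                               # per-channel scale
    w = conv.weight * scale.reshape(-1, 1, 1, 1)
    b0 = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    b = (b0 - bn.running_mean) * scale + bn.bias
    return w, b

def merge_parallel_branches(large_w, large_b, small_w, small_b):
    """Merge a small-kernel branch into a large-kernel branch.

    Valid when both branches use the same stride and 'same' padding:
    conv(x, W_L) + conv(x, W_S) == conv(x, W_L + zero_pad(W_S)).
    """
    pad = (large_w.shape[-1] - small_w.shape[-1]) // 2
    return large_w + F.pad(small_w, [pad] * 4), large_b + small_b
```

After fusion, the training-time branches are discarded and a single `nn.Conv2d` carries the merged weight and bias, which is why the deployed model produces identical outputs while running faster.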
- Unzip the package and create a virtual environment:
```bash
unzip LKM-main.zip
cd LKM-main
conda create -n LKM python=3.7
conda activate LKM
```
- Install the PyTorch build that matches your CUDA version; here we use CUDA 11.1:
```bash
pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html
```
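Before continuing, it may be worth verifying that this build sees the GPU (a small optional check):

```python
import torch
print(torch.__version__)            # expect a +cu111 build
print(torch.cuda.is_available())    # expect True
```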
- Install the libraries required by UniRepLKNet:
```bash
pip install -U openmim
mim install mmcv-full==1.5.0
mim install mmsegmentation==0.27.0
pip install timm==0.6.11 mmdet==2.28.1
```
- Install the remaining dependencies:
```bash
pip install -r requirements.txt
```
Please refer to the training documentation to train and evaluate your own model.
Video:
```bash
python inference.py \
    --variant unireplknet \
    --checkpoint "./pretrained/model.pth" \
    --device cuda \
    --input-source "input.mp4" \
    --output-type video \
    --output-composition "out_com.mp4" \
    --output-alpha "out_pha.mp4" \
    --output-foreground "out_fgr.mp4" \
    --output-video-mbps 4 \
    --seq-chunk 12
```
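The `--seq-chunk` flag lets the recurrent network process several frames per forward pass. The sketch below shows roughly what that loop looks like; it assumes an RVM-style call signature `model(src, *rec, downsample_ratio)` returning `(fgr, pha, *rec)`, which is an assumption for illustration, not LKM's documented API.

```python
import torch

@torch.no_grad()
def matte_video(model, frames, seq_chunk=12, downsample_ratio=0.25):
    """Matte a [T, 3, H, W] frame tensor in chunks of seq_chunk frames.

    Assumes an RVM-style recurrent interface (hypothetical for LKM):
    model(src, *rec, downsample_ratio) -> (fgr, pha, *rec).
    """
    rec = [None] * 4                      # recurrent state carried across chunks
    fgrs, phas = [], []
    for i in range(0, frames.size(0), seq_chunk):
        src = frames[i:i + seq_chunk].unsqueeze(0)        # [1, T', 3, H, W]
        fgr, pha, *rec = model(src, *rec, downsample_ratio=downsample_ratio)
        fgrs.append(fgr[0])
        phas.append(pha[0])
    return torch.cat(fgrs), torch.cat(phas)
```

Larger chunks amortize per-call overhead and raise throughput at the cost of GPU memory.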
Webcam:
```bash
python inference_wecab.py
```
Speed is measured with `inference_speed_test.py` for reference.
| GPU | dType | HD (1920x1080) | 4K (3840x2160) |
| --- | --- | --- | --- |
| RTX 3090 | FP16 | 103 FPS | 94 FPS |
- Note 1: HD uses `downsample_ratio=0.25`; 4K uses `downsample_ratio=0.125`. All tests use batch size 1 and frame chunk 1.
- Note 2: Following RVM's speed-test methodology, we measure pure tensor throughput. Feeding frames directly from a camera lowers the measured speed, but inference remains fast enough for real time. Hardware video encoding and decoding could raise throughput further; see PyNvCodec for details.
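Tensor-throughput timing of the kind Note 2 describes amounts to pushing random frames through the model and measuring synchronized wall-clock time, skipping video decode and encode entirely. A minimal sketch, with the same assumed RVM-style call signature as above:

```python
import time
import torch

@torch.no_grad()
def benchmark_fps(model, width, height, downsample_ratio, iters=100):
    """Pure tensor throughput in FPS (no camera, no codec)."""
    model = model.eval().half().cuda()
    src = torch.randn(1, 1, 3, height, width,             # batch 1, chunk 1
                      device='cuda', dtype=torch.half)
    rec = [None] * 4
    for _ in range(10):                                    # warm-up
        model(src, *rec, downsample_ratio=downsample_ratio)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        model(src, *rec, downsample_ratio=downsample_ratio)
    torch.cuda.synchronize()
    return iters / (time.time() - start)
```

For HD this would be called with `(1920, 1080, 0.25)` and for 4K with `(3840, 2160, 0.125)`, matching Note 1.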