
Is there any paper or document about the theoretical details of the model? #188

Answered by snakers4
qinyuenlp asked this question in Q&A

Hi,

Is there any paper or document that can help me learn the theoretical details of the model?

There is no paper, but there is a short article: https://thegradient.pub/one-voice-detector-to-rule-them-all/

When I use the SlidingWindow method to feed audio data into your model, different window sizes produce different VAD results. I want to know exactly how self._h and self._c in OnnxWrapper change.

Slightly different results for different window sizes are expected.
Please read the docstring in utils_vad.py carefully.

silero-vad/utils_vad.py

Lines 119 to 171 in ea7af70

def get_speech_timestamps(audio: torch.Tensor,
                          model,
                          threshold: float = 0.5,
                          sampling_rate: int = 16000,
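
One way to see why the window size matters: if a wrapper keeps recurrent state between calls (as the self._h and self._c attributes mentioned in the question suggest), each chunk's output depends on everything fed before it, so changing the window size changes both the number of state updates and what each update sees. The sketch below is a toy recurrent cell with made-up weights, not the Silero VAD network or its OnnxWrapper. The names ToyStatefulVAD and run, and the update formulas, are hypothetical and chosen only for illustration.

```python
import math


class ToyStatefulVAD:
    """Toy stand-in for a stateful model wrapper. NOT the Silero network."""

    def __init__(self):
        self.reset_states()

    def reset_states(self):
        # Hypothetical analogue of zeroing the hidden and cell state.
        self._h = 0.0
        self._c = 0.0

    def __call__(self, chunk):
        # Summarise the chunk by its mean energy (an assumption of this toy).
        energy = sum(x * x for x in chunk) / len(chunk)
        # Simplified LSTM-like update with fixed, made-up weights:
        # the gate decides how much of the old cell state is kept.
        gate = 1.0 / (1.0 + math.exp(-(2.0 * energy + 0.5 * self._h)))
        self._c = gate * self._c + (1.0 - gate) * math.tanh(energy + self._h)
        self._h = math.tanh(self._c)
        # Squash the hidden state into a pseudo speech probability.
        return 1.0 / (1.0 + math.exp(-4.0 * (self._h - 0.25)))


def run(audio, window_size):
    """Feed non-overlapping windows through a fresh model, keeping state."""
    model = ToyStatefulVAD()
    probs = []
    for start in range(0, len(audio) - window_size + 1, window_size):
        probs.append(model(audio[start:start + window_size]))
    return probs


# Synthetic signal: 64 samples of silence, a 128-sample sine burst, 64 of silence.
audio = [0.0] * 64 + [math.sin(0.3 * n) for n in range(128)] + [0.0] * 64

probs_16 = run(audio, 16)   # 16 state updates
probs_32 = run(audio, 32)   # 8 state updates over the same audio

print("last probability, window 16:", probs_16[-1])
print("last probability, window 32:", probs_32[-1])
```

In this toy, the same audio ends in a different state depending on the window size, because the state is updated once per window. Calling reset_states() before each new audio stream keeps leftover state from one stream from leaking into the next.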


Answer selected by snakers4

This discussion was converted from issue #187 on May 11, 2022 16:21.