
Empty Recordings File in utils/fix_data_dir.sh Script #4920

Open
pri1712 opened this issue Jun 25, 2024 · 0 comments

pri1712 commented Jun 25, 2024

I am using the WeSpeaker pipeline and the Kaldi toolkit for a speaker diarization task, with ResNet as the feature extractor. When I run utils/fix_data_dir.sh on my data directory, the script filters my segments file down to zero lines because the temporary file /tmp/kaldi.XXXX/recordings has no entries.

Link to the script: https://github.com/kaldi-asr/kaldi/blob/master/egs/wsj/s5/utils/fix_data_dir.sh

This is part of the error output:
utils/fix_data_dir.sh: filtered /data1/XYZ/ABC/speaker_diarization/SHARC_check/tools_wespk/data/ABC_dev_fbank_seg/old_dir/segments from 8310 to 0 lines based on filter /tmp/kaldi.3oKA/recordings.
I found that the file /tmp/kaldi.XXXX/recordings generated by the script is empty, which causes the script to filter out every line of the segments file.

  1. What might be causing the /tmp/kaldi.XXXX/recordings file to be empty?

  2. Are there any known issues or additional steps required to ensure the recordings file is correctly populated?

If required, I can share the formats of my segments and wav.scp files, which the script uses to generate /tmp/kaldi.XXXX/recordings, so they can be checked for formatting mismatches.
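To narrow this down, here is a minimal sketch of the check I have in mind. It rests on an assumption drawn from my reading of the script, not on confirmed behaviour: the recordings list appears to be built from the second column of segments (the recording ID) and then restricted to IDs that are also keys in wav.scp. If so, an empty list would mean the two files share no recording IDs. The function names and the sample IDs below are hypothetical.

```python
from pathlib import Path


def recording_ids_from_segments(lines):
    """Collect the 2nd field (recording ID) of each segments line.

    Kaldi segments format: <utterance-id> <recording-id> <start> <end>
    """
    ids = set()
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            ids.add(parts[1])
    return ids


def recording_ids_from_wav_scp(lines):
    """Collect the 1st field (recording ID) of each wav.scp line.

    Kaldi wav.scp format: <recording-id> <extended-filename or command>
    """
    ids = set()
    for line in lines:
        parts = line.split()
        if parts:
            ids.add(parts[0])
    return ids


def compare_recording_ids(segments_lines, wav_scp_lines):
    """Report which recording IDs are shared and which appear in one file only."""
    seg_ids = recording_ids_from_segments(segments_lines)
    wav_ids = recording_ids_from_wav_scp(wav_scp_lines)
    return {
        "common": sorted(seg_ids & wav_ids),
        "only_in_segments": sorted(seg_ids - wav_ids),
        "only_in_wav_scp": sorted(wav_ids - seg_ids),
    }


def compare_data_dir(data_dir):
    """Run the comparison on <data_dir>/segments and <data_dir>/wav.scp."""
    data_dir = Path(data_dir)
    segments = (data_dir / "segments").read_text().splitlines()
    wav_scp = (data_dir / "wav.scp").read_text().splitlines()
    return compare_recording_ids(segments, wav_scp)
```

Calling compare_data_dir on the data directory and getting an empty "common" list would point to a naming mismatch, for example a file extension or prefix present in one file's IDs but not the other's, or swapped columns in segments.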

Thanks

pri1712 added the bug label Jun 25, 2024