
Conformer & convergence #169

Open
danpovey opened this issue Apr 21, 2021 · 0 comments

Comments


danpovey commented Apr 21, 2021

Got this via email from @zhu-han ...

I got the first reasonable result:
2021-04-21 02:28:00,831 INFO [common.py:365] [test-clean] %WER 13.39% [7041 / 52576, 887 ins, 965 del, 5189 sub ]
2021-04-21 02:29:42,636 INFO [common.py:365] [test-other] %WER 35.57% [18619 / 52343, 1327 ins, 3534 del, 13758 sub ]
The training log is in the attachment.
The code for this version is at https://github.com/zhu-han/snowfall/commit/4d4a0c42c175571e396736c757ceb6698afc9b18
The differences from the original version in the paper are:
# this version vs. original version
1) 4× subsampling vs. 8× subsampling;
2) kernel size 3 vs. kernel size 5.
The model could not converge without these two changes,
but the performance gap between this and the original Conformer is still large.
Do you have some advice on that?
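For reference, here is a minimal PyTorch sketch of what the two changes in the message above amount to; the class and argument names are illustrative, not the ones in the linked commit. Each stride-2 Conv2d in the front end halves the frame rate, so two layers give 4× subsampling and three give 8×; the kernel-size change refers to the depthwise convolution inside the Conformer convolution module.

```python
import torch
import torch.nn as nn


class ConvSubsampling(nn.Module):
    """Stacked stride-2 Conv2d front end: num_layers=2 gives roughly 4x
    subsampling of the time axis, num_layers=3 gives roughly 8x."""

    def __init__(self, num_features: int, d_model: int, num_layers: int = 2):
        super().__init__()
        layers = []
        in_channels = 1
        for _ in range(num_layers):
            layers += [
                nn.Conv2d(in_channels, d_model, kernel_size=3, stride=2, padding=1),
                nn.ReLU(),
            ]
            in_channels = d_model
        self.conv = nn.Sequential(*layers)
        # After num_layers stride-2 convs the feature axis shrinks by 2**num_layers.
        self.out = nn.Linear(d_model * (num_features // (2 ** num_layers)), d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, num_features)
        x = self.conv(x.unsqueeze(1))               # (batch, d_model, time', feat')
        b, c, t, f = x.shape
        return self.out(x.permute(0, 2, 1, 3).reshape(b, t, c * f))


x = torch.randn(4, 160, 80)                         # (batch, frames, mel bins)
print(ConvSubsampling(80, 256, num_layers=2)(x).shape)  # ~4x fewer frames
print(ConvSubsampling(80, 256, num_layers=3)(x).shape)  # ~8x fewer frames

# The second change is the kernel size of the depthwise convolution inside the
# Conformer convolution module (again, hypothetical dimensions):
depthwise = nn.Conv1d(256, 256, kernel_size=3, padding=1, groups=256)  # vs. kernel_size=5
```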

For the convergence problems: we are working on a way to make things converge much more easily; @csukuangfj was going to commit it. It involves using a simpler model as an alignment model and adding its output to that of the model being trained, with a scale less than one, near the start of training.
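A minimal sketch of that idea, assuming a pretrained, frozen alignment model and a hypothetical `combined_log_probs` helper (the function name, `scale` value, and schedule are illustrative, not the committed implementation):

```python
import torch
import torch.nn as nn


def combined_log_probs(main_model: nn.Module,
                       align_model: nn.Module,
                       features: torch.Tensor,
                       scale: float = 0.3) -> torch.Tensor:
    """Add the simpler alignment model's output to the main model's output
    with a weight `scale` < 1, giving the main model a rough alignment
    signal before its own outputs are informative."""
    with torch.no_grad():
        align_out = align_model(features)   # simpler model, not being trained
    main_out = main_model(features)         # model being trained (e.g. the Conformer)
    return main_out + scale * align_out
```

As described above, this mixing is only used near the start of training; presumably `scale` is reduced to zero after the first part of training so the main model's own output takes over.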
