Conformer-based models have become the dominant end-to-end architecture
for speech processing tasks. In this work, we propose a carefully redesigned
Conformer with a new downsampling schema. The proposed model, named Fast
Conformer, is 2.8x faster than the original Conformer while preserving
state-of-the-art accuracy on Automatic Speech Recognition benchmarks. We also
replace the original Conformer's global attention with limited-context attention
post-training to enable transcription of hour-long audio. We further improve
long-form speech transcription by adding a global token. Fast Conformer
combined with a Transformer decoder also outperforms the original Conformer in
accuracy and in speed for Speech Translation and Spoken Language Understanding.
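
To illustrate the post-training switch from global to limited-context attention and the added global token, the following is a minimal sketch, not the paper's implementation: the function name limited_context_attention, the left_context/right_context window sizes, and the choice of frame 0 as the global token are illustrative assumptions. For clarity the sketch materializes the full attention matrix and masks it; an efficient long-form implementation would compute only the banded portion.

    import torch

    def limited_context_attention(q, k, v, left_context=128, right_context=128,
                                  use_global_token=False):
        # q, k, v: (batch, time, dim). Each frame attends only to frames within
        # [t - left_context, t + right_context]. If use_global_token is True,
        # frame 0 (an assumed placement) attends to, and is attended by, all frames.
        b, t, d = q.shape
        scores = torch.matmul(q, k.transpose(-2, -1)) / d ** 0.5  # (b, t, t)

        # Band mask: True where attention is allowed.
        idx = torch.arange(t)
        rel = idx[None, :] - idx[:, None]  # key index minus query index
        mask = (rel >= -left_context) & (rel <= right_context)

        if use_global_token:
            mask[0, :] = True  # global token sees every frame
            mask[:, 0] = True  # every frame sees the global token

        scores = scores.masked_fill(~mask, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        return torch.matmul(attn, v)

    # Example: a 1000-frame sequence with 64-dim features.
    x = torch.randn(2, 1000, 64)
    out = limited_context_attention(x, x, x, use_global_token=True)
    print(out.shape)  # torch.Size([2, 1000, 64])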