The past decade has witnessed substantial growth of data-driven speech
enhancement (SE) techniques thanks to deep learning. While existing approaches
have shown impressive performance on common datasets, most of them are
designed for only a single condition (e.g., single-channel, multi-channel, or a
fixed sampling frequency) or consider only a single task (e.g., denoising or
dereverberation). Currently, there is no universal SE approach that can
effectively handle diverse input conditions with a single model. In this paper,
we make the first attempt to investigate this line of research. First, we
devise a single SE model that is independent of the number of microphone
channels, the signal length, and the sampling frequency. Second, we design a
universal SE benchmark
by combining existing public corpora with multiple conditions. Our experiments
on a wide range of datasets show that the proposed single model can
successfully handle diverse conditions with strong performance.