News

Speech separation project wins Best Paper Award by IEEE

Posted online: 18.01.2023

From a selection of thousands of papers, the IEEE Signal Processing Society has awarded the Best Paper Award to Morten Kolbæk, Dong Yu, Zheng-Hua Tan and Jesper Jensen.

Separating jumbled voices has been a conundrum for several years, but the now award-winning paper has made a substantial and impactful contribution to solving it.

The paper, published in 2017, has stood the test of time, and the research has now been recognized by the largest professional community in the field.

“I am very happy to have contributed to expanding knowledge in the field. When I was a Ph.D. student, I wondered if I would ever receive such an accolade. Standing here now, I am humbled by the honor. It is a tremendous feeling to have helped society and fellow engineers with our research,” says Morten Kolbæk, who now works as a machine learning engineer at Whisper.ai.

Scalability and simplicity are central

In the paper, the researchers propose an utterance-level permutation invariant training (uPIT) technique that, in simple terms, can split jumbled and mixed voices into separate audio tracks, one for each voice.

In essence, the algorithm can separate several simultaneous voices into distinct channels. For example, it would be able to take a recording of different voices in a conversation at a conference and then output each voice to its own channel. This can be extremely useful in many situations, for example to extract a talker of interest in hearing aid systems, to support audio editing, or to help improve software for online meetings.
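The core idea behind permutation invariant training can be illustrated with a short sketch. The Python code below is not the authors' implementation. It is a minimal illustration, assuming that the separated outputs and the reference voices are available as NumPy arrays of shape (speakers, frames, features) and that the error is a plain mean squared error. The function name upit_loss and the toy data are hypothetical. The sketch tries every possible assignment of output channels to speakers, scores each assignment over the whole utterance, and keeps the one with the lowest error.

```python
from itertools import permutations

import numpy as np


def upit_loss(estimates, targets):
    """Utterance-level permutation invariant MSE (illustrative sketch).

    estimates, targets: arrays of shape (speakers, frames, features).
    Returns the lowest mean squared error over all speaker assignments
    and the assignment (a tuple) that achieves it.
    """
    estimates = np.asarray(estimates, dtype=float)
    targets = np.asarray(targets, dtype=float)
    num_speakers = targets.shape[0]

    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(num_speakers)):
        # Output channel i is compared with reference speaker perm[i].
        # The error is averaged over the entire utterance, so a single
        # assignment is chosen for all frames at once.
        loss = np.mean((estimates - targets[list(perm)]) ** 2)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy example: two "voices" whose output channels come out swapped.
targets = np.arange(24.0).reshape(2, 3, 4)
estimates = targets[[1, 0]]
loss, perm = upit_loss(estimates, targets)
print(loss, perm)
```

Because the assignment is chosen once per utterance rather than frame by frame, each output channel follows the same speaker from start to finish. During training, a loss of this kind would be minimized with respect to the parameters of the separation model. Checking all assignments grows factorially with the number of speakers, which is acceptable for the small speaker counts in this sketch.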

“It is always very exciting to contribute to expanding knowledge in a field. It is the pinnacle of what we strive to do, and I am proud and happy that we’re not alone in recognizing the importance of this paper,” says Jesper Jensen, co-author and professor at the Department of Electronic Systems, AAU.

One of the advantages of the model is its scalability: the AI can be trained with large datasets to improve its ability to represent complex listening situations, a feat not easily achieved.

“It’s a great honor and I’m very happy to have contributed to that. At the time it felt like an uphill battle, but once we succeeded it felt like an immense victory. A lot of effort went into it, and I’m glad that the IEEE has recognized our work,” says Zheng-Hua Tan, co-author and professor at the Department of Electronic Systems, AAU.

The collected works have since been cited more than 1,200 times, with the award-winning paper alone accounting for 649 citations at the time of writing. The paper has served as a standard point of comparison in several cases and has paved the way for more advanced technologies.

You can read more about the project here.