[slides] JSALT 2025 - Plenary Talk - B. Schuppler: Cross-layer Models for Low-Res Conversational ASR

📍 Live from FIT, Brno University of Technology (Czech Republic), room E112
🕘 July 1st, 2025 — 11:00 CEST
🎙️ Barbara Schuppler (TU Graz, Austria)

In recent years, conversational speech has become a major focus in speech science and technology. As dialogue systems evolve from transactional tools into socially interactive agents, they demand increasingly accurate automatic speech recognition (ASR). At the same time, conversational data offers unique insights into human speech processing. Drawing on the cross-layer optimization principle from communications engineering, I adopt a similar view of how meaning is accessed across multiple levels of speech information. In this talk, I present findings from my group’s work on integrating pronunciation and prosodic variation into ASR for conversational speech. Our hybrid approach—combining data-driven and knowledge-based methods—proves especially effective in low-resource settings. While transformer-based models often outperform classical systems, the latter still excel with short, fragmented utterances when paired with linguistic knowledge. Beyond ASR, our methods inform fields like pathological speech analysis, dementia prediction, and assistive speech technologies.

Bio: Barbara Schuppler studied Physics and Spanish Philology at the University of Graz and the Universidad Autónoma de Madrid, completing a diploma thesis in experimental physics in 2007. She conducted her dissertation within the Marie-Curie RTN "Sound-to-Sense" at Radboud University Nijmegen, with research visits at NTNU Trondheim. After working as a teacher at the Graz International Bilingual School, she was awarded an FWF Hertha-Firnberg Grant in 2012 and joined the Signal Processing and Speech Communication Laboratory at TU Graz. She is now Associate Professor at TU Graz. Her research interests include methods for the quantitative analysis of prosody and pronunciation variation in conversational speech and the integration of the resulting phonetic and linguistic knowledge into speech technology, with a specific focus on applications in the educational and healthcare sectors.

Video "[slides] JSALT 2025 - Plenary Talk - B. Schuppler: Cross-layer Models for Low-Res Conversational ASR" from the channel Center for Language & Speech Processing (CLSP), JHU