Talk

LangAI Seminar #3: "Scaling Multilingual Speech Recognition: From a Handful to Thousands of Languages", Dr. Shinji Watanabe, Carnegie Mellon University.

We are pleased to invite Dr. Shinji Watanabe (Carnegie Mellon University) to join us for a seminar. If you would like to attend, please register using the Google Form below. Registration is limited to Tohoku University members (please apply from a tohoku.ac.jp campus address).

開催日時/Date: July 15, 2025 (Tue), 16:30–17:30
講演題目/Title: Scaling Multilingual Speech Recognition: From a Handful to Thousands of Languages
現地場所/Location: Lecture Room M206, 2F, Multimedia Education and Research Complex (Building A05), Kawauchi Campus, Tohoku University
[Multimedia Education and Research Complex]
https://www.tohoku.ac.jp/japanese/profile/campus/01/kawauchi/areaa.html
Building A05 on the map; ▲ marks the building entrance.
[Multimedia Hall floor guide]
https://www2.he.tohoku.ac.jp/center/mm_intro/mm_intro.html
対象者/Target Audience: Researchers, students, and affiliated members on campus (Tohoku University only)
備考/Note: On-site registration is also available on the day of the event.

Title:

Scaling Multilingual Speech Recognition: From a Handful to Thousands of Languages

Abstract:

This presentation outlines our research journey in advancing multilingual speech recognition. Our first end-to-end multilingual ASR system, developed in 2017, supported just 10 languages. By leveraging paired speech and transcription data, we later scaled the approach to cover approximately 100 languages. To facilitate broader research, we introduced Multilingual SUPERB, a benchmark built on these languages. However, scaling ASR to encompass all 7,000+ languages worldwide remains a major challenge due to the lack of such paired data for most languages. To address this gap, the ASR2K project proposed a universal phone-based ASR model, integrating lexicons and language models—marking the first step toward recognizing speech across thousands of languages. More recently, self-supervised learning (SSL) approaches have made it possible to incorporate additional languages, at least in the pre-training phase. Despite these advances, data imbalance and bias remain persistent challenges. In this talk, we present our latest work on scaling model sizes—up to 18 billion parameters—as a strategy to mitigate such biases. Although full coverage of thousands of languages is still out of reach, we hope this talk will spark further efforts in the community toward addressing this critical and long-standing problem.

Speaker:

Dr. Shinji Watanabe, Associate Professor at Carnegie Mellon University

Short bio:

Shinji Watanabe is an Associate Professor at Carnegie Mellon University, Pittsburgh, PA. He received his B.S., M.S., and Ph.D. (Dr. Eng.) degrees from Waseda University, Tokyo, Japan. He was a research scientist at NTT Communication Science Laboratories, Kyoto, Japan, from 2001 to 2011, a visiting scholar at the Georgia Institute of Technology, Atlanta, GA, in 2009, and a senior principal research scientist at Mitsubishi Electric Research Laboratories (MERL), Cambridge, MA, USA, from 2012 to 2017. Before joining Carnegie Mellon University, he was an associate research professor at Johns Hopkins University, Baltimore, MD, USA, from 2017 to 2020. His research interests include automatic speech recognition, speech enhancement, spoken language understanding, and machine learning for speech and language processing. He has published over 500 papers in peer-reviewed journals and conferences and has received several awards, including the Best Paper Award at ISCA Interspeech 2024. He is a Senior Area Editor of the IEEE Transactions on Audio, Speech, and Language Processing. He has served on several technical committees, including the APSIPA Speech, Language, and Audio Technical Committee (SLA), the IEEE Signal Processing Society Speech and Language Technical Committee (SLTC), and the Machine Learning for Signal Processing Technical Committee (MLSP). He is an IEEE and ISCA Fellow.
