Ashmal Vayani - Seeing the World as It Speaks Multilingual, Culturally Aware Multimodal AI
For all the promise of “global” multimodal AI, most systems still speak to a narrow slice of the world. This talk traces a path to genuinely multilingual, culturally grounded models in three steps. First, ALM-Bench reframes evaluation by testing 100 languages across culturally situated tasks, revealing where today’s large multimodal models (LMMs) falter, especially on low-resource scripts. Next, ViMUL moves beyond images to video, pairing a diverse 14-language, 15-domain benchmark with a balanced baseline to show how training and evaluation can align for robust multilingual video understanding. Finally, we examine language itself as a causal factor: a cross-lingual text-to-image (T2I) study in which grammatical gender shifts visual outputs, surfacing a new axis of bias. Together, these pieces offer a story and a blueprint for inclusive, reliable multimodal systems.
I am an MSc student in the College of Engineering and Computer Science at the University of Central Florida. I am a member of the Center for Research in Computer Vision (CRCV), advised by Prof. Mubarak Shah.
Previously, I was a Research Engineer in the Computer Vision Department, affiliated with the IVAL-Lab at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI).
This session is brought to you by the Cohere Labs Open Science Community, a space where ML researchers, engineers, linguists, social scientists, and lifelong learners connect and collaborate with each other. We'd like to extend a special thank you to Ahmad Anis and Kanwal Mehreen, Lead of our Geo Regional Asia group, for their dedication in organizing this event.
If you’re interested in sharing your work, we welcome you to join us! Simply fill out the form at https://forms.gle/ALND9i6KouEEpCnz6 to express your interest in becoming a speaker.
Join the Cohere Labs Open Science Community to see a full list of upcoming events (https://tinyurl.com/CohereLabsCommunityApp).
Video "Ashmal Vayani - Seeing the World as It Speaks Multilingual, Culturally Aware Multimodal AI" from the Cohere channel
Video information
October 18, 2025, 21:51:31
00:41:17