A Quantum Approach to Vision Language Modelling
Speaker: Mehrnoosh Sadrzadeh
Moderator: Ted Theodosopoulos
Abstract: Vision-language models excel at large-scale image-text alignment but often neglect the compositional structure of language, leading to failures on tasks that hinge on word order and predicate-argument structure. We show how techniques from tensor networks and variational quantum circuits help solve this problem. To this end, we introduce two tools, DisCoCLIP and QuCLIP: multimodal encoders that combine a frozen CLIP vision transformer with a tensor network text encoder that explicitly encodes syntactic structure. We also work with translations of syntax into variational quantum circuits. We train both models with a self-supervised contrastive loss and show that they improve on compositional benchmarks such as SVO-Probes and ARO while using significantly fewer parameters. Parameter reduction is a known feature of tensor networks and variational quantum circuits; in this case, the count fell on average from hundreds of millions to tens of thousands.
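To make the word-order point concrete, here is a minimal toy sketch in plain Python. It is not the DisCoCLIP or QuCLIP implementation. It assumes a common compositional-distributional setup in which a transitive verb is an order-3 tensor contracted with subject and object vectors. All names (`encode_svo`, `bag_of_words`) and all numeric values are hypothetical and chosen only for illustration.

```python
# Toy illustration only: every vector, tensor, and function name is hypothetical.

def encode_svo(subject, verb, obj):
    """Contract verb[i][k][j] with subject[i] and obj[j]; index k is the sentence space."""
    sentence_dim = len(verb[0])
    return [
        sum(
            subject[i] * verb[i][k][j] * obj[j]
            for i in range(len(subject))
            for j in range(len(obj))
        )
        for k in range(sentence_dim)
    ]


def bag_of_words(subject, verb_vec, obj):
    """Order-insensitive baseline: element-wise sum of the word vectors."""
    return [s + v + o for s, v, o in zip(subject, verb_vec, obj)]


# Two-dimensional toy noun vectors.
dog = [1.0, 0.0]
cat = [0.0, 1.0]

# Toy verb tensor indexed as chases[subject_index][sentence_index][object_index].
chases = [
    [[0.0, 1.0], [0.0, 0.0]],
    [[0.0, 0.0], [1.0, 0.0]],
]

# Toy verb vector used only by the order-insensitive baseline.
chases_vec = [0.5, 0.5]

dog_chases_cat = encode_svo(dog, chases, cat)
cat_chases_dog = encode_svo(cat, chases, dog)

print(dog_chases_cat, cat_chases_dog)
print(bag_of_words(dog, chases_vec, cat), bag_of_words(cat, chases_vec, dog))
```

In this toy setting, swapping subject and object changes the contracted sentence vector, while the additive baseline gives the same vector for both word orders. The parameters of such an encoder are the entries of the word tensors, which is one way a structured encoder can stay small.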
Speaker's bio:
Mehrnoosh is a Professor of CS, leads UCL CS's Quantum Learning Labs, and is the CS Director of Research. Her research is supported by a Royal Academy of Engineering (RAEng) Research Chair, jointly with the BBC and Quantinuum Ltd. Mehrnoosh’s UG and MSc studies were at Sharif University in Iran, and her PhD was at the University of Quebec at Montreal. Previously, Mehrnoosh held two RAEng Industrial Fellowships, at QMUL and UCL, an EPSRC Career Acceleration Fellowship at Oxford, and an EPSRC PDRF and a Wolfson College Junior Research Fellowship, also at Oxford.
Moderator's bio:
Ted is a mathematician who, after working for years in academia and industry, transitioned to teaching at the pre-college level seventeen years ago, the last nine at Nueva, where he teaches math and economics. Ted’s research background is in the area of interacting stochastic systems, with particular applications in biology and economics.
Video: A Quantum Approach to Vision Language Modelling, from the Relatorium channel
Video duration: 01:08:31