Jiayuan Mao - Neuro-Symbolic Frameworks for Visual Concept Learning and Language Acquisition
03/19/2020
Link to slides: https://drive.google.com/open?id=1vqV22iywgws8e6hGU3bbc621cftrixJf
Humans are capable of learning visual concepts by jointly understanding vision and language. Imagine that someone with no prior knowledge of colors is shown images of red and green objects, paired with descriptions. They can easily identify the difference in the objects' visual appearance (in this case, color) and align it with the corresponding words. This intuition motivates the use of image-text pairs to facilitate automated visual concept learning and language acquisition.
In this talk, I will present recent progress on neuro-symbolic models for visual concept learning, reasoning, and language acquisition. These models learn visual concepts and their association with symbolic representations of language, and they recover the syntactic structures and compositional semantics of sentences, only by looking at images and reading paired natural-language texts. No explicit supervision, such as class labels for objects or parse trees, is needed. I will also discuss extensions to syntactic bootstrapping, metaconcept reasoning, action grounding, and robotic planning.
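To make the neuro-symbolic recipe concrete, below is a minimal, hypothetical sketch (not the speaker's actual code) of the kind of differentiable "filter" primitive used in neuro-symbolic concept learners: a parsed sentence becomes a small program whose operations score detected objects against learned concept embeddings. The names (ConceptFilter), the 64-dimensional features, and the temperature 0.1 are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptFilter(nn.Module):
    """Soft, differentiable 'filter' primitive over detected objects.

    Each concept (e.g. 'red') is a learned embedding; an object matches a
    concept to the degree that its visual feature vector is similar to that
    embedding. All names and dimensions here are illustrative assumptions.
    """

    def __init__(self, feature_dim, concepts):
        super().__init__()
        self.embeddings = nn.ParameterDict(
            {c: nn.Parameter(torch.randn(feature_dim)) for c in concepts}
        )

    def forward(self, object_features, concept):
        # object_features: (num_objects, feature_dim), e.g. from a detector.
        sims = F.cosine_similarity(
            object_features, self.embeddings[concept].unsqueeze(0), dim=-1
        )
        # Temperature-scaled sigmoid yields a soft per-object mask in [0, 1].
        return torch.sigmoid(sims / 0.1)

# A parsed question such as "Is there a red object?" becomes a tiny program:
#   exist(filter(scene, 'red'))
scene = torch.randn(5, 64)              # 5 detected objects, 64-d features
filter_op = ConceptFilter(64, ["red", "green"])
mask = filter_op(scene, "red")          # per-object probability of 'red'
answer = mask.max()                     # differentiable 'exist' aggregation
```

Because every step is differentiable, the concept embeddings can be trained end-to-end from image-question-answer triples alone, which is how such models sidestep explicit object labels and parse-tree supervision.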
Jiayuan Mao is a Ph.D. student at MIT, advised by Professors Josh Tenenbaum and Leslie Kaelbling. Mao's research focuses on structured knowledge representations that can be transferred across tasks, and on inductive biases that improve learning efficiency and generalization. Representative research topics include concept learning, neuro-symbolic reasoning, scene understanding, language acquisition, and robotic planning.
Video: Jiayuan Mao - Neuro-Symbolic Frameworks for Visual Concept Learning and Language Acquisition, from the Vision & Graphics Seminar at MIT channel.
Video information
Uploaded: March 20, 2020, 3:28:49
Duration: 01:05:29