Jane Dwivedi Yu - Teaching Language models to use tools | ML in PL 23

Large language models (LLMs) have fueled dramatic progress in natural language tasks and are already at the core of many user-facing products, such as ChatGPT and Copilot. Paradoxically, language models often still struggle with basic tasks, like solving simple arithmetic problems, where smaller and simpler external resources, such as a calculator, can accomplish the task perfectly. This talk will focus on LLMs that leverage external resources, beginning with models that are always prompted to use an external tool, like retrieval-augmented models. The second part of this talk will concentrate on teaching models to autonomously understand how and when to leverage tools in a self-supervised way. Finally, we will discuss exciting new opportunities that necessitate external tool usage.

Jane Dwivedi-Yu is a researcher at Meta AI. Her current research focuses on enhancing the capabilities of language models along several dimensions, including tool usage, editing, and evaluating representation harms and the notions of morality and norms internalized by these models. She is also interested in building large-scale personalized recommender systems by leveraging principles from affective computing, work which was cited among the top 15 AI papers to read in 2022. Before joining Meta, she completed her PhD in Computer Science at the University of California, Berkeley, and her bachelor's degree at Cornell University.

The talk was delivered at the ML in PL Conference 2023 as part of the Contributed Talks. The conference was organized by the ML in PL Association, a non-profit organization.

ML in PL Association website: https://mlinpl.org/
ML in PL Conference 2023 website: https://conference2023.mlinpl.org/
ML in PL Conference 2024 website: https://conference.mlinpl.org/
---

Founded on the experience of organizing the ML in PL Conference (formerly PL in ML), the ML in PL Association is a non-profit organization devoted to fostering the machine learning community in Poland and Europe and promoting a deep understanding of ML methods. Even though ML in PL is based in Poland, it seeks to provide opportunities for international cooperation.

Video information
May 7, 2024, 21:00:34
00:44:51