Scaling unlocks emergent abilities in language models
Date Presented: 02/02/2023
Speaker: Jason Wei, Google Brain
Abstract:
Scaling up language models has been shown to predictably improve performance on a wide range of downstream tasks. In this talk, we will instead discuss an unpredictable phenomenon that we refer to as emergent abilities of large language models. An ability is considered emergent if it is not present in smaller models but is present in larger models, which means that the ability cannot be predicted simply by extrapolating the performance of smaller models. With the popularization of large language models such as GPT-3, Chinchilla, and PaLM, dozens of emergent abilities have been discovered, including chain-of-thought prompting, which enables state-of-the-art mathematical reasoning, and instruction finetuning, which enables large language models to be usable by the broader population. The existence of such emergent phenomena raises the question of whether additional scaling could potentially further expand the range of capabilities of language models.
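The abstract names chain-of-thought prompting as one of these emergent abilities. As a concrete illustration, here is a minimal sketch in Python of how such a few-shot prompt is assembled; the exemplar is the worked arithmetic problem from Wei et al. (2022), and the assembled prompt would then be sent to a sufficiently large model (the model API itself is not shown and is outside the scope of this sketch).

```python
# Minimal sketch of few-shot chain-of-thought prompting, the technique
# named in the abstract. A worked exemplar with explicit intermediate
# reasoning steps is prepended to the target question, so the model
# imitates step-by-step reasoning before giving its final answer.

COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "6 tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
)

def chain_of_thought_prompt(question: str) -> str:
    # Prepend the worked solution; without it, a standard few-shot prompt
    # would show only the final answer and elicit no reasoning trace.
    return COT_EXEMPLAR + f"Q: {question}\nA:"

print(chain_of_thought_prompt(
    "The cafeteria had 23 apples. If they used 20 to make lunch and "
    "bought 6 more, how many apples do they have?"
))
```

Notably, this prompting strategy is itself emergent in the sense defined above: on small models it yields little or no benefit, while past a certain scale it sharply improves performance on reasoning benchmarks.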
Speaker's Bio:
Jason Wei is a senior research scientist at Google Brain. His research focuses on large language models and includes chain-of-thought prompting, instruction finetuning, and emergent abilities of language models. His work has been featured in more than five Google AI blog posts, and chain-of-thought prompting was presented last year at Google I/O 2022 by Google CEO Sundar Pichai. Before Google, Jason received his AB from Dartmouth College in Hanover, New Hampshire.
Video: Scaling unlocks emergent abilities in language models, from the USC Information Sciences Institute channel