On Parameter Efficiency of Neural Language Models
Abstract: Pre-trained neural language models have demonstrated remarkable generalizability across various downstream tasks, such as natural language understanding and question answering. However, these models have grown to contain hundreds of billions of parameters, making them difficult to deploy in applications with latency requirements and memory constraints. Furthermore, existing research has demonstrated significant parameter redundancy in neural language models. Such redundancy can further compromise their downstream generalizability. To tackle these challenges, my research focuses on training neural language models toward higher parameter efficiency and better model generalizability. In this talk, I will introduce three major directions of my research: 1) designing model training algorithms for better parameter utilization, 2) developing pruning and distillation strategies for reliable model compression, and 3) improving cross-task generalizability of parameter-efficient fine-tuning methods.
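To make the second direction concrete, a common baseline for model compression is magnitude pruning: removing the weights with the smallest absolute values. The sketch below is a hypothetical illustration of that general idea, not the speaker's specific method; the function name and sparsity parameter are my own.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    `sparsity` is the fraction of weights to remove (e.g. 0.4 zeroes 40%).
    Ties at the threshold may zero slightly more than the requested fraction.
    """
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    # threshold = k-th smallest absolute value
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

# The two smallest-magnitude weights (0.03 and -0.1) are zeroed:
pruned = magnitude_prune([0.5, -0.1, 0.03, -2.0, 0.7], sparsity=0.4)
```

In practice, pruned models are typically fine-tuned further to recover accuracy, which is where the reliability questions the talk addresses arise.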
Bio: Chen Liang is a final-year PhD student at the Georgia Institute of Technology, working with Prof. Tuo Zhao in the FLASH research group. Her research interests broadly lie in deep learning and natural language processing, with a major focus on developing methodologies and algorithms to improve model generalizability and parameter efficiency for pre-trained neural language models. Before starting her PhD, Chen received her BS degree in Electrical Engineering from the University of Southern California.
link: https://cliang1453.github.io/
Video: On Parameter Efficiency of Neural Language Models, from the Allen Institute for AI channel