Moral Self-Correction in Large Language Models | paper explained
We explain why large language models (LLMs) suffer from "multiple personality disorder" and how they can morally self-correct when instructed to. We elucidate technical terms such as RLHF (reinforcement learning from human feedback), instruction following, and Chain-of-Thought (CoT) prompting. "Moral Self-Correction in Large Language Models", paper explained.
► Sponsor: Salad 👉 https://bit.ly/SaladCloud-Letitia
Check out our #MachineLearning Quiz Questions: https://www.youtube.com/c/AICoffeeBreak/community
📜 Ganguli, Deep, Amanda Askell, Nicholas Schiefer, Thomas Liao, Kamilė Lukošiūtė, Anna Chen, Anna Goldie et al. "The capacity for moral self-correction in large language models." arXiv preprint arXiv:2302.07459 (2023). https://arxiv.org/abs/2302.07459
Read more about RLHF: https://huggingface.co/blog/rlhf 🤗
Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
Dres. Trost GbR, Siltax, Edvard Grødem, Vignesh Valliappan, Mutual Information, Mike Ton,
Kshitij
Outline:
00:00 Large LMs are biased
01:48 Salad (Sponsor)
02:49 Question answering: Q
04:23 Language models have multiple personality disorder
05:27 RLHF explained
08:23 Instruction Following (IF) explained
09:37 CoT: Chain of Thought prompting explained
10:05 Effect of size on moral self-correction
13:07 Effect of RLHF on instruction following
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, pay us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak
#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research
Music 🎵 : Illusions - Anno Domini Beats
Video editing: Nils Trost
Video "Moral Self-Correction in Large Language Models | paper explained" from the channel AI Coffee Break with Letitia
Video info: published April 25, 2023, 17:11:00 · Duration: 00:14:50