A much better LLM Leaderboard!!!
🏆 This leaderboard is based on the following three benchmarks.
Chatbot Arena - a crowdsourced, randomized battle platform. We use 100K+ user votes to compute Elo ratings.
MT-Bench - a set of challenging multi-turn questions. We use GPT-4 to grade the model responses.
MMLU (5-shot) - a test to measure a model's multitask accuracy on 57 tasks.
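To make the Elo step concrete, here is a minimal sketch of how pairwise votes can be turned into ratings with the standard online Elo update. The function name `compute_elo`, the battle tuple layout, and the constants (`k=4`, `scale=400`, `base=10`, `init_rating=1000`) are assumptions chosen for illustration; they are not confirmed as the leaderboard's exact settings.

```python
from collections import defaultdict


def compute_elo(battles, k=4, scale=400, base=10, init_rating=1000):
    """Online Elo update over a sequence of pairwise battles.

    battles: iterable of (model_a, model_b, winner) tuples, where winner is
    "model_a", "model_b", or anything else for a tie (hypothetical layout).
    """
    ratings = defaultdict(lambda: init_rating)
    for model_a, model_b, winner in battles:
        ra, rb = ratings[model_a], ratings[model_b]
        # Expected score of model_a under the logistic Elo model.
        expected_a = 1 / (1 + base ** ((rb - ra) / scale))
        if winner == "model_a":
            score_a = 1.0
        elif winner == "model_b":
            score_a = 0.0
        else:
            score_a = 0.5  # tie
        # Zero-sum update: whatever model_a gains, model_b loses.
        delta = k * (score_a - expected_a)
        ratings[model_a] = ra + delta
        ratings[model_b] = rb - delta
    return dict(ratings)


# Toy votes with made-up model names, only to show the call shape.
votes = [
    ("model-x", "model-y", "model_a"),
    ("model-y", "model-z", "tie"),
    ("model-z", "model-x", "model_b"),
]
for name, rating in sorted(compute_elo(votes).items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.1f}")
```

Because this update is sequential, the final ratings depend on the order of the votes; a real leaderboard may shuffle or bootstrap the votes to smooth that out.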
🔗 Links 🔗
Chatbot Arena Leaderboard from LMSYS - https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard
Arena Leaderboard Elo Ranking Method - https://colab.research.google.com/drive/1RAWb22-PFNI-X1gPVzc927SGUdfr6nsR?usp=sharing
Play at the Arena - https://chat.lmsys.org/?arena
Intro Sound from Honest Trailers - https://youtu.be/lZMzf-SDWP8
❤️ If you want to support the channel ❤️
Support here:
Patreon - https://www.patreon.com/1littlecoder/
Ko-Fi - https://ko-fi.com/1littlecoder
🧭 Follow me on 🧭
Twitter - https://twitter.com/1littlecoder
LinkedIn - https://www.linkedin.com/in/amrrs/
Video "A much better LLM Leaderboard!!!" from the 1littlecoder channel