Fine-tune High Performance Sentence Transformers (with Multiple Negatives Ranking)
Transformer-based sentence embeddings have come a long way in a very short time. The field began with BERT cross-encoders, which were accurate at similarity prediction but slow, and was ignited by the introduction of SBERT in 2019. Since then, many more sentence transformers have been introduced, and they quickly made the original SBERT obsolete.
How did these newer sentence transformers manage to outperform SBERT so quickly? The answer is multiple negatives ranking (MNR) loss.
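In rough terms: MNR takes a batch of (anchor, positive) sentence pairs, for example an NLI premise and its entailed hypothesis, and treats every other positive in the batch as a negative for each anchor. Here is a minimal sketch of that idea in PyTorch (not the exact code from the video; the scale value of 20 mirrors the sentence-transformers default, and everything else is illustrative):

```python
import torch
import torch.nn.functional as F

def mnr_loss(anchor_embs, positive_embs, scale=20.0):
    """MNR loss over a batch of (anchor, positive) embedding pairs.

    For anchor i, positive i is the true match; every other positive in
    the batch acts as an in-batch negative. The loss is cross-entropy
    over the scaled cosine similarity matrix, with the diagonal (index i
    for row i) as the target labels.
    """
    a = F.normalize(anchor_embs, p=2, dim=1)
    p = F.normalize(positive_embs, p=2, dim=1)
    scores = a @ p.T * scale          # (batch, batch) cosine similarities
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)
```

Because every pair in the batch doubles as negatives for all the other pairs, larger batch sizes give MNR more (and harder) negatives for free, which is a big part of why it works so well.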
This video will cover what MNR loss is, the data it requires, and how to implement it to fine-tune our own high-quality sentence transformers.
Implementation will cover two approaches. The first is more involved, and outlines the exact steps to fine-tune the model (we'll just run over it quickly). The second approach makes use of the sentence-transformers library’s excellent utilities for fine-tuning.
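For a taste of the second approach, here is a minimal sketch using the library's own utilities (the model name, example pair, and hyperparameters are illustrative choices, not the exact values from the video; see the Pinecone article below for the full walkthrough):

```python
from sentence_transformers import (
    SentenceTransformer, InputExample, losses, models, datasets
)

# build a fresh BERT + mean-pooling sentence transformer
bert = models.Transformer('bert-base-uncased')
pooler = models.Pooling(bert.get_word_embedding_dimension(),
                        pooling_mode_mean_tokens=True)
model = SentenceTransformer(modules=[bert, pooler])

# each training example is an (anchor, positive) pair, e.g. an NLI
# premise and its entailed hypothesis
train_examples = [
    InputExample(texts=['a person is outdoors', 'a man is outside']),
    # ... many more pairs
]

# NoDuplicatesDataLoader avoids repeating a sentence within a batch,
# which would otherwise create false negatives for MNR
loader = datasets.NoDuplicatesDataLoader(train_examples, batch_size=32)
loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=100)
```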
🌲 Pinecone article:
https://www.pinecone.io/learn/fine-tune-sentence-transformers-mnr/
Check out the Sentence Transformers library:
https://github.com/UKPLab/sentence-transformers
Talk by Nils Reimers (one of the SBERT creators) on training:
https://www.youtube.com/watch?v=RHXZKUr8qOY
He does more NLP vids too:
https://www.youtube.com/channel/UC1zCuTrfpjT6Sv2kJk-JkvA
🤖 70% Discount on the NLP With Transformers in Python course:
https://bit.ly/3DFvvY5
🎉 Subscribe for Article and Video Updates!
https://jamescalam.medium.com/subscribe
https://medium.com/@jamescalam/membership
👾 Discord:
https://discord.gg/c5QtDB9RAP
00:00 Intro
01:02 NLI Training Data
02:56 Preprocessing
10:11 SBERT Finetuning Visuals
14:14 MNR Loss Visual
16:37 MNR in PyTorch
23:04 MNR in Sentence Transformers
34:20 Results
36:14 Outro