Lev Konstantinovskiy - Text similarity with the next generation of word embeddings in Gensim
Description
What is the closest word to "king"? Is it "Canute" or is it "crowned"? There are many ways to define "similar words" and "similar texts", and the definition you pick should determine which word embedding you use. A new generation of word embeddings that use morphological information and learning-to-rank has been added to the open-source NLP package Gensim: Facebook's FastText, VarEmbed, and WordRank.
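As a rough illustration of what "morphological information" can mean here: FastText represents a word through its character n-grams, wrapping the word in "<" and ">" boundary markers first. The sketch below shows only that extraction step. The function name `char_ngrams` and its parameters are hypothetical, and this is not the code used by FastText or Gensim.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, FastText-style.

    Hypothetical helper for illustration only; real implementations
    also hash the n-grams into buckets, which is omitted here.
    """
    # Boundary markers let prefixes and suffixes be told apart
    # from n-grams that occur in the middle of a word.
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


# Trigrams only, to keep the output short.
print(char_ngrams("king", min_n=3, max_n=3))
```

Because related forms such as "king" and "kings" share most of their n-grams, an embedding built from n-gram vectors can place them close together even when one of the forms is rare in the training corpus.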
Abstract
There are many ways to find similar words and documents with Gensim, the open-source natural language processing library that I maintain. I will give an overview of modern word embeddings such as Google's Word2vec, Facebook's FastText, GloVe, WordRank, and VarEmbed, and discuss which business tasks each fits best.
What is the most similar word to "king"? It depends on what you mean by similar. "King" can be interchanged with "Canute", but its attribute is "crown". We will discuss how to achieve these two kinds of similarity from word embeddings.
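To make the two kinds of similarity concrete, here is a minimal sketch using cosine similarity, the usual measure for comparing word vectors. The two small "embedding spaces" below are invented purely for illustration and do not come from any trained model; the helper names `cosine` and `most_similar` are also assumptions of this example. With real models, Gensim's `KeyedVectors.most_similar` plays the role of the lookup.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors):
    """Return the other word whose vector is closest to `word` by cosine."""
    query = vectors[word]
    candidates = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return max(candidates, key=lambda pair: pair[1])[0]


# Made-up 3-d vectors: a space where interchangeable words sit together.
interchangeable_space = {
    "king": [0.9, 0.1, 0.2],
    "Canute": [0.8, 0.2, 0.1],
    "crown": [0.1, 0.9, 0.3],
}

# Made-up 3-d vectors: a space where a word sits near its attributes.
attribute_space = {
    "king": [0.7, 0.6, 0.1],
    "Canute": [0.9, 0.0, 0.4],
    "crown": [0.6, 0.7, 0.1],
}

print(most_similar("king", interchangeable_space))
print(most_similar("king", attribute_space))
```

The same query word gets a different nearest neighbour in each space, which is why the choice of embedding depends on the kind of similarity the task needs.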
www.pydata.org
PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.