FAQ #1: Tips & tricks for NLP, annotation & training with Prodigy and spaCy
Prodigy is an annotation tool for creating training data for machine learning models. In this video, I'll answer a few frequently asked questions and share some general tips and tricks on how to structure your NLP annotation projects, how to design your label schemes and how to solve common problems.
PRODIGY
● Website: https://prodi.gy
● Forum: https://support.prodi.gy
● Recipes repo: https://github.com/explosion/prodigy-recipes
THIS VIDEO
[0:46] Binary or manual annotation?
● ner.teach vs. ner.match https://support.prodi.gy/t/877
● Best practices for validation sets https://support.prodi.gy/t/693
[3:34] Accept or reject partial suggestions?
● How to score incompletely highlighted entities https://support.prodi.gy/t/625
● Should I reject or accept partially correct predictions? https://support.prodi.gy/t/945
[5:35] Reject example or skip it?
● Reject or skip examples for text classifier annotations https://support.prodi.gy/t/998
● Ignored sentences for text classification https://support.prodi.gy/t/1183
[7:30] What if I need to label long texts?
● Dealing with sparse data https://support.prodi.gy/t/518
● Text categorization at document level https://support.prodi.gy/t/1160
[9:24] Fine-tune pre-trained model or start from scratch?
● Pre-trained model vs training a model from scratch https://support.prodi.gy/t/631/4
● Fact extraction for earnings news https://support.prodi.gy/t/1023
● Extracting current and prior company affiliations from bios https://support.prodi.gy/t/1176
● NER or PhraseMatcher https://support.prodi.gy/t/686
FOLLOW US
● Explosion AI: https://twitter.com/explosion_ai
● Ines Montani: https://twitter.com/_inesmontani
● Matthew Honnibal: https://twitter.com/honnibal
Video "FAQ #1: Tips & tricks for NLP, annotation & training with Prodigy and spaCy" from the Explosion channel