TrOCR Transformer-based Optical Character Recognition Microsoft Hugging Face TrOCR Demo
In this video I look at the paper TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models by Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei. TrOCR consists of an image Transformer encoder and an autoregressive text Transformer decoder to perform optical character recognition (OCR). I also show a demo of TrOCR using a Google Colab notebook.
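For readers who want to try the encoder-decoder setup described above, here is a minimal sketch using the Hugging Face transformers library. It is not necessarily the code from the Colab notebook in the video. It assumes that transformers, torch and Pillow are installed, that the public checkpoint microsoft/trocr-base-handwritten is used, and that the input is an image of a single line of text. The function name recognize_text and its image_path argument are made up for this illustration.

```python
# Assumption: the public Hugging Face Hub checkpoint for handwritten text.
MODEL_NAME = "microsoft/trocr-base-handwritten"


def recognize_text(image_path: str, model_name: str = MODEL_NAME) -> str:
    """Run TrOCR on an image of a single text line and return the decoded string.

    recognize_text is a hypothetical helper for illustration, not a library API.
    """
    # Imported here so the heavy dependencies load only when the function runs.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    # The processor bundles the image preprocessor and the text tokenizer.
    processor = TrOCRProcessor.from_pretrained(model_name)
    # Image Transformer encoder plus autoregressive text Transformer decoder.
    model = VisionEncoderDecoderModel.from_pretrained(model_name)

    # The image preprocessor expects a 3-channel RGB image.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # The decoder generates token ids one at a time, conditioned on the encoded image.
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

Calling recognize_text("line.png"), where line.png is a placeholder for your own cropped text-line image, downloads the checkpoint on first use and returns the recognized text. A full page usually has to be split into individual lines first.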
If you like this kind of content, please subscribe to the channel here:
https://www.youtube.com/c/RitheshSreenivasan?sub_confirmation=1
If you would like to support me financially (totally optional and voluntary), you can buy me a coffee here: https://www.buymeacoffee.com/rithesh
Relevant Links:
https://arxiv.org/abs/2109.10282
https://huggingface.co/docs/transformers/model_doc/trocr
https://colab.research.google.com/drive/1LBQtUdUXBeo4m6zfh270ae1ntjEN-OPT?usp=sharing
https://huggingface.co/spaces/nielsr/TrOCR-handwritten
https://github.com/microsoft/unilm/tree/master/trocr
Video "TrOCR Transformer-based Optical Character Recognition Microsoft Hugging Face TrOCR Demo" from the channel Rithesh Sreenivasan