
How to Install Tesseract OCR on Windows and use it with Python

This is a walkthrough for installing Tesseract on Windows and configuring it so you can use it programmatically from Python.
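
A minimal sketch of that setup, assuming the pytesseract and Pillow packages are installed with pip. The default install path and the helper names find_tesseract and extract_text are assumptions for illustration, not something the video prescribes; adjust the path to wherever your installer put tesseract.exe.

```python
import os
import shutil

# Assumed default location used by the Windows installer; change it if yours differs.
DEFAULT_TESSERACT_EXE = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(default=DEFAULT_TESSERACT_EXE):
    """Return a path to the tesseract executable, or None if it cannot be found."""
    # Prefer whatever is on PATH, then fall back to the assumed install location.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    return default if os.path.isfile(default) else None


def extract_text(image_path):
    """OCR one image file and return the recognized text."""
    # Imported lazily so find_tesseract() works before the packages are installed.
    import pytesseract
    from PIL import Image

    exe = find_tesseract()
    if exe is None:
        raise FileNotFoundError("tesseract executable not found")
    # Point pytesseract at the executable explicitly, in case it is not on PATH.
    pytesseract.pytesseract.tesseract_cmd = exe
    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)
```

Calling extract_text("scan.png") on an image of your own (the file name here is hypothetical) should return the recognized text as a string.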

As a bonus, I show how you can parallelize execution on Windows with the multiprocessing package's Pool class.
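
Here is a rough sketch of the Pool pattern, not the exact code from the video. The worker ocr_one is a stand-in that does placeholder work so the sketch runs without Tesseract; swap its body for a real OCR call. The names ocr_one and ocr_many and the file names are assumptions.

```python
from multiprocessing import Pool


def ocr_one(path):
    """Stand-in worker: replace the body with a real OCR call."""
    # Placeholder work so the sketch runs without Tesseract installed.
    return path, len(path)


def ocr_many(paths, workers=2):
    """Run ocr_one over every path using a pool of worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(ocr_one, paths)


if __name__ == "__main__":
    # Needed on Windows: worker processes re-import this module,
    # and the guard keeps them from starting pools of their own.
    print(ocr_many(["a.png", "bb.png"]))
```

Save this as a .py file and run it from a terminal. On Windows the worker function has to be importable by the child processes, so it should be defined at the top level of a module.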

Also, the recon screenshot tool for Bug Bounties I was thinking of is GoWitness. https://github.com/sensepost/gowitness
Use your powers for good.....or evil. Whatever, I'm a YouTuber not a cop.

00:00 Intro/What is Tesseract
00:55 Installing pytesseract with pip
01:55 Installing Tesseract OCR exe
04:00 Extracting text
04:12 Adding Tesseract to Path
09:45 Creating a function to classify documents
19:50 A bit about what an image file really is
22:10 Handling errors gracefully
22:35 Adding Return Codes
26:15 Vertical Scaling using Multiprocessing
30:00 Putting it all together
35:00 Demo of how multiprocessing doesn't work in a notebook on Windows
36:14 Conclusion
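
For the classify, error-handling and return-code chapters, here is one possible shape for such a function. Everything in it is an illustrative assumption: the ReturnCode values, the keyword table, and the idea of passing the OCR step in as a callable are mine, not necessarily what the video does.

```python
from enum import IntEnum


class ReturnCode(IntEnum):
    OK = 0        # text was read and matched a known label
    NO_MATCH = 1  # text was read but no keyword matched
    ERROR = 2     # the OCR step raised an exception


# Hypothetical labels and keywords, purely for illustration.
KEYWORDS = {
    "invoice": ("invoice", "amount due"),
    "resume": ("experience", "education"),
}


def classify_text(text):
    """Return (code, label) for already-extracted text."""
    lowered = text.lower()
    for label, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return ReturnCode.OK, label
    return ReturnCode.NO_MATCH, None


def classify_document(path, ocr):
    """OCR a file with the given callable (path -> text), then classify it."""
    try:
        text = ocr(path)
    except Exception as exc:
        # Report the failure as a return code instead of crashing a batch run.
        return ReturnCode.ERROR, str(exc)
    return classify_text(text)
```

Returning a code instead of raising means one unreadable file does not stop a batch, and the caller can count failures afterwards.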

Video "How to Install Tesseract OCR on Windows and use it with Python" from the Data Slinger channel