Optical Character Recognition (OCR) with Meta's Nougat!

Tutorial on using Meta's Nougat transformer model from Python. The model extracts text, math equations (as LaTeX), and tables from PDFs, and writes the result to a .mmd file, a lightweight markup format that is largely compatible with Mathpix Markdown.
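
As a rough illustration of the workflow described above, here is a minimal sketch that calls the Nougat command-line tool from Python. It assumes the package was installed with `pip install nougat-ocr` and relies on the `nougat <pdf> -o <output_dir>` form documented in the Nougat README. The helper names `build_nougat_command` and `run_nougat` are hypothetical, and the output file name (PDF stem plus `.mmd`) is an assumption rather than something confirmed by the video.

```python
import shutil
import subprocess
from pathlib import Path


def build_nougat_command(pdf_path, out_dir):
    """Return the argument list for `nougat <pdf> -o <out_dir>`."""
    return ["nougat", str(pdf_path), "-o", str(out_dir)]


def run_nougat(pdf_path, out_dir):
    """Run the Nougat CLI on one PDF and return the expected output path."""
    # Fail early with a clear message if the CLI is not on PATH.
    if shutil.which("nougat") is None:
        raise RuntimeError(
            "nougat CLI not found; try `pip install nougat-ocr` first"
        )
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if Nougat exits with an error.
    subprocess.run(build_nougat_command(pdf_path, out_dir), check=True)
    # Assumption: the output is named after the PDF stem with a .mmd suffix.
    return Path(out_dir) / (Path(pdf_path).stem + ".mmd")
```

Calling `run_nougat("paper.pdf", "output")` would then run the model on a hypothetical `paper.pdf`. Inference is slow on a CPU, which is why the video switches Google Colab to a GPU runtime first.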

Nougat Paper on arXiv - https://arxiv.org/abs/2308.13418
Nougat Code on GitHub - https://github.com/facebookresearch/nougat
Overleaf (LaTeX Editor) - https://www.overleaf.com/

The notebook is in the "Machine Learning with Python" folder of the repo below.
GitHub Repo - https://github.com/ad17171717/YouTube-Tutorials/blob/main/Machine%20Learning%20with%20Python/Optical_Character_Recognition_(OCR)_with_Meta's_Nougat!.ipynb

CONNECT:
LinkedIn: https://www.linkedin.com/in/adrian-dolinay-frm-96a289106/
GitHub: https://github.com/ad17171717
Twitter: https://twitter.com/DolinayG
Odysee: https://odysee.com/@adriandolinay:0
Medium: https://medium.com/@adriandolinay

|-Video Chapters-|
0:00 - Intro
1:01 - Setting a GPU runtime in Google Colab
1:21 - Installing Nougat and importing modules
1:31 - Running Nougat from the command line in a Jupyter Notebook
2:16 - OCR a natively digital PDF
5:00 - OCR a scanned PDF
6:46 - Batch processing multiple PDFs with OCR
10:05 - References and additional learning
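
For the batch-processing chapter, one possible approach (not necessarily the one used in the video) is to collect every PDF in a folder and build one Nougat command per file. The sketch below only prepares the commands, so it can be inspected before anything is run. The function names `find_pdfs` and `batch_commands` and the folder names are hypothetical.

```python
from pathlib import Path


def find_pdfs(folder):
    """Return the PDF files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def batch_commands(pdf_paths, out_dir):
    """Build one `nougat <pdf> -o <out_dir>` command per PDF."""
    return [["nougat", str(p), "-o", str(out_dir)] for p in pdf_paths]
```

Each command can then be passed to `subprocess.run(cmd, check=True)`. According to the Nougat README, the CLI also accepts a directory as its positional argument, so `nougat path/to/directory -o output_directory` is a simpler alternative when no per-file control is needed.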