Day 3 - Parsing unstructured PDFs into a pandas DataFrame - Interview preparation
My name is Divyaprakash and I'm a Data Scientist with 1.5 years of experience. As part of my job hunt, I'm taking part in a daily data science challenge, where I document the day-to-day work I do as part of my upskilling journey.
Day 3: Text preprocessing - Parsing unstructured PDFs into a pandas DataFrame
🔍 PDF Bank Statement to Structured Data using Python
In this video, we extract transactions from a bank statement PDF and convert them into a structured pandas DataFrame using Python. We use pdfplumber to read the PDF and regular expressions to parse the transaction details.
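The text-extraction step can be sketched roughly as follows. This is a minimal illustration, not the code from the video: the helper names `join_page_texts` and `extract_pdf_text` are hypothetical, and the sketch assumes the statement is a text-based PDF (a scanned image would need OCR instead). It relies only on `pdfplumber.open`, `pdf.pages`, and `page.extract_text()`.

```python
def join_page_texts(pages):
    """Concatenate the text of each page, skipping pages with no extractable text."""
    texts = []
    for page in pages:
        text = page.extract_text()  # may be None or "" for image-only pages
        if text:
            texts.append(text)
    return "\n".join(texts)


def extract_pdf_text(path):
    """Open a PDF with pdfplumber and return the text of all pages as one string.

    `path` is a hypothetical file path supplied by the caller.
    """
    import pdfplumber  # third-party package: pip install pdfplumber

    with pdfplumber.open(path) as pdf:
        return join_page_texts(pdf.pages)
```

Calling `extract_pdf_text("statement.pdf")` (a placeholder file name) would return a single string that the regex step can then work on line by line.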
📌 What you’ll learn:
✅ How to use pdfplumber to extract text from PDFs
✅ Regular expressions to detect dates and amounts
✅ Logic to split transaction blocks into structured rows
✅ Build a transaction dataset directly from unstructured bank statements
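The regex and block-splitting steps listed above might look something like the sketch below. The statement layout is an assumption made for illustration, not the format used in the video: each transaction is assumed to start with a DD/MM/YYYY date, descriptions may wrap onto following lines, and the last number with two decimals in a block is treated as the amount. All names (`DATE_RE`, `AMOUNT_RE`, `split_into_blocks`, `parse_block`, `parse_transactions`, `to_dataframe`, `SAMPLE_TEXT`) are hypothetical.

```python
import re

# Assumed layout (hypothetical): a transaction begins with a DD/MM/YYYY date.
DATE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4})\s+(.*)$")
# Amounts such as 50,000.00 or -1250.00 (optional thousands separators, 2 decimals).
AMOUNT_RE = re.compile(r"-?(?:\d{1,3}(?:,\d{3})+|\d+)\.\d{2}")


def split_into_blocks(text):
    """Group lines into blocks; a line that starts with a date opens a new block."""
    blocks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if DATE_RE.match(line):
            blocks.append([line])
        elif blocks:
            blocks[-1].append(line)  # continuation of the previous transaction
        # lines before the first date (e.g. column headers) are ignored
    return blocks


def parse_block(block):
    """Turn one block of lines into a dict with date, description, and amount."""
    match = DATE_RE.match(" ".join(block))
    date, rest = match.group(1), match.group(2)
    amounts = list(AMOUNT_RE.finditer(rest))
    if not amounts:
        return {"date": date, "description": rest.strip(), "amount": None}
    last = amounts[-1]  # assumption: the last amount in the block is the one we want
    return {
        "date": date,
        "description": rest[: last.start()].strip(),
        "amount": float(last.group().replace(",", "")),
    }


def parse_transactions(text):
    """Parse raw statement text into a list of row dicts."""
    return [parse_block(block) for block in split_into_blocks(text)]


def to_dataframe(rows):
    """Build a pandas DataFrame from the parsed rows (requires pandas)."""
    import pandas as pd  # third-party package: pip install pandas

    df = pd.DataFrame(rows)
    df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")
    return df


# Made-up sample text, standing in for the output of the PDF extraction step.
SAMPLE_TEXT = """Date Description Amount
01/05/2025 UPI payment to grocery store
Ref 12345 -1,250.00
03/05/2025 Salary credit 50,000.00
"""

rows = parse_transactions(SAMPLE_TEXT)
```

Passing `rows` to `to_dataframe` gives one row per transaction with a parsed date column. A real statement will likely need different patterns, for example separate debit, credit, and balance columns, so the two regular expressions are the part to adapt first.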
Google Colab link : https://colab.research.google.com/drive/1KFATv7J3PrweQE4eBzC4hxNWwCoxZVDA?usp=sharing
LinkedIn : https://www.linkedin.com/in/divyaprakash-rathinasabapathy/
GitHub : https://github.com/rdivyaprakash78
Video information
May 11, 2025, 15:45:15
00:21:41