Invoice QC Service | PDF Extraction + Validation Pipeline | Python, FastAPI, CLI Demo
📌 Project Overview
This video is a walkthrough of my project Invoice QC Service, built as part of the Software Engineer Intern – Data & Development assignment. The system performs automatic PDF invoice extraction, schema validation, and quality control checks, and exposes the logic through both a CLI tool and a FastAPI backend.
🔧 Key Features
1️⃣ PDF → JSON Extraction Module
Extracts invoice fields (invoice number, dates, parties, totals, etc.)
Uses regex + pattern matching to process real B2B invoices
Supports optional line item extraction
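As a rough illustration of the regex-based approach described above, here is a minimal sketch that pulls a few fields out of already-extracted invoice text. The patterns, field names, and the `extract_fields` helper are assumptions made for this example, not the project's actual code; real B2B invoices would likely need layout-specific patterns.

```python
import re

# Hypothetical patterns for illustration only; the project's real rules may differ.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*[:\-]?\s*([A-Z0-9\-/]+)", re.IGNORECASE
    ),
    "invoice_date": re.compile(
        r"Invoice\s*Date\s*[:\-]?\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE
    ),
    "gross_total": re.compile(
        r"(?:Gross|Total)\s*(?:Amount)?\s*[:\-]?\s*([\d.,]+)", re.IGNORECASE
    ),
}


def extract_fields(text: str) -> dict:
    """Return the first match for each field, or None when a field is absent."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


# Text as it might come out of a PDF text extractor (made-up sample).
sample = "Invoice No: INV-2024-001\nInvoice Date: 2024-01-15\nTotal: 1190.00"
print(extract_fields(sample))
```

Returning `None` for missing fields, rather than raising, leaves it to the validation step to report incomplete invoices.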
2️⃣ Validation Engine
Completeness checks (missing fields, invalid dates, empty seller/buyer info)
Format rules (date parsing, currency validation)
Business rules (net + tax = gross, due date ≥ invoice date)
Duplicate detection & anomaly checks
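The rules listed above could be expressed roughly as follows. This is a sketch under stated assumptions: the field names (`net_total`, `tax_amount`, `gross_total`, and so on), the error codes, the ISO date format, the rounding tolerance, and the duplicate key of invoice number plus seller are all hypothetical choices for this example.

```python
from datetime import date
from decimal import Decimal


def validate_invoice(inv: dict, tolerance: Decimal = Decimal("0.01")) -> list[str]:
    """Return a list of error codes; an empty list means the invoice passed."""
    errors = []

    # Completeness: required text fields must be present and non-empty.
    for field in ("invoice_number", "seller_name", "buyer_name"):
        if not inv.get(field):
            errors.append(f"missing_field:{field}")

    # Format: dates are assumed to be ISO strings (YYYY-MM-DD).
    dates = {}
    for field in ("invoice_date", "due_date"):
        try:
            dates[field] = date.fromisoformat(inv.get(field) or "")
        except (TypeError, ValueError):
            errors.append(f"invalid_date:{field}")

    # Business rule: due date must not precede the invoice date.
    if len(dates) == 2 and dates["due_date"] < dates["invoice_date"]:
        errors.append("due_date_before_invoice_date")

    # Business rule: net + tax = gross, within a small rounding tolerance.
    try:
        net, tax, gross = (
            Decimal(str(inv[k])) for k in ("net_total", "tax_amount", "gross_total")
        )
        if abs(net + tax - gross) > tolerance:
            errors.append("totals_mismatch")
    except (KeyError, ArithmeticError):
        errors.append("invalid_totals")

    return errors


def find_duplicates(invoices: list[dict]) -> list[tuple]:
    """Report (invoice_number, seller_name) keys that occur more than once."""
    seen, dupes = set(), []
    for inv in invoices:
        key = (inv.get("invoice_number"), inv.get("seller_name"))
        if key in seen:
            dupes.append(key)
        seen.add(key)
    return dupes
```

`Decimal` is used instead of `float` so that sums such as 1000.00 + 190.00 compare exactly against the stated gross amount.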
3️⃣ Command-Line Interface (CLI)
Supports:
extract – Convert PDFs into structured JSON
validate – Run validation on extracted data
full-run – End-to-end extraction + validation
Generates detailed reports and summaries
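One way to lay out the three subcommands with `argparse` is sketched below. Only the subcommand names come from the description; the program name `invoice-qc` and the flags `--pdf-dir`, `--input`, `--output`, and `--report` are assumptions for this example and may not match the real tool.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Program name and all flags are hypothetical.
    parser = argparse.ArgumentParser(prog="invoice-qc")
    sub = parser.add_subparsers(dest="command", required=True)

    extract = sub.add_parser("extract", help="Convert PDFs into structured JSON")
    extract.add_argument("--pdf-dir", required=True)
    extract.add_argument("--output", required=True)

    validate = sub.add_parser("validate", help="Run validation on extracted data")
    validate.add_argument("--input", required=True)
    validate.add_argument("--report", required=True)

    full = sub.add_parser("full-run", help="End-to-end extraction + validation")
    full.add_argument("--pdf-dir", required=True)
    full.add_argument("--report", required=True)

    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(
    ["full-run", "--pdf-dir", "pdfs", "--report", "report.json"]
)
print(args.command, args.pdf_dir, args.report)
```

A real entry point would call `build_parser().parse_args()` without arguments and dispatch on `args.command`.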
4️⃣ FastAPI Backend (HTTP API)
Includes:
POST /validate-json – Validate invoice JSON payload
GET /health – Health check endpoint
(Optional) POST /extract-and-validate-pdfs
🖥 Tech Stack
Python 3.10+
FastAPI for backend APIs
pdfplumber / PyPDF2 for PDF extraction
argparse / Typer for CLI tools
Pydantic for data models
JSON reports for validation results
🧩 Architecture
PDFs → Extraction Module → JSON → Validation Engine → CLI / API / Optional UI
📁 Repository
GitHub Repo: https://github.com/Mysteriousboy727/invoice-extraction-qc-system.git
🎥 What’s in This Video?
Project overview
Explanation of schema and validation rules
Code walkthrough (extractor, validator, CLI, API)
Running CLI with sample PDFs
Demo of FastAPI endpoints in action
🧠 Why This Project Matters
This system demonstrates real-world skills in:
Data extraction
Backend development
Validation pipelines
API design
CLI engineering
Clean, modular Python architecture
Video "Invoice QC Service | PDF Extraction + Validation Pipeline | Python, FastAPI, CLI Demo" from the channel Romeo
Video information
December 5, 2025, 23:26:36
00:03:15
