LLM Enhanced Scraping With Zyte API | Omkar Dhavalikar
Building a Scalable & Cost-Effective Web Scraping Pipeline
In this live session, we dive deep into designing a scalable, cost-efficient web scraping and data extraction pipeline using Zyte API, AWS Lambda, Step Functions, and LLMs.
🌱 You’ll learn how to:
- Use Zyte API and Lambda functions for large-scale scraping (10,000+ pages/day)
- Optimize cost and compute with batching & concurrency
- Apply cleaning scripts to reduce token size for LLMs
- Run extraction logic efficiently with open-source or paid models
- Handle hallucinations, retries, and QA checks for reliable outputs
- Build a production-ready workflow with AWS orchestration
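The batching, cleaning, and QA steps listed above can be illustrated with a minimal sketch that uses only the Python standard library. It is not the pipeline shown in the session: the names `batched`, `clean_html`, and `ungrounded_fields` are hypothetical, it makes no calls to Zyte API or AWS, and the verbatim substring check is a deliberate simplification (a real QA step would likely need fuzzy matching and per-field rules).

```python
from html.parser import HTMLParser
from itertools import islice


def batched(urls, size):
    """Yield lists of up to `size` URLs, e.g. one list per worker invocation."""
    it = iter(urls)
    while chunk := list(islice(it, size)):
        yield chunk


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping tags that rarely carry page content."""

    SKIP = {"script", "style", "noscript", "svg", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text outside skipped tags, with whitespace collapsed.
        if not self._skip_depth and data.strip():
            self.parts.append(" ".join(data.split()))


def clean_html(html):
    """Reduce raw HTML to its visible text so the LLM prompt uses fewer tokens."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def ungrounded_fields(record, source_text):
    """Return field names whose values do not appear verbatim in the source.

    Assumption: a value the model returned that is absent from the page text
    is a hallucination candidate and should trigger a retry or manual review.
    """
    haystack = source_text.casefold()
    return [k for k, v in record.items() if str(v).casefold() not in haystack]
```

In a setup like the one described, each batch would be handed to one worker, the cleaned text would go into the extraction prompt, and any record with ungrounded fields would be retried or flagged before it is stored.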
Zyte API (free credits): https://www.zyte.com/zyte-api/?utm_campaign=Discord_web_34&utm_activity=Community&utm_medium=social&utm_source=Discord
⚠️ Disclaimer: Always check a website’s Terms of Service and robots.txt before scraping. This content is educational; use it ethically!
Join Extract Data Discord Community: https://discord.gg/eN83rMWqAt
A thriving community of 20K+ web scraping enthusiasts, committed to sharing insights, learning and exploring new technologies, and advancing in web scraping.
#awslambda #llm #dataengineering #ai #python #StepFunctions #zyteapi #cloudcomputing #machinelearning #datapipeline #bigdata
Video "LLM Enhanced Scraping With Zyte API | Omkar Dhavalikar" from the Extract Summit channel
Tags: web scraping, AWS Lambda tutorial, AWS Step Functions, scalable web scraping, Zyte API, data engineering, Python web scraping, cost effective web scraping, large scale scraping, LLM data extraction, AWS Lambda scraping, machine learning pipeline, AI data pipeline, cleaning HTML data, token optimization, cloud computing, serverless scraping, hallucination handling LLM, fuzzy matching python, python data cleaning, AI powered scraping, open source LLM, LLM fine tuning
Video information
September 4, 2025, 8:55:10
00:24:57