Git for Your Data Lake — Why Agents Need Isolation and Rollback | Ciro Greco | PyAI Conf 2026
"Unlike code, data is non-local by default. If you change something, there's a bunch of production systems that depend on what you change. And unfortunately, we do not have git."
Ciro Greco, co-founder and CEO at Bauplan and adjunct professor at Columbia University, speaks at PyAI Conf 2026. He explains why AI coding assistants hit a wall when they try to touch production data, and what to do about it. In the approach he presents, every pipeline run is an isolated branch, merges are atomic, and failed runs never reach production. The talk includes a live demo of an AI agent building and deploying a data pipeline with full git-like version control for your data lake.
0:00 - Why data infrastructure needs to be rebuilt for AI agents
0:44 - Software engineering matured fast with agents; data work is still catching up
1:36 - Why Claude Code works so well: local files, terminal feedback, and git
2:16 - The backfill problem: asking an agent to modify a production table
3:00 - Four missing pieces: isolation, atomicity, observability, and rollback
3:40 - "Data is non-local by default and we don't have git for it"
4:28 - Infrastructure that makes isolation and atomic updates automatic
5:07 - Agents generate 100x the code and run 100x the workload of a person
5:55 - Principle one: make everything git-like with branches, versions, and snapshots
6:44 - Principle two: squeeze your entire data platform into a Python package
7:24 - Everything is Python — tables, infrastructure, the outer loop
8:15 - Live demo: data branches as zero-copy versions of your data lake
9:08 - Commit history and time travel across your entire data lake
9:54 - The agent workflow: branching, running, and iterating from the terminal
10:42 - Building a backfill pipeline with an AI agent in real time
12:03 - Agent generates a Python pipeline script with declarative columns
12:33 - The agent runs, reads terminal output, and iterates until the pipeline works
13:46 - Verifying the new table exists on the feature branch but not on main
14:17 - Merging the branch: atomic publish to production
15:07 - Time travel and undo: nothing is permanent, everything is reversible
15:57 - Q&A begins
16:48 - How branches and hashing work with production writes and new data
17:28 - Atomic merges: multi-table publishes that either fully land or don't
17:59 - How customers actually adopt git semantics for data (mostly through the agent)
19:24 - Agents run 50-60 queries where a human runs 4-5
19:58 - Skills-based automation: data quality tests, log fetching, auto-fix branches
20:29 - Why data can't live in actual git: it's too large and always in the cloud
21:50 - "As far as the agent knows, it's just a git CLI for data"
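The branch, merge, and undo workflow named in the chapters can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: `ToyCatalog`, `run_pipeline`, and every method name here are hypothetical and are not Bauplan's API. Tables are modeled as immutable tuples, and main is assumed not to change while a branch is open.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Toy model of git-like branches over tables (hypothetical, not a real API)."""

    # Each branch maps a table name to an immutable snapshot (a tuple of rows).
    branches: dict[str, dict[str, tuple]] = field(
        default_factory=lambda: {"main": {}}
    )
    # Earlier states of main, kept so a merge can be undone.
    history: list[dict[str, tuple]] = field(default_factory=list)

    def create_branch(self, name: str, source: str = "main") -> None:
        # Zero-copy: only the table -> snapshot pointers are copied, not the rows.
        self.branches[name] = dict(self.branches[source])

    def write_table(self, branch: str, table: str, rows: list) -> None:
        if branch == "main":
            raise ValueError("write to a branch, then merge")
        self.branches[branch][table] = tuple(rows)

    def merge(self, branch: str) -> None:
        # Atomic publish: build the full new state first, then swap it in with
        # a single assignment. Assumes main did not change since the branch was
        # created (no conflict detection in this sketch).
        new_main = {**self.branches["main"], **self.branches[branch]}
        self.history.append(self.branches["main"])
        self.branches["main"] = new_main

    def rollback(self) -> None:
        # Undo the most recent merge by restoring the previous state of main.
        self.branches["main"] = self.history.pop()


def run_pipeline(catalog: ToyCatalog, branch: str, pipeline) -> bool:
    """Run `pipeline` on an isolated branch; publish to main only on success."""
    catalog.create_branch(branch)
    try:
        for table, rows in pipeline():
            catalog.write_table(branch, table, rows)
    except Exception:
        # Failed run: discard the branch, main is never touched.
        del catalog.branches[branch]
        return False
    catalog.merge(branch)
    del catalog.branches[branch]
    return True
```

A pipeline here is any callable that yields `(table_name, rows)` pairs. If it raises partway through, the tables it already wrote exist only on the discarded branch, so main stays as it was. A successful run can still be reverted with `rollback()`.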
LINKS:
https://www.bauplanlabs.com/
Video "Git for Your Data Lake — Why Agents Need Isolation and Rollback | Ciro Greco | PyAI Conf 2026" from the channel Py AI - Meetup and Conference Series
Video information
April 27, 2026, 18:30:37
00:22:17