This video was edited with AI agent. But how?

The talk is about the world’s first open-source video editing agent!

Diffusion Studio x Re-Skill technology proposal:

Our Python-based agent starts a browser session using Playwright and opens operator.diffusion.studio.

This web app is a video editing UI optimized for agents, providing access to Diffusion Studio Core—a JavaScript-based engine that renders videos directly in the browser using WebCodecs (fully hardware-accelerated).

🖥 How it works:
1️⃣ A VideoEditingTool generates code based on user prompts and runs it in the browser.
2️⃣ If additional context is needed, DocsSearchTool uses RAG to pull information from operator.diffusion.studio/llms.txt.
3️⃣ After each execution step, the composition is sampled (currently 1 frame per second) and analyzed using VisualFeedbackTool via a multi-modal model.
4️⃣ The feedback system decides whether to proceed with rendering or refine further.
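
The steps above can be sketched as a simple generate, execute, review loop. This is only an illustrative sketch, not the agent's actual code: `Feedback`, `sample_timestamps`, `run_edit_loop` and the three callables are hypothetical names standing in for VideoEditingTool, the browser execution step and VisualFeedbackTool, and the DocsSearchTool lookup is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Feedback:
    """Hypothetical result of the visual review step."""
    approved: bool      # True when the sampled frames match the request
    notes: str = ""     # suggestions fed into the next code-generation attempt


def sample_timestamps(duration_s: float, fps: float = 1.0) -> List[float]:
    """Timestamps (in seconds) at which frames are sampled; 1 fps by default."""
    count = int(duration_s * fps)
    return [i / fps for i in range(count)]


def run_edit_loop(
    prompt: str,
    generate_code: Callable[[str, str], str],   # stand-in for VideoEditingTool
    execute: Callable[[str], float],            # runs code in the browser, returns duration
    review: Callable[[List[float]], Feedback],  # stand-in for VisualFeedbackTool
    max_steps: int = 5,
) -> bool:
    """Return True once the reviewer approves the composition for rendering."""
    notes = ""
    for _ in range(max_steps):
        code = generate_code(prompt, notes)      # step 1: generate editing code
        duration_s = execute(code)               # step 1: run it in the browser
        feedback = review(sample_timestamps(duration_s))  # step 3: sample and analyze
        if feedback.approved:                    # step 4: proceed with rendering
            return True
        notes = feedback.notes                   # step 4: refine further
    return False
```

Passing the tools in as callables keeps the loop independent of any particular model or browser backend, and `max_steps` (an assumed safeguard) stops the agent from refining forever.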

📡 File transfers between the browser and Python happen via Chrome DevTools Protocol, and for scalability, the agent can connect to a GPU-accelerated remote browser session via WebSocket (WIP: wss://chrome.diffusion.studio).
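
A minimal sketch of how a Python process might open the operator UI either in a local browser or in a remote one, using Playwright's synchronous API. The `open_operator` helper, the `https://` scheme and the `cdp_endpoint` parameter are assumptions made for illustration; the remote endpoint mentioned above is still a work in progress, so its connection details may differ.

```python
from typing import Optional

# Assumed full URL; the description only gives the host name.
OPERATOR_URL = "https://operator.diffusion.studio"


def open_operator(cdp_endpoint: Optional[str] = None) -> str:
    """Open the operator UI locally or in a remote browser and return the page title."""
    # Imported lazily so this module still loads where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        if cdp_endpoint:
            # Attach to an already running Chromium over the Chrome DevTools Protocol.
            browser = p.chromium.connect_over_cdp(cdp_endpoint)
        else:
            # Start a local headless Chromium instead.
            browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(OPERATOR_URL)
        title = page.title()
        browser.close()
        return title
```

Calling `open_operator()` uses a local browser, while `open_operator("ws://...")` with a real DevTools WebSocket URL would attach to a remote session instead.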

---

https://github.com/diffusionstudio/agent

https://re-skill.io/

slides: https://docs.google.com/presentation/d/1eipINYiwx3vjwvJXrv4QA0-9t4-uIVPuh112X9pElkM/edit?usp=sharing

Video "This video was edited with AI agent. But how?" from the AI Engineer channel