Why this 1.3B vision model uses 40x fewer tokens #llm #aiforanalysts #dataengineering #ai #aiagents
Questions worth thinking about:
1. The 43x token saving is measured against the reasoning variant of a same-size model. How much of that gap is the architecture versus the fact that the reasoning model writes far more output tokens by design?
2. Compression at 16x is for video and 4x is for fine OCR detail. How would you decide per request which mode an agent should pick, and what does guessing wrong cost you in accuracy?
3. The model scores 13 on an aggregate intelligence index, roughly a quarter of a frontier model. Where does that accuracy ceiling actually bite for invoice or table extraction at scale?
4. Running vision locally keeps documents off a hosted API. What does that change for a team handling regulated data like medical receipts or financial statements?
5. If a 1.3B model can read most of your documents, where does the hidden cost move: GPU memory, the eval harness, or the fine-tuning you need for your own document types?
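As a rough illustration of question 2, here is a minimal sketch of per-request mode selection. Only the two ratios (16x for video, 4x for fine OCR detail) come from the description above. The task labels, the function names, the fallback to 4x for unknown tasks, and the raw token counts are hypothetical and are not part of any real MiniCPM-V API.

```python
# Ratios quoted in the description: 16x for video, 4x for fine OCR detail.
# The task labels themselves are hypothetical.
COMPRESSION_BY_TASK = {"video": 16, "ocr": 4}


def pick_compression(task: str) -> int:
    """Return the compression ratio for a task.

    Assumption: unknown tasks fall back to 4x, on the guess that
    keeping more visual tokens is the safer error for accuracy.
    """
    return COMPRESSION_BY_TASK.get(task, 4)


def visual_tokens(raw_tokens: int, ratio: int) -> int:
    """Visual tokens left after compressing raw_tokens by ratio, rounded up."""
    return -(-raw_tokens // ratio)


if __name__ == "__main__":
    raw = 1024  # hypothetical number of uncompressed visual tokens per image
    for task in ("video", "ocr", "chart"):
        ratio = pick_compression(task)
        print(task, ratio, visual_tokens(raw, ratio))
```

With a fixed raw budget, picking 16x where 4x was needed keeps a quarter as many visual tokens, so the cost of guessing wrong shows up as lost fine detail rather than extra spend. How much accuracy that costs would have to be measured on your own documents.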
#shorts #VisionModel #MiniCPM #dataengineering #aiforanalysts
Video "Why this 1.3B vision model uses 40x fewer tokens" from the channel JH-Analytics | 2.0
minicpm-v minicpm-v 4.6 vision language model vlm small vision model openbmb siglip2 qwen vision token efficiency on device ai edge ai local agent ocr model document ai pdf extraction visual token compression artificial analysis apache 2 license open weights vision ai for data ai for analysts data engineering ai shorts ai news multimodal model
Video information
Published 3 h 22 min ago
Duration: 00:01:10