ERNIE-Image vs Nucleus-Image — Which Is Better?
ERNIE-Image vs Nucleus-Image - Full AI Image Model Comparison & Review
In this video, we dive deep into two open-source text-to-image models: ERNIE-Image from Baidu and Nucleus-Image from Nucleus AI.
I break down their underlying architectures - from ERNIE's compact 8B parameter single-stream DiT to Nucleus-Image's massive 17B Sparse Mixture of Experts (MoE) setup. We'll put both models through a rigorous 11-prompt gauntlet testing text rendering, spatial awareness, complex anatomy, and more to see which one comes out on top!
What you’ll learn in this tutorial:
✅ Architectural breakdown of ERNIE-Image (8B DiT) and Nucleus-Image (Sparse MoE).
✅ Understanding benchmark scores (GenEval, DPG-Bench, etc.) and what they actually mean.
✅ Hardware requirements and how to run ERNIE-Image locally using GGUF quantization.
✅ Side-by-side prompt testing: Product photography, complex character placement, and spatial awareness.
✅ Stress-testing text generation: Handwritten notes, infographics, and UI/UX design mockups.
✅ Evaluating anatomy and complex poses—where do these models fall short?
Tools & Models Used:
ERNIE-Image (Baidu): Compact single-stream 8B DiT with a powerful prompt enhancer.
Nucleus-Image (Nucleus AI): Sparse MoE diffusion transformer (17B total / 2B active parameters).
ComfyUI: The ultimate node-based interface for running AI models locally.
Fal AI: Cloud inference provider used to run the massive unquantized Nucleus-Image model.
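To make the "17B total / 2B active" figure more concrete, here is a minimal sketch of generic top-k expert routing, the usual mechanism behind sparse MoE layers. It is an illustration only and not Nucleus-Image's actual router: the toy experts, the router scores, and the function names `route_top_k` and `moe_forward` are all hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return (expert_index, gate_weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalise over the chosen experts
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))

# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
scores = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for one token

selected = route_top_k(scores, k=2)
output = moe_forward(1.0, experts, scores, k=2)
```

Only the experts in `selected` are evaluated for a given token, which is why the active parameter count per step can be much smaller than the total, even though all expert weights still have to be stored.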
PC Specs:
GPU: Nvidia RTX 5060 Ti 16 GB: https://amzn.to/4rU7xRy
RAM: 64 GB (4x16 GB) Kingston Fury: https://amzn.to/473HoaG
Models Used:
ERNIE-Image Q8_0 GGUF
Nucleus-Image Base fp16
Pro Tip: When running ERNIE-Image locally in ComfyUI, using GGUF quantization (like Q8_0) can drastically reduce your VRAM requirements (down to ~8.6GB) without a massive hit to image quality, making it highly accessible for mid-range GPUs!
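A rough back-of-the-envelope check of these numbers, as a sketch rather than a measurement: it counts model weights only (no text encoder, VAE, or activations) and uses the nominal parameter counts quoted above. Q8_0 is taken as 8.5 bits per weight, since each GGUF Q8_0 block stores 32 int8 values plus one fp16 scale. The helper name `weight_gb` is hypothetical.

```python
def weight_gb(params_billion, bits_per_weight):
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return params_billion * bits_per_weight / 8

Q8_0_BITS = 8.5   # 32 int8 weights + one fp16 scale per block = 34 bytes / 32 weights
FP16_BITS = 16.0

ernie_q8 = weight_gb(8, Q8_0_BITS)      # ERNIE-Image, nominal 8B parameters
ernie_fp16 = weight_gb(8, FP16_BITS)
nucleus_fp16 = weight_gb(17, FP16_BITS)  # all experts must be stored, not just the active ones

print(f"ERNIE-Image Q8_0:   ~{ernie_q8:.1f} GB")
print(f"ERNIE-Image fp16:   ~{ernie_fp16:.1f} GB")
print(f"Nucleus-Image fp16: ~{nucleus_fp16:.1f} GB")
```

Under these assumptions the Q8_0 estimate lands close to the ~8.6GB figure above and leaves headroom on a 16 GB card, while the unquantized Nucleus-Image weights alone exceed it, which is consistent with running that model on a cloud provider.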
If you found this comparison helpful, don’t forget to Like, Subscribe, and Hit the Notification Bell for more deep dives into AI-powered design and models!
ig : https://www.instagram.com/kintugk/
x : https://x.com/gk_kintu
Contact: kintutech@gmail.com
Timestamps:
0:00 - Intro & Model Overview
0:50 - Benchmark Comparisons
1:46 - Architecture Breakdown (DiT vs Sparse MoE)
3:13 - Hugging Face, GGUF & VRAM Requirements
4:49 - Test 1: Luxury Product Photography
5:23 - Test 2: Spatial Awareness (Isometric Room)
5:53 - Test 3: Complex Character Generation (Subway)
7:04 - Test 4: Messy Handwritten Text
7:46 - Test 5: Retro VHS Tape Effect
8:39 - Test 6: Object Physics & Clocks
9:18 - Test 7: Comic Book Page Layout
10:21 - Test 8: Complex Anatomy (Yoga Pose)
11:13 - Test 9: Infographic Generation
12:18 - Test 10: App UI/UX Mockup
13:25 - Test 11: Movie Poster Design
15:13 - Final Thoughts & Conclusion
#ComfyUI #AIGeneratedArt #ERNIEImage #NucleusImage #ImageGeneration #AIWorkflow #StableDiffusion #MachineLearning #TechReview #AIArt
Video from the kintu channel.
Video information
April 21, 2026, 15:37:17
00:16:32