FLUX.2: Frontier Visual Intelligence and Diffusers Integration

The collected texts introduce Black Forest Labs' release of FLUX.2, an image generation and editing model built on a new architecture that pairs a single Mistral text encoder with an improved Diffusion Transformer (DiT). The model targets professional creative work: it generates highly detailed, photorealistic images at up to 4MP resolution and accepts multi-reference inputs (up to 10 images) to keep characters and styles consistent. Run unoptimized, the model requires over 80GB of VRAM, but the accompanying Diffusers documentation and GitHub repository describe ways to run FLUX.2 [dev], the 32B open-weight variant, on consumer hardware. The memory-saving techniques include CPU offloading and quantization such as 4-bit loading; LoRA fine-tuning is covered as well. The model also supports structured JSON prompting and integrated image editing, which positions it as a frontier tool for visual intelligence.
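To see why 4-bit loading matters for a 32B model, a rough back-of-the-envelope estimate helps. The sketch below is not taken from the FLUX.2 documentation; it counts only the weights of a 32-billion-parameter transformer and ignores the text encoder, the VAE, activations, and the small overhead that real quantization formats add for scaling factors. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes).

    Assumption: every parameter is stored at the same precision, with no
    quantization metadata and no runtime buffers.
    """
    return num_params * bits_per_param / 8 / 1e9


# 32B parameters, as stated for the open-weight FLUX.2 [dev] variant.
PARAMS_32B = 32e9

for label, bits in [("bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS_32B, bits):.0f} GB")
```

Under these assumptions the transformer weights alone drop from about 64 GB in bf16 to about 16 GB at 4 bits, which is the gap between data-center and high-end consumer GPUs. Actual usage will be higher once the other pipeline components are loaded.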

#ai

Video "FLUX.2: Frontier Visual Intelligence and Diffusers Integration" from the channel Vinh Nguyen
