(FREE) How To Create The Video of Cooking Cat with ComfyUI, Flux and Wan2.1
Hey everyone!
Today, I’m going to show you how to create a viral video about a cat that can cook, completely free!
No need to pay for fancy AI tools like Kling AI, Hailuo, or Runway. All you need is a computer and a bit of creativity!
For this tutorial, I’ll be using ComfyUI along with Flux, Wan2.1, and GPT to generate the video.
Step 1: Setting Up ComfyUI.
First, install ComfyUI using the link on the screen. Then, go to settings and update it. Load the workflow, download the Flux and Wan2.1 models from the links provided at the end of this video, install any missing nodes, and you’re ready to create!
Step 2: Generating Images with AI.
Our video will feature a cat frying an egg.
I used ChatGPT to generate six prompts covering the entire cooking process:
Choosing an egg, whisking the eggs, adding oil to the pan, frying the eggs, garnishing the fried eggs on a plate, and finally eating the delicious fried egg.
In just seconds, GPT provided six perfect prompts!
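The six prompts themselves are not included in this description. As a rough sketch of how the six scenes could be turned into prompts that keep the same subject and style, the snippet below uses a made-up template. The `SUBJECT` and `STYLE` strings and the `build_prompts` helper are assumptions for illustration, not the prompts ChatGPT actually produced.

```python
# Hypothetical prompt template; the wording below is an assumption,
# not the actual text ChatGPT generated for the video.
SUBJECT = "a fluffy orange cat wearing a chef's hat"
STYLE = "in a cozy kitchen, photorealistic, soft natural light"

# The six scenes listed in the tutorial, in cooking order.
SCENES = [
    "choosing an egg",
    "whisking the eggs",
    "adding oil to the pan",
    "frying the eggs",
    "garnishing the fried eggs on a plate",
    "eating the delicious fried egg",
]


def build_prompts(subject: str, style: str, scenes: list[str]) -> list[str]:
    """Combine one fixed subject and style with each scene description."""
    return [f"{subject} {scene}, {style}" for scene in scenes]


prompts = build_prompts(SUBJECT, STYLE, SCENES)
for number, prompt in enumerate(prompts, start=1):
    print(f"Prompt {number}: {prompt}")
```

Keeping the subject and style fixed across all six prompts is one way to help the cat look the same from clip to clip.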
Step 3: Creating the Video.
Using ComfyUI, I built a workflow consisting of:
Image generation with Flux
Image-to-video conversion with Wan2.1
Final upscaling and smoothing with Model Upscale & RIFE VFI
1. Generating Images with Flux
I used the Flux1-dev model with DualClip and boosted realism with the Flux realism LoRA in Group Node 1. Then I entered the first prompt from GPT into Group Node 2 and set the resolution to 1024x576.
Within seconds, Group Node 3, using the XLabs Sampler and VAE, produced a crisp image of the cat selecting an egg. You can preview it in Group Node 4.
2. Converting Images to Video.
In Group Node 5, I loaded the Wan2.1 diffusion model with ClipVision.
The cat image generated by Flux was processed through Group Node 6, where I reused the same positive prompt as in Flux and added a negative prompt to avoid unwanted details.
I set the resolution to 1024x576 and the length to 49 frames, enough for a 6-second clip once the frames are interpolated in the next step.
After about 10 minutes on an RTX 4090 GPU, the result appeared in Group Node 7, but it wasn’t perfect yet. The video looked a bit choppy and low-res, mainly because I had set the frame rate to only 12 fps and the resolution to 1024x576.
3. Upscaling & Smoothing.
To fix this:
In Group Node 8, I upscaled all frames to Full HD 1920x1080.
Then, I tripled the number of frames using RIFE VFI with the multiplier set to 3 and played them back at 24 fps.
And voilà! A smooth, sharp, Full HD quality video, all done without paid AI tools!
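The numbers in this step can be checked with a little arithmetic. The sketch below assumes that frame interpolation inserts `multiplier - 1` new frames between each pair of original frames (the exact output count of the RIFE VFI node may differ slightly) and that the interpolated frames are played back at 24 fps. The helper names are made up for illustration.

```python
def interpolated_frame_count(frames: int, multiplier: int) -> int:
    """Frames after interpolation, assuming (multiplier - 1) new frames
    are inserted between every pair of neighbouring source frames."""
    return (frames - 1) * multiplier + 1


def duration_seconds(frames: int, fps: float) -> float:
    """Playback time of a clip with the given frame count and frame rate."""
    return frames / fps


# Values taken from the tutorial.
source_frames = 49
source_fps = 12
multiplier = 3
output_fps = 24
source_size = (1024, 576)
target_size = (1920, 1080)

raw_duration = duration_seconds(source_frames, source_fps)
smooth_frames = interpolated_frame_count(source_frames, multiplier)
smooth_duration = duration_seconds(smooth_frames, output_fps)

# Upscale factor per axis; equal factors mean the aspect ratio is preserved.
scale_x = target_size[0] / source_size[0]
scale_y = target_size[1] / source_size[1]

print(f"Raw clip: {source_frames} frames at {source_fps} fps = {raw_duration:.2f} s")
print(f"After RIFE: {smooth_frames} frames at {output_fps} fps = {smooth_duration:.2f} s")
print(f"Upscale factor: {scale_x} x {scale_y}")
```

Under these assumptions, 49 frames become 145 frames, which play for about 6 seconds at 24 fps, and both axes scale by the same factor of 1.875, so the 16:9 aspect ratio is preserved.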
Repeating this for the remaining five prompts, I ended up with six high-quality video clips, just as impressive as those made with Kling AI, Hailuo, or Runway.
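Repeating the workflow by hand for each prompt works, but the runs could also be queued from a script. The sketch below assumes the workflow has been exported from ComfyUI in API format and that ComfyUI is listening on its default local address. The node ID `"6"`, the stand-in workflow dictionary, and the helper names are hypothetical and would need to match the real exported file.

```python
import copy
import json
import urllib.request

# Assumed default address of a local ComfyUI server.
COMFYUI_URL = "http://127.0.0.1:8188/prompt"

# Hypothetical ID of the positive-prompt text node in the exported workflow.
PROMPT_NODE_ID = "6"


def with_prompt(workflow: dict, node_id: str, text: str) -> dict:
    """Return a copy of the workflow with one text node's prompt replaced."""
    updated = copy.deepcopy(workflow)
    updated[node_id]["inputs"]["text"] = text
    return updated


def queue_workflow(workflow: dict, url: str = COMFYUI_URL) -> None:
    """Send one workflow to ComfyUI's queue as a JSON POST request."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(request)


# Tiny stand-in for an exported workflow; a real one has many more nodes,
# and the node type used for the prompt may differ.
base_workflow = {
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": ""}},
}

scene_prompts = ["a cat choosing an egg", "a cat whisking the eggs"]
jobs = [with_prompt(base_workflow, PROMPT_NODE_ID, p) for p in scene_prompts]

# Uncomment to queue the jobs on a running ComfyUI instance:
# for job in jobs:
#     queue_workflow(job)
```

In this tutorial's workflow the same positive prompt is used for both Flux and Wan2.1, so a real script would likely need to update two text nodes per job.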
Step 4: Adding Sound Effects.
For sound, I searched YouTube with keywords like:
"Kitchen Ambience", "Pick up things", "Pouring drink into cup", "Stirring big pot of water", "Eggs frying", "Seasoning shake", "Cat eating".
Then, I downloaded the sound effects using y2mate.com.
Step 5: Editing the Video.
Now that we have the visuals and audio, it’s time to edit the video using the free software CapCut.
Simply drag and drop the footage and sound effects, add a fitting background track, tweak the clip durations, and export the final video in FHD.
And here’s the finished product!
Hope you guys found this tutorial helpful. Happy creating, and see you in the next video!
------------------------------------------------
Links:
- Workflow and Resources:
https://drive.google.com/drive/folders/1840puXdwEdMu4hNUHfpY5vYLLnvgB2wF?usp=sharing
- Flux model flux1-dev:
https://huggingface.co/realung/flux1-dev.safetensors/tree/main
- Dual clip models:
clip_l.safetensors (download to text encoder)
https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/clip_l.safetensors
t5xxl_fp16.safetensors (download to text encoder)
https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/t5xxl_fp16.safetensors
- Flux realism lora:
https://huggingface.co/XLabs-AI/flux-RealismLora/tree/main
- VAE:
https://huggingface.co/lovis93/testllm/tree/main
- Wan2.1 model (download to diffusion_models)
https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/diffusion_models/wan2.1_i2v_720p_14B_fp16.safetensors
- Wan Clip:
https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/t5xxl_fp8_e4m3fn.safetensors
- Wan clip vision h:
https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/clip_vision/clip_vision_h.safetensors
- Wan 2.1 VAE:
https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/vae/wan_2.1_vae.safetensors
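Before loading the workflow, it can help to confirm that each downloaded file sits where ComfyUI will look for it. The sketch below assumes the usual `models` subfolders of a ComfyUI install (`diffusion_models`, `text_encoders`, `clip_vision`, `vae`). The `find_missing` helper and the example install path are assumptions, and only files whose exact names appear in the links above are checked.

```python
from pathlib import Path

# Expected locations, relative to the ComfyUI "models" folder.
# Folder names are an assumption based on a typical ComfyUI layout.
EXPECTED_FILES = {
    "text_encoders": [
        "clip_l.safetensors",
        "t5xxl_fp16.safetensors",
        "t5xxl_fp8_e4m3fn.safetensors",
    ],
    "diffusion_models": ["wan2.1_i2v_720p_14B_fp16.safetensors"],
    "clip_vision": ["clip_vision_h.safetensors"],
    "vae": ["wan_2.1_vae.safetensors"],
}


def find_missing(models_dir: Path, expected: dict[str, list[str]]) -> list[Path]:
    """Return the expected model files that do not exist under models_dir."""
    missing = []
    for folder, names in expected.items():
        for name in names:
            candidate = models_dir / folder / name
            if not candidate.is_file():
                missing.append(candidate)
    return missing


if __name__ == "__main__":
    # Hypothetical install location; adjust to your own setup.
    models_dir = Path("ComfyUI") / "models"
    for path in find_missing(models_dir, EXPECTED_FILES):
        print(f"Missing: {path}")
```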
Video: (FREE) How To Create The Video of Cooking Cat with ComfyUI, Flux and Wan2.1, from the FutuTek channel
Video information:
Date: March 10, 2025, 21:40:01
Duration: 00:04:37