How Diffusion LLM Generates Tokens - Code Explained
Github - https://github.com/OpenMOSS/LongLLaDA/tree/main
- How LLaDA generates text by refining [MASK] tokens over several steps.
- The structure of inputs: prompt + [MASK] tokens to be filled.
- How the model chooses the most confident predictions at each step.
- Why it uses softmax + Gumbel noise for stochastic sampling.
- How gradual top-k selection improves quality over greedy decoding.
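The sampling-and-selection step described above can be sketched in a few lines. This is a minimal NumPy mock-up of the logic the video walks through (the real code uses PyTorch and `torch.gather`; `np.take_along_axis` plays the same role here), not the actual LLaDA implementation:

```python
import numpy as np

def gumbel_sample(logits, temperature=1.0, rng=None):
    """Sample token ids by adding Gumbel noise to logits; equivalent to
    sampling from softmax(logits / temperature). Temperature 0 = greedy."""
    rng = rng or np.random.default_rng(0)
    if temperature == 0:
        return logits.argmax(axis=-1)
    u = rng.uniform(1e-9, 1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))  # Gumbel(0, 1) noise
    return (logits / temperature + gumbel).argmax(axis=-1)

# toy logits for 4 masked positions over a 5-token vocabulary
logits = np.array([[2.0, 0.1, 0.1, 0.1, 0.1],
                   [0.1, 3.0, 0.1, 0.1, 0.1],
                   [0.5, 0.4, 0.6, 0.5, 0.5],   # low-confidence position
                   [0.1, 0.1, 0.1, 4.0, 0.1]])
ids = gumbel_sample(logits, temperature=0)  # greedy, for a deterministic demo

# confidence = probability of the chosen token (torch.gather in the real code)
probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
conf = np.take_along_axis(probs, ids[:, None], axis=-1)[:, 0]

# keep only the k most confident predictions this step; the rest stay [MASK]
k = 2
keep = np.argsort(-conf)[:k]
```

The low-confidence position (row 2, where the logits are nearly flat) is exactly the kind of token that gets re-masked and revisited in a later refinement step.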
Code DeepSeek V3 From Scratch Full Course - https://www.youtube.com/watch?v=TfEG0TwueTs&list=PL-9_KFQd8ssI_-_lNLXNRIUavVgPW5Kgr&index=7
Support me on Patreon - https://www.patreon.com/vukrosic/membership
contact: vukrosic1@gmail.com
If you wish to fund my research, contact me via email.
0:00 - How Diffusion LLMs Predict Tokens
0:25 - The Iterative Refinement Process
1:21 - Calculating Token Probabilities
2:11 - Adding Randomness to Predictions
3:39 - Keeping High-Confidence Tokens
5:00 - Code Overview: LLaDA Generate
6:23 - Denoising Example: "The Cat Sat..."
8:05 - Step 1: Generating Logits
10:09 - Step 2: Sampling with Gumbel Noise
12:11 - Step 3: Calculating Confidence with torch.gather
18:05 - Low-Confidence Re-masking Strategy
21:26 - Block-by-Block Generation
25:03 - Updating Only Masked Tokens
29:53 - The Full Refinement Loop
33:34 - The Token Transfer Schedule
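The chapters above cover one full generation pass: predict all masked positions, keep the most confident tokens, re-mask the rest, and repeat on a fixed transfer schedule. A compressed NumPy sketch of that loop, with a hypothetical `toy_model` standing in for the real LLaDA forward pass:

```python
import numpy as np

MASK = -1  # hypothetical mask token id for this sketch

def toy_model(x, vocab=5, rng=np.random.default_rng(0)):
    """Stand-in for the LLaDA forward pass: random logits per position."""
    return rng.normal(size=(len(x), vocab))

def diffusion_generate(prompt, gen_len=6, steps=3):
    # input structure: prompt tokens followed by gen_len [MASK] tokens
    x = np.concatenate([prompt, np.full(gen_len, MASK)])
    per_step = gen_len // steps  # token transfer schedule: unmask k per step
    for _ in range(steps):
        mask_idx = np.where(x == MASK)[0]
        if len(mask_idx) == 0:
            break
        logits = toy_model(x)[mask_idx]         # predict masked positions only
        probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
        ids = probs.argmax(-1)                  # greedy here, for simplicity
        conf = np.take_along_axis(probs, ids[:, None], -1)[:, 0]
        keep = np.argsort(-conf)[:per_step]     # unmask the top-k confident
        x[mask_idx[keep]] = ids[keep]           # update only masked tokens
    return x

out = diffusion_generate(np.array([7, 8]), gen_len=6, steps=3)
```

After `steps` iterations every mask is resolved, while the prompt tokens are never overwritten; the real code additionally processes the response block by block rather than all at once.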
Video "How Diffusion LLM Generates Tokens - Code Explained" from the Vuk Rosić channel
Video information: June 26, 2025, 1:55:43. Duration: 00:35:49