by GSAI-ML
Open source · 202k downloads · 96 likes
LLaDA 8B Base is an 8-billion-parameter language model trained entirely from scratch and designed to rival models such as LLaMA3 8B in performance. It generates coherent, natural text and is versatile across a range of linguistic tasks, including automated writing, translation, text data analysis, and conversational assistance. It aims to balance capability with efficiency, which makes it suitable for applications that require deep contextual understanding.
We introduce LLaDA, a diffusion model at an unprecedented 8B scale, trained entirely from scratch, that rivals LLaMA3 8B in performance.
[2025-10-21] We have updated modeling_llada.py to accept attention_mask as an input.
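As a minimal sketch of what an attention_mask input typically looks like, the snippet below pads a batch of token-id sequences to a common length and builds the matching mask. It assumes the common convention of 1 for real tokens and 0 for padding, and it assumes left-padding; the helper name pad_batch and the pad_id value are hypothetical and are not taken from modeling_llada.py.

```python
from typing import List, Tuple


def pad_batch(
    sequences: List[List[int]], pad_id: int
) -> Tuple[List[List[int]], List[List[int]]]:
    """Left-pad token-id sequences to equal length and build the matching mask.

    Assumption: mask value 1 marks a real token, 0 marks padding.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        # Padding goes on the left so real tokens stay right-aligned.
        input_ids.append([pad_id] * n_pad + seq)
        attention_mask.append([0] * n_pad + [1] * len(seq))
    return input_ids, attention_mask


# Two prompts of different lengths; pad_id=0 is a placeholder value.
ids, mask = pad_batch([[5, 6, 7], [8]], pad_id=0)
```

In practice a tokenizer would normally produce both lists, and the mask would be converted to a tensor before being passed to the model alongside the input ids.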