AI ExplorerAI Explorer
OutilsCatégoriesSitesLLMsComparerQuiz IAAlternativesPremium

—

Outils IA

—

Sites & Blogs

—

LLMs & Modèles

—

Catégories

AI Explorer

Trouvez et comparez les meilleurs outils d'intelligence artificielle pour vos projets.

Fait avecen France

Explorer

  • Tous les outils
  • Sites & Blogs
  • LLMs & Modèles
  • Comparer
  • Chatbots
  • Images IA
  • Code & Dev

Entreprise

  • Premium
  • À propos
  • Contact
  • Blog

Légal

  • Mentions légales
  • Confidentialité
  • CGV

© 2026 AI Explorer. Tous droits réservés.

AccueilLLMsDeepSeek V3.2 Exp

DeepSeek V3.2 Exp

par deepseek-ai

Open source · 191k downloads · 981 likes

3.7
(981 avis)ChatAPI & Local
À propos

DeepSeek V3.2 Exp est une version expérimentale du modèle DeepSeek, conçue comme une étape intermédiaire vers une nouvelle architecture. Il intègre une innovation majeure : le *DeepSeek Sparse Attention*, un mécanisme d'attention éparse qui optimise l'efficacité des calculs lors du traitement de longs contextes textuels, sans compromettre la qualité des réponses. Le modèle conserve des performances comparables à ses prédécesseurs tout en réduisant significativement les coûts de calcul, tant pour l'entraînement que pour l'inférence. Idéal pour les applications nécessitant une gestion avancée de longues séquences, comme l'analyse de documents étendus ou les conversations prolongées, il se distingue par son approche pionnière en matière d'attention éparse. Cette version reflète l'engagement de DeepSeek à repousser les limites des architectures de transformers plus efficaces.

Documentation

DeepSeek-V3.2-Exp

DeepSeek-V3

Homepage Chat Hugging Face
Discord Wechat Twitter Follow
License

Introduction

We are excited to announce the official release of DeepSeek-V3.2-Exp, an experimental version of our model. As an intermediate step toward our next-generation architecture, V3.2-Exp builds upon V3.1-Terminus by introducing DeepSeek Sparse Attention—a sparse attention mechanism designed to explore and validate optimizations for training and inference efficiency in long-context scenarios.

This experimental release represents our ongoing research into more efficient transformer architectures, particularly focusing on improving computational efficiency when processing extended text sequences.

  • DeepSeek Sparse Attention (DSA) achieves fine-grained sparse attention for the first time, delivering substantial improvements in long-context training and inference efficiency while maintaining virtually identical model output quality.

  • To rigorously evaluate the impact of introducing sparse attention, we deliberately aligned the training configurations of DeepSeek-V3.2-Exp with V3.1-Terminus. Across public benchmarks in various domains, DeepSeek-V3.2-Exp demonstrates performance on par with V3.1-Terminus.

BenchmarkDeepSeek-V3.1-TerminusDeepSeek-V3.2-Exp
Reasoning Mode w/o Tool Use
MMLU-Pro85.085.0
GPQA-Diamond80.779.9
Humanity's Last Exam21.719.8
LiveCodeBench74.974.1
AIME 202588.489.3
HMMT 202586.183.6
Codeforces20462121
Aider-Polyglot76.174.5
Agentic Tool Use
BrowseComp38.540.1
BrowseComp-zh45.047.9
SimpleQA96.897.1
SWE Verified68.467.8
SWE-bench Multilingual57.857.9
Terminal-bench36.737.7

Update

  • 2025.11.17: We have identified that previous versions of the inference demo code contained an implementation discrepancy in Rotary Position Embedding (RoPE) within the indexer module, potentially leading to degraded model performance. Specifically, the input tensor to RoPE in the indexer module requires a non-interleaved layout, whereas RoPE in the MLA module expects an interleaved layout. This issue has now been resolved. Please refer to the updated version of the inference demo code and take note of this implementation detail.

How to Run Locally

HuggingFace

We provide an updated inference demo code in the inference folder to help the community quickly get started with our model and understand its architectural details.

First convert huggingface model weights to the the format required by our inference demo. Set MP to match your available GPU count:

Bash
cd inference
export EXPERTS=256
python convert.py --hf-ckpt-path ${HF_CKPT_PATH} --save-path ${SAVE_PATH} --n-experts ${EXPERTS} --model-parallel ${MP}

Launch the interactive chat interface and start exploring DeepSeek's capabilities:

Bash
export CONFIG=config_671B_v3.2.json
torchrun --nproc-per-node ${MP} generate.py --ckpt-path ${SAVE_PATH} --config ${CONFIG} --interactive

SGLang

Installation with Docker

Bash
# H200
docker pull lmsysorg/sglang:dsv32

# MI350
docker pull lmsysorg/sglang:dsv32-rocm

# NPUs
docker pull lmsysorg/sglang:dsv32-a2
docker pull lmsysorg/sglang:dsv32-a3

Launch Command

Bash
python -m sglang.launch_server --model deepseek-ai/DeepSeek-V3.2-Exp --tp 8 --dp 8 --enable-dp-attention

vLLM

vLLM provides day-0 support of DeepSeek-V3.2-Exp. See the recipes for up-to-date details.

Open-Source Kernels

For TileLang kernels with better readability and research-purpose design, please refer to TileLang.

For high-performance CUDA kernels, indexer logit kernels (including paged versions) are available in DeepGEMM. Sparse attention kernels are released in FlashMLA.

License

This repository and the model weights are licensed under the MIT License.

Citation

INI
@misc{deepseekai2024deepseekv32,
      title={DeepSeek-V3.2-Exp: Boosting Long-Context Efficiency with DeepSeek Sparse Attention}, 
      author={DeepSeek-AI},
      year={2025},
}

Contact

If you have any questions, please raise an issue or contact us at [email protected].

Liens & Ressources
Spécifications
CatégorieChat
AccèsAPI & Local
LicenceOpen Source
TarificationOpen Source
Note
3.7

Essayer DeepSeek V3.2 Exp

Accédez directement au modèle