par deepseek-ai
Open source · 151k downloads · 490 likes
DeepSeek Coder 6.7B Instruct est un modèle d'intelligence artificielle spécialisé dans la génération et la compréhension de code, optimisé pour assister les développeurs dans leurs tâches quotidiennes. Entraîné sur un vaste corpus mêlant 87 % de code et 13 % de texte en anglais et en chinois, il excelle dans la complétion et l'inférence de code à l'échelle de projets entiers, grâce à une fenêtre de contexte de 16 000 tokens. Ce modèle se distingue par ses performances de pointe sur des benchmarks reconnus, offrant une alternative puissante et flexible aux solutions propriétaires, avec des tailles adaptables selon les besoins. Il est particulièrement utile pour automatiser des tâches répétitives, générer des extraits de code, ou même résoudre des problèmes complexes en programmation. Son approche open-source et sa licence commerciale en font un outil accessible et polyvalent pour les professionnels comme pour les passionnés.
[🏠Homepage] | [🤖 Chat with DeepSeek Coder] | [Discord] | [Wechat(微信)]
Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.
Massive Training Data: Trained from scratch fon 2T tokens, including 87% code and 13% linguistic data in both English and Chinese languages.
Highly Flexible & Scalable: Offered in model sizes of 1.3B, 5.7B, 6.7B, and 33B, enabling users to choose the setup most suitable for their requirements.
Superior Model Performance: State-of-the-art performance among publicly available code models on HumanEval, MultiPL-E, MBPP, DS-1000, and APPS benchmarks.
Advanced Code Completion Capabilities: A window size of 16K and a fill-in-the-blank task, supporting project-level code completion and infilling tasks.
deepseek-coder-6.7b-instruct is a 6.7B parameter model initialized from deepseek-coder-6.7b-base and fine-tuned on 2B tokens of instruction data.
Here give some examples of how to use our model.
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()
messages=[
{ 'role': 'user', 'content': "write a quick sort algorithm in python."}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
# tokenizer.eos_token_id is the id of <|EOT|> token
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, top_k=50, top_p=0.95, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
This code repository is licensed under the MIT License. The use of DeepSeek Coder models is subject to the Model License. DeepSeek Coder supports commercial use.
See the LICENSE-MODEL for more details.
If you have any questions, please raise an issue or contact us at [email protected].