
CodeBERT-base

by Microsoft

Open source · 296k downloads · 284 likes

Rating: 3.1 (284 reviews) · Embedding · API & Local
About

CodeBERT-base is a pre-trained language model designed to jointly understand programming language and natural language. It excels at tasks such as code search, generating documentation from code, and code completion by leveraging a shared representation of both modalities. Its training combines masked language modeling with replaced token detection, a discriminative objective over corrupted tokens, which lets it capture the relationships between code and its textual descriptions. The model is versatile, handling multiple programming languages while maintaining strong performance, and is particularly useful for developers and researchers who want to automate code-related tasks or build programming-assistance tools.
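Code search with an embedding model reduces to ranking candidate snippets by the similarity between the query embedding and each code embedding. A minimal sketch with made-up 4-dimensional vectors standing in for CodeBERT's 768-dimensional outputs (the helper names are illustrative, not part of any library):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rank_snippets(query_emb, snippet_embs):
    # Return snippet indices sorted by similarity to the query, best first.
    scores = [cosine(query_emb, e) for e in snippet_embs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Toy embeddings; in practice these would come from CodeBERT.
query = [0.9, 0.1, 0.0, 0.2]
snippets = [
    [0.1, 0.8, 0.3, 0.0],    # unrelated snippet
    [0.85, 0.15, 0.05, 0.2], # near-duplicate of the query
]
print(rank_snippets(query, snippets))  # → [1, 0]
```

Production systems typically precompute snippet embeddings and use an approximate nearest-neighbor index instead of this exhaustive scan.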

Documentation

CodeBERT-base

Pretrained weights for CodeBERT: A Pre-Trained Model for Programming and Natural Languages.

Training Data

The model is trained on the bi-modal data (natural-language documentation paired with code) of CodeSearchNet.

Training Objective

This model is initialized with RoBERTa-base and trained with the MLM + RTD objective (masked language modeling plus replaced token detection; cf. the paper).
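In replaced token detection, a generator corrupts some input tokens and a discriminator is trained to predict, per token, whether it was replaced. The toy sketch below shows only the supervision signal, not the actual training pipeline; the real objective samples plausible replacements from a learned generator rather than comparing strings:

```python
def rtd_labels(original, corrupted):
    # Replaced token detection target: 1 where the corrupted sequence
    # differs from the original, 0 where the token is untouched.
    assert len(original) == len(corrupted)
    return [int(o != c) for o, c in zip(original, corrupted)]

original  = ["def", "add", "(", "a", ",", "b", ")"]
corrupted = ["def", "sub", "(", "a", ",", "b", ")"]  # one token swapped
print(rtd_labels(original, corrupted))  # → [0, 1, 0, 0, 0, 0, 0]
```

Because every position gets a label (not just the masked ones, as in MLM), RTD provides a denser training signal per sequence.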

Usage

Please see the official repository for scripts that support "code search" and "code-to-document generation".
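For local feature extraction, a minimal sketch with the Hugging Face transformers library, assuming the `transformers` and `torch` packages are installed and the `microsoft/codebert-base` checkpoint is downloadable or cached:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")

# CodeBERT accepts natural language, code, or a pair of both.
code = "def max(a, b): return a if a > b else b"
inputs = tokenizer(code, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One 768-dim vector per token; the first (the <s> token) is a common
# choice for a sentence-level embedding in search or classification.
embedding = outputs.last_hidden_state[0, 0]
print(embedding.shape)  # torch.Size([768])
```

How to pool token vectors into one embedding (first token, mean pooling, etc.) is a design choice; consult the official repository's code-search scripts for the setup used in the paper.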

Reference

  1. CodeBERT trained with Masked LM objective (suitable for code completion)
  2. 🤗 Hugging Face's CodeBERTa (small size, 6 layers)

Citation

BibTeX
@misc{feng2020codebert,
    title={CodeBERT: A Pre-Trained Model for Programming and Natural Languages},
    author={Zhangyin Feng and Daya Guo and Duyu Tang and Nan Duan and Xiaocheng Feng and Ming Gong and Linjun Shou and Bing Qin and Ting Liu and Daxin Jiang and Ming Zhou},
    year={2020},
    eprint={2002.08155},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Capabilities & Tags
transformers · pytorch · tf · jax · rust · roberta · feature-extraction · endpoints_compatible
Links & Resources
Specifications
Category: Embedding
Access: API & Local
License: Open Source
Pricing: Open Source
Rating
3.1

Try CodeBERT-base

Access the model directly