mdbr leaf mt

by MongoDB

Open source · 23k downloads · 26 likes

Rating: 1.8 (26 reviews) · Embedding · API & Local

About

**mdbr-leaf-mt** is a compact, high-performance text embedding model designed for classification, clustering, semantic sentence similarity, and summarization. It supports asymmetric architectures and stays robust under compression such as vector quantization and MRL truncation. On the MTEB v2 (English) benchmark, it ranks first among models with at most 30 million parameters. For semantic search and information retrieval, the authors recommend the separate mdbr-leaf-ir model. The model was developed by the ML team of MongoDB Research, and its training procedure, LEAF, is described in a technical report.

Documentation
MongoDB/mdbr-leaf-mt

Content

  1. Introduction
  2. Technical Report
  3. Highlights
  4. Benchmarks
  5. Quickstart
  6. Citation

Introduction

mdbr-leaf-mt is a compact high-performance text embedding model designed for classification, clustering, semantic sentence similarity and summarization tasks.

To enable even greater efficiency, mdbr-leaf-mt supports flexible asymmetric architectures and is robust to vector quantization and MRL truncation.

If you are looking to perform semantic search / information retrieval (e.g. for RAG), please check out our mdbr-leaf-ir model, which is specifically trained for these tasks.

Note: This model was developed by the ML team of MongoDB Research. At the time of writing, it is not used in any of MongoDB's commercial product or service offerings.

Technical Report

A technical report detailing our proposed LEAF training procedure is available on arXiv: https://arxiv.org/abs/2509.12539

Highlights

  • State-of-the-Art Performance: mdbr-leaf-mt achieves new state-of-the-art results for compact embedding models, ranking #1 on the public MTEB v2 (Eng) benchmark leaderboard for models with ≤30M parameters.
  • Flexible Architecture Support: mdbr-leaf-mt supports asymmetric retrieval architectures, which can further improve retrieval results. See the Asymmetric Retrieval Setup section below.
  • MRL and Quantization Support: embedding vectors generated by mdbr-leaf-mt compress well when truncated (MRL) and can be stored using more efficient types like int8 and binary. See the MRL Truncation and Vector Quantization sections below.

Benchmark Comparison

The table below shows the scores for mdbr-leaf-mt on the MTEB v2 (English) benchmark, compared to other embedding models.

mdbr-leaf-mt ranks #1 on this benchmark for models with ≤30M parameters.

Model                                  Size     MTEB v2 (Eng)
OpenAI text-embedding-3-large          Unknown  66.43
OpenAI text-embedding-3-small          Unknown  64.56
mdbr-leaf-mt                           23M      63.97
gte-small                              33M      63.22
snowflake-arctic-embed-s               32M      61.59
e5-small-v2                            33M      61.32
granite-embedding-small-english-r2     47M      61.07
all-MiniLM-L6-v2                       22M      59.03

Quickstart

Sentence Transformers

Python
from sentence_transformers import SentenceTransformer  
  
# Load the model  
model = SentenceTransformer("MongoDB/mdbr-leaf-mt")  
  
# Example queries and documents  
queries = [
    "What is machine learning?",  
    "How does neural network training work?"  
]  
  
documents = [  
    "Machine learning is a subset of artificial intelligence that focuses on algorithms that can learn from data.",  
    "Neural networks are trained through backpropagation, adjusting weights to minimize prediction errors."  
]  
  
# Encode queries and documents  
query_embeddings = model.encode(queries, prompt_name="query")  
document_embeddings = model.encode(documents)  
  
# Compute similarity scores  
scores = model.similarity(query_embeddings, document_embeddings)  

# Print results
for i, query in enumerate(queries):
    print(f"Query: {query}")
    for j, doc in enumerate(documents):
        print(f" Similarity: {scores[i, j]:.4f} | Document {j}: {doc[:80]}...")

Example output:
Query: What is machine learning?
 Similarity: 0.9063 | Document 0: Machine learning is a subset of ...
 Similarity: 0.7287 | Document 1: Neural networks are trained ...

Query: How does neural network training work?
 Similarity: 0.6725 | Document 0: Machine learning is a subset of ...
 Similarity: 0.8287 | Document 1: Neural networks are trained ...
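
As a rough illustration of what the similarity step above computes, the sketch below builds a query-by-document cosine similarity matrix in plain Python. It assumes the model's similarity function is cosine similarity (the Sentence Transformers default). The helper name `cosine_similarity_matrix` and the toy 3-dimensional vectors are hypothetical stand-ins, not real mdbr-leaf-mt embeddings.

```python
import math


def cosine_similarity_matrix(queries, documents):
    """Return a len(queries) x len(documents) matrix of cosine similarities."""

    def normalize(vec):
        # Scale a vector to unit length so a dot product equals the cosine.
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]

    q_unit = [normalize(q) for q in queries]
    d_unit = [normalize(d) for d in documents]
    return [[sum(a * b for a, b in zip(q, d)) for d in d_unit] for q in q_unit]


# Toy vectors standing in for real embeddings (made-up values).
query_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
doc_vecs = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]

scores = cosine_similarity_matrix(query_vecs, doc_vecs)
for row in scores:
    print([round(s, 4) for s in row])
# [1.0, 0.0]
# [0.0, 0.7071]
```

Row i, column j holds the score for query i against document j, which is the same layout as `scores[i, j]` in the Quickstart loop.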

Transformers.js

If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

Bash
npm i @huggingface/transformers

You can then use the model to compute embeddings like this:

Js
import { AutoModel, AutoTokenizer, matmul } from "@huggingface/transformers";

// Download from the 🤗 Hub
const model_id = "MongoDB/mdbr-leaf-mt";
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const model = await AutoModel.from_pretrained(model_id, {
    dtype: "fp32", // Options: "fp32" | "fp16" | "q8" | "q4" | "q4f16"
});

// Prepare queries and documents
const queries = [
    "What is machine learning?",
    "How does neural network training work?",
];
const documents = [  
    "Machine learning is a subset of artificial intelligence that focuses on algorithms that can learn from data.",
    "Neural networks are trained through backpropagation, adjusting weights to minimize prediction errors.",
];
const inputs = await tokenizer([
    ...queries.map((x) => "Represent this sentence for searching relevant passages: " + x),
    ...documents,
], { padding: true });

// Generate embeddings
const { sentence_embedding } = await model(inputs);
const normalized_sentence_embedding = sentence_embedding.normalize();

// Compute similarities
const scores = await matmul(
    normalized_sentence_embedding.slice([0, queries.length]),
    normalized_sentence_embedding.slice([queries.length, null]).transpose(1, 0),
);
const scores_list = scores.tolist();

for (let i = 0; i < queries.length; ++i) {
    console.log(`Query: ${queries[i]}`);
    for (let j = 0; j < documents.length; ++j) {
        console.log(` Similarity: ${scores_list[i][j].toFixed(4)} | Document ${j}: ${documents[j]}`);
    }
    console.log();
}

Example output:
Query: What is machine learning?
 Similarity: 0.9063 | Document 0: Machine learning is a subset of artificial intelligence that focuses on algorithms that can learn from data.
 Similarity: 0.7287 | Document 1: Neural networks are trained through backpropagation, adjusting weights to minimize prediction errors.

Query: How does neural network training work?
 Similarity: 0.6725 | Document 0: Machine learning is a subset of artificial intelligence that focuses on algorithms that can learn from data.
 Similarity: 0.8287 | Document 1: Neural networks are trained through backpropagation, adjusting weights to minimize prediction errors.

Transformers Usage

See here.

Asymmetric Retrieval Setup

Note: A version of this asymmetric setup, conveniently packaged into a single model, is available here.

mdbr-leaf-mt is aligned to mxbai-embed-large-v1, the model it has been distilled from, making the asymmetric system below possible:

Python
from sentence_transformers import SentenceTransformer

# `queries` and `documents` are the lists defined in the Quickstart example

# Use mdbr-leaf-mt for query encoding (real-time, low latency)
query_model = SentenceTransformer("MongoDB/mdbr-leaf-mt")
query_embeddings = query_model.encode(queries, prompt_name="query")

# Use a larger model for document encoding (one-time, at index time)
doc_model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
document_embeddings = doc_model.encode(documents)

# Compute similarities
scores = query_model.similarity(query_embeddings, document_embeddings)

Retrieval results from asymmetric mode are usually superior to the standard mode above.
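
To make the index-time versus query-time split concrete, here is a minimal sketch of a lookup over precomputed document vectors. It is an illustration under assumptions, not part of the model card: the `top_k` helper, the document ids, and the toy unit-length vectors are all hypothetical, and a real system would take the stored vectors from the large model and the query vector from mdbr-leaf-mt.

```python
import heapq


def top_k(query_vec, doc_index, k=2):
    """Return the k (doc_id, score) pairs with the highest dot product."""
    scored = (
        (doc_id, sum(q * d for q, d in zip(query_vec, vec)))
        for doc_id, vec in doc_index.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Index time: vectors computed once by the larger model and stored
# (toy unit-length values, hypothetical ids).
doc_index = {
    "doc-ml": [0.8, 0.6, 0.0],
    "doc-nn": [0.0, 0.6, 0.8],
    "doc-db": [1.0, 0.0, 0.0],
}

# Query time: a vector from the small model, assumed to live in the same space.
query_vec = [0.0, 0.8, 0.6]

results = top_k(query_vec, doc_index, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.2f}")
# doc-nn: 0.96
# doc-ml: 0.48
```

Only the query side runs per request, which is why the smaller model is the one placed on the latency-critical path.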

MRL Truncation

Embeddings have been trained via MRL and can be truncated for more efficient storage:

Python
query_embeds = model.encode(queries, prompt_name="query", truncate_dim=256)
doc_embeds = model.encode(documents, truncate_dim=256)

similarities = model.similarity(query_embeds, doc_embeds)

print('After MRL:')
print(f"* Embeddings dimension: {query_embeds.shape[1]}")
print(f"* Similarities: \n\t{similarities}")

Example output:
After MRL:
* Embeddings dimension: 256
* Similarities:
    tensor([[0.9164, 0.7219],
            [0.6682, 0.8393]], device='cuda:0')
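
Conceptually, MRL truncation keeps only the leading components of each vector. The sketch below shows that step in plain Python, followed by a rescale to unit length. The rescale is an assumption about how you might store the result: cosine similarity rescales vectors anyway, so it mainly matters if truncated vectors are later compared with a plain dot product. The helper name and the toy 4-dimensional vector are hypothetical.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components, then rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]


full = [0.5, 0.5, 0.5, 0.5]  # toy unit-length 4-d "embedding"
short = truncate_and_normalize(full, 2)

print(len(short), [round(x, 4) for x in short])
# 2 [0.7071, 0.7071]
```

Storage scales linearly with the kept dimension, so halving the dimension halves the bytes per vector at a fixed element type.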

Vector Quantization

Vector quantization, for example to int8 or binary, can be performed as follows:

Note: For vector quantization to types other than binary, we suggest performing a calibration to determine the optimal ranges; see here. Good initial values are -1.0 and +1.0.

Python
from sentence_transformers.quantization import quantize_embeddings
import torch

query_embeds = model.encode(queries, prompt_name="query")
doc_embeds = model.encode(documents)

# Quantize embeddings to int8 using -1.0 and +1.0
ranges = torch.tensor([[-1.0], [+1.0]]).expand(2, query_embeds.shape[1]).cpu().numpy()
query_embeds = quantize_embeddings(query_embeds, "int8", ranges=ranges)
doc_embeds = quantize_embeddings(doc_embeds, "int8", ranges=ranges)

# Calculate similarities; cast to int64 to avoid int8 under/overflow
similarities = query_embeds.astype("int64") @ doc_embeds.astype("int64").T

print('After quantization:')
print(f"* Embeddings type: {query_embeds.dtype}")
print(f"* Similarities: \n{similarities}")

Example output:
After quantization:
* Embeddings type: int8
* Similarities:
   [[2202032 1422868]
    [1421197 1845580]]
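
The calibration mentioned in the note, and the binary option, can be sketched in plain Python as follows. This is a generic illustration, not the `quantize_embeddings` implementation: the helpers `calibrate_ranges`, `quantize_int8`, `quantize_binary`, and `hamming_distance` are hypothetical, the linear bucket mapping is an assumed scheme, and the tiny vectors are made-up values. It also assumes every dimension has a non-zero range in the calibration set.

```python
def calibrate_ranges(calibration_vecs):
    """Per-dimension (lows, highs) observed over a calibration set."""
    dims = range(len(calibration_vecs[0]))
    lows = [min(v[i] for v in calibration_vecs) for i in dims]
    highs = [max(v[i] for v in calibration_vecs) for i in dims]
    return lows, highs


def quantize_int8(vec, lows, highs):
    """Map each component linearly from [low, high] onto -128..127."""
    out = []
    for x, lo, hi in zip(vec, lows, highs):
        step = (hi - lo) / 255  # assumes hi > lo
        bucket = round((x - lo) / step) - 128
        out.append(max(-128, min(127, bucket)))  # clip out-of-range values
    return out


def quantize_binary(vec):
    """One bit per dimension: 1 if the component is positive, else 0."""
    return [1 if x > 0 else 0 for x in vec]


def hamming_distance(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Calibrate on a (toy) sample of embeddings, then quantize new vectors.
lows, highs = calibrate_ranges([[-1.0, -0.5], [1.0, 0.5]])
print(quantize_int8([1.0, -0.5], lows, highs))  # [127, -128]
print(quantize_int8([2.0, 0.5], lows, highs))   # [127, 127] (first value clipped)

# Binary codes are compared by Hamming distance (lower means more similar).
bits_a = quantize_binary([0.3, -0.2, 0.0, 0.9])
bits_b = quantize_binary([0.1, 0.4, -0.5, 0.7])
print(bits_a, bits_b, hamming_distance(bits_a, bits_b))
# [1, 0, 0, 1] [1, 1, 0, 1] 1
```

Calibrated ranges that are narrower than the fixed -1.0 to +1.0 starting values use more of the 256 available buckets, at the cost of clipping values that fall outside the calibration sample.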

Evaluation

Please see here.

Citation

If you use this model in your work, please cite:

Bibtex
@misc{mdbr_leaf,
      title={LEAF: Knowledge Distillation of Text Embedding Models with Teacher-Aligned Representations}, 
      author={Robin Vujanic and Thomas Rueckstiess},
      year={2025},
      eprint={2509.12539},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2509.12539}, 
}

License

This model is released under the Apache 2.0 License.

Contact

For questions or issues, please open an issue or pull request. You can also contact the MongoDB ML Research team at [email protected].

Specifications
Category: Embedding
Access: API & Local
License: Open Source
Pricing: Open Source
Rating: 1.8
