AI Explorer

Find and compare the best artificial intelligence tools for your projects.



contriever msmarco

by facebook

Open source · 32k downloads · 33 likes

Rating: 1.9 (33 reviews) · Embedding · API & Local

About

Contriever MS MARCO is a language model specialized in dense information retrieval, optimized to generate high-quality embeddings from text. It is a refined version of the pre-trained Contriever model, designed to enhance performance in unsupervised information retrieval tasks. Through its contrastive learning approach, it excels at capturing the semantic nuances of sentences, enabling precise matching between queries and documents. This model is particularly well-suited for advanced search systems, recommendation engines, or conversational assistants requiring a nuanced understanding of language. What sets it apart is its ability to operate effectively without labeled training data while delivering robust results across varied scenarios. Its architecture makes it versatile for applications where the relevance and accuracy of results are critical.

Documentation

This model is a fine-tuned version of the pre-trained Contriever model, available at https://huggingface.co/facebook/contriever, following the approach described in "Towards Unsupervised Dense Information Retrieval with Contrastive Learning". The associated GitHub repository is available at https://github.com/facebookresearch/contriever.
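The contrastive objective described in that paper is an InfoNCE-style loss with in-batch negatives: each query is scored against every passage in the batch, and only the diagonal (matching) pair counts as the positive. A minimal sketch of the idea follows; this is an illustrative reimplementation, not the repository's training code, and the temperature value is an assumption:

```python
import torch
import torch.nn.functional as F

def info_nce(query_embs, passage_embs, temperature=0.05):
    # Score every query against every passage; logits[i, j] is the
    # similarity of query i to passage j, scaled by the temperature.
    logits = query_embs @ passage_embs.T / temperature
    # The matching passage for query i sits at index i (the diagonal);
    # all other passages in the batch act as negatives.
    targets = torch.arange(query_embs.size(0))
    return F.cross_entropy(logits, targets)
```

When query and passage embeddings line up pair by pair, the loss is near zero; mismatched pairs drive it up, which is what pushes matching queries and documents together during training.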

Usage (HuggingFace Transformers)

Using the model directly with HuggingFace Transformers requires adding a mean pooling operation over the token embeddings to obtain a single sentence embedding.

Python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('facebook/contriever-msmarco')
model = AutoModel.from_pretrained('facebook/contriever-msmarco')

sentences = [
    "Where was Marie Curie born?",
    "Maria Sklodowska, later known as Marie Curie, was born on November 7, 1867.",
    "Born in Paris on 15 May 1859, Pierre Curie was the son of Eugène Curie, a doctor of French Catholic origin from Alsace."
]

# Mean pooling: average the token embeddings, ignoring padding positions
def mean_pooling(token_embeddings, mask):
    token_embeddings = token_embeddings.masked_fill(~mask[..., None].bool(), 0.)
    sentence_embeddings = token_embeddings.sum(dim=1) / mask.sum(dim=1)[..., None]
    return sentence_embeddings

# Apply tokenizer
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings (no gradients needed at inference time)
with torch.no_grad():
    outputs = model(**inputs)

# Pool token embeddings into one sentence embedding per input
embeddings = mean_pooling(outputs[0], inputs['attention_mask'])
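Once sentence embeddings are computed, passages are ranked by the dot product between the query embedding and each passage embedding (in the snippet above, the first sentence is the query and the other two are candidate passages). A minimal sketch of this scoring step, using toy vectors in place of real model output:

```python
import torch

def rank_passages(query_emb, passage_embs):
    # Relevance score = dot product between the query embedding and
    # each passage embedding; higher means more relevant.
    scores = passage_embs @ query_emb
    order = torch.argsort(scores, descending=True)
    return scores, order

# Toy 4-dim embeddings standing in for mean-pooled model output
query = torch.tensor([1.0, 0.0, 1.0, 0.0])
passages = torch.tensor([
    [0.0, 1.0, 0.0, 1.0],   # orthogonal to the query -> score 0
    [1.0, 0.0, 1.0, 0.0],   # same direction as the query -> score 2
])
scores, order = rank_passages(query, passages)
```

With the real embeddings from the code above, `embeddings[1:] @ embeddings[0]` scores the two Curie passages against the question in the same way.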
Capabilities & Tags
transformers · pytorch · bert · feature-extraction · text-embeddings-inference · endpoints_compatible
Specifications

Category: Embedding
Access: API & Local
License: Open Source
Pricing: Open Source
Rating: 1.9
