
opensearch neural sparse encoding v2 distill

by opensearch-project

Open source · 670k downloads · 10 likes

1.3 (10 reviews) · Embedding · API & Local

About

The *OpenSearch Neural Sparse Encoding v2 Distill* model is a neural sparse encoder designed to improve search relevance while optimizing inference and retrieval efficiency. It transforms queries and documents into high-dimensional sparse vectors, where each non-zero dimension corresponds to a vocabulary token and its weight reflects that token's importance for search. Trained on a broad range of datasets, it excels in zero-shot search scenarios with no prior fine-tuning, outperforming earlier versions in both relevance and speed. The model is particularly well suited to search systems that require efficient indexing and precise retrieval, such as enterprise search engines or question-answering platforms. Its native integration with OpenSearch allows seamless use through dedicated APIs, while it remains flexible enough to run outside the cluster through frameworks such as HuggingFace. Its ability to identify semantic matches even when queries and documents share no lexical overlap sets it apart from traditional approaches.

Documentation

opensearch-neural-sparse-encoding-v2-distill

Select the model

The model should be selected considering search relevance, model inference, and retrieval efficiency (FLOPS). We benchmark the models' zero-shot performance on a subset of the BEIR benchmark: TrecCovid, NFCorpus, NQ, HotpotQA, FiQA, ArguAna, Touche, DBPedia, SCIDOCS, FEVER, Climate FEVER, SciFact, and Quora.

Overall, the v2 series of models has better search relevance, efficiency, and inference speed than the v1 series. The specific advantages and disadvantages may vary across datasets.

| Model | Inference-free for Retrieval | Model Parameters | AVG NDCG@10 | AVG FLOPS |
|---|---|---|---|---|
| opensearch-neural-sparse-encoding-v1 | | 133M | 0.524 | 11.4 |
| opensearch-neural-sparse-encoding-v2-distill | | 67M | 0.528 | 8.3 |
| opensearch-neural-sparse-encoding-doc-v1 | ✔️ | 133M | 0.490 | 2.3 |
| opensearch-neural-sparse-encoding-doc-v2-distill | ✔️ | 67M | 0.504 | 1.8 |
| opensearch-neural-sparse-encoding-doc-v2-mini | ✔️ | 23M | 0.497 | 1.7 |
| opensearch-neural-sparse-encoding-doc-v3-distill | ✔️ | 67M | 0.517 | 1.8 |
| opensearch-neural-sparse-encoding-doc-v3-gte | ✔️ | 133M | 0.546 | 1.7 |
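The trade-off in the benchmark table above can be made mechanical. The sketch below copies the NDCG@10 and FLOPS numbers from the table into a dict and picks the most relevant model under a FLOPS budget; the `pick_model` helper is purely illustrative, not part of any OpenSearch API.

```python
# Toy model picker over the benchmark table above: choose the model with the
# best zero-shot NDCG@10 whose average FLOPS stays under a budget.
# Numbers are copied from the table; the helper itself is illustrative.
BENCHMARK = {
    "opensearch-neural-sparse-encoding-v1": (0.524, 11.4),
    "opensearch-neural-sparse-encoding-v2-distill": (0.528, 8.3),
    "opensearch-neural-sparse-encoding-doc-v1": (0.490, 2.3),
    "opensearch-neural-sparse-encoding-doc-v2-distill": (0.504, 1.8),
    "opensearch-neural-sparse-encoding-doc-v2-mini": (0.497, 1.7),
    "opensearch-neural-sparse-encoding-doc-v3-distill": (0.517, 1.8),
    "opensearch-neural-sparse-encoding-doc-v3-gte": (0.546, 1.7),
}

def pick_model(max_flops: float) -> str:
    """Return the highest-NDCG@10 model whose AVG FLOPS <= max_flops."""
    candidates = {m: ndcg for m, (ndcg, flops) in BENCHMARK.items()
                  if flops <= max_flops}
    return max(candidates, key=candidates.get)

print(pick_model(2.0))
```

Note that by these averages the doc-v3-gte variant dominates on relevance even at the lowest FLOPS tier, which is why the table lists per-dataset scores further below: the best choice can differ per workload.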

Overview

  • Paper: Towards Competitive Search Relevance For Inference-Free Learned Sparse Retrievers
  • Fine-tuning sample: opensearch-sparse-model-tuning-sample

This is a learned sparse retrieval model. It encodes queries and documents into 30,522-dimensional sparse vectors. Each non-zero dimension corresponds to a token in the vocabulary, and its weight reflects the importance of that token.
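To make the representation concrete, a sparse vector can be viewed as a small mapping of (token, weight) pairs over the vocabulary, with relevance computed as a dot product over shared tokens. A minimal sketch, using made-up weights rather than real model outputs:

```python
# Toy illustration of learned sparse vectors: each non-zero entry is a
# vocabulary token with a learned importance weight. The weights below
# are invented for illustration, not produced by the model.
query_vec = {"weather": 2.2, "ny": 2.4, "now": 2.6}
doc_vec = {"rain": 2.1, "york": 2.9, "currently": 1.6, "ny": 1.7}

def sparse_dot(q: dict, d: dict) -> float:
    """Dot product restricted to tokens present in both sparse vectors."""
    return sum(w * d[t] for t, w in q.items() if t in d)

print(sparse_dot(query_vec, doc_vec))  # only "ny" overlaps: 2.4 * 1.7
```

In the real model, the expansion step also assigns weight to tokens that do not appear in the text (e.g. "nyc" for a document about New York), which is what allows matches without lexical overlap.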

The training datasets include MS MARCO, eli5_question_answer, squad_pairs, WikiAnswers, yahoo_answers_title_question, gooaq_pairs, stackexchange_duplicate_questions_body_body, wikihow, S2ORC_title_abstract, stackexchange_duplicate_questions_title-body_title-body, yahoo_answers_question_answer, searchQA_top5_snippets, stackexchange_duplicate_questions_title_title, and yahoo_answers_title_answer.

The OpenSearch neural sparse feature supports learned sparse retrieval on top of the Lucene inverted index. Link: https://opensearch.org/docs/latest/query-dsl/specialized/neural-sparse/. Indexing and search can be performed with the OpenSearch high-level API.
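For orientation, a cluster-side search with a deployed sparse model goes through the `neural_sparse` query DSL; the index name, field name, and model ID below are placeholders, and the index is assumed to already hold sparse embeddings in a rank-features field.

```json
GET /my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "What's the weather in ny now?",
        "model_id": "<deployed model id>"
      }
    }
  }
}
```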

Usage (Sentence Transformers)

First install the Sentence Transformers library:

Bash
pip install -U sentence-transformers

Then you can load this model and run inference.

Python
from sentence_transformers.sparse_encoder import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("opensearch-project/opensearch-neural-sparse-encoding-v2-distill")

query = "What's the weather in ny now?"
document = "Currently New York is rainy."

query_embed = model.encode_query(query)
document_embed = model.encode_document(document)

sim = model.similarity(query_embed, document_embed)
print(f"Similarity: {sim}")
# Similarity: tensor([[38.6113]])

decoded_query = model.decode(query_embed)
decoded_document = model.decode(document_embed)

for i in range(len(decoded_query)):
    query_token, query_score = decoded_query[i]
    doc_score = next((score for token, score in decoded_document if token == query_token), 0)
    if doc_score != 0:
        print(f"Token: {query_token}, Query score: {query_score:.4f}, Document score: {doc_score:.4f}")

# Token: york, Query score: 2.7273, Document score: 2.9088
# Token: now, Query score: 2.5734, Document score: 0.9208
# Token: ny, Query score: 2.3895, Document score: 1.7237
# Token: weather, Query score: 2.2184, Document score: 1.2368
# Token: current, Query score: 1.8693, Document score: 1.4146
# Token: today, Query score: 1.5888, Document score: 0.7450
# Token: sunny, Query score: 1.4704, Document score: 0.9247
# Token: nyc, Query score: 1.4374, Document score: 1.9737
# Token: currently, Query score: 1.4347, Document score: 1.6019
# Token: climate, Query score: 1.1605, Document score: 0.9794
# Token: upstate, Query score: 1.0944, Document score: 0.7141
# Token: forecast, Query score: 1.0471, Document score: 0.5519
# Token: verve, Query score: 0.9268, Document score: 0.6692
# Token: huh, Query score: 0.9126, Document score: 0.4486
# Token: greene, Query score: 0.8960, Document score: 0.7706
# Token: picturesque, Query score: 0.8779, Document score: 0.7120
# Token: pleasantly, Query score: 0.8471, Document score: 0.4183
# Token: windy, Query score: 0.8079, Document score: 0.2140
# Token: favorable, Query score: 0.7537, Document score: 0.4925
# Token: rain, Query score: 0.7519, Document score: 2.1456
# Token: skies, Query score: 0.7277, Document score: 0.3818
# Token: lena, Query score: 0.6995, Document score: 0.8593
# Token: sunshine, Query score: 0.6895, Document score: 0.2410
# Token: johnny, Query score: 0.6621, Document score: 0.3016
# Token: skyline, Query score: 0.6604, Document score: 0.1933
# Token: sasha, Query score: 0.6117, Document score: 0.2197
# Token: vibe, Query score: 0.5962, Document score: 0.0414
# Token: hardly, Query score: 0.5381, Document score: 0.7560
# Token: prevailing, Query score: 0.4583, Document score: 0.4243
# Token: unpredictable, Query score: 0.4539, Document score: 0.5073
# Token: presently, Query score: 0.4350, Document score: 0.8463
# Token: hail, Query score: 0.3674, Document score: 0.2496
# Token: shivered, Query score: 0.3324, Document score: 0.5506
# Token: wind, Query score: 0.3281, Document score: 0.1964
# Token: rudy, Query score: 0.3052, Document score: 0.5785
# Token: looming, Query score: 0.2797, Document score: 0.0357
# Token: atmospheric, Query score: 0.2712, Document score: 0.0870
# Token: vicky, Query score: 0.2471, Document score: 0.3490
# Token: sandy, Query score: 0.2247, Document score: 0.2383
# Token: crowded, Query score: 0.2154, Document score: 0.5737
# Token: chilly, Query score: 0.1723, Document score: 0.1857
# Token: blizzard, Query score: 0.1700, Document score: 0.4110
# Token: ##cken, Query score: 0.1183, Document score: 0.0613
# Token: unrest, Query score: 0.0923, Document score: 0.6363
# Token: russ, Query score: 0.0624, Document score: 0.2127
# Token: blackout, Query score: 0.0558, Document score: 0.5542
# Token: kahn, Query score: 0.0549, Document score: 0.1589
# Token: 2020, Query score: 0.0160, Document score: 0.0566
# Token: nighttime, Query score: 0.0125, Document score: 0.3753

Usage (HuggingFace)

This model is designed to run inside an OpenSearch cluster, but you can also use it outside the cluster with the HuggingFace models API.

Python
import itertools
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer


# get the sparse vector from dense model outputs (shape: batch_size * seq_len * vocab_size)
def get_sparse_vector(feature, output):
    values, _ = torch.max(output * feature["attention_mask"].unsqueeze(-1), dim=1)
    values = torch.log(1 + torch.relu(values))
    values[:, get_sparse_vector.special_token_ids] = 0  # zero out special tokens
    return values
    
# transform the sparse vector to a dict of (token, weight)
def transform_sparse_vector_to_dict(sparse_vector):
    sample_indices, token_indices = torch.nonzero(sparse_vector, as_tuple=True)
    non_zero_values = sparse_vector[(sample_indices, token_indices)].tolist()
    number_of_tokens_for_each_sample = torch.bincount(sample_indices).cpu().tolist()
    tokens = [transform_sparse_vector_to_dict.id_to_token[_id] for _id in token_indices.tolist()]

    output = []
    end_idxs = list(itertools.accumulate([0]+number_of_tokens_for_each_sample))
    for i in range(len(end_idxs)-1):
        token_strings = tokens[end_idxs[i]:end_idxs[i+1]]
        weights = non_zero_values[end_idxs[i]:end_idxs[i+1]]
        output.append(dict(zip(token_strings, weights)))
    return output
    

# load the model
model = AutoModelForMaskedLM.from_pretrained("opensearch-project/opensearch-neural-sparse-encoding-v2-distill")
tokenizer = AutoTokenizer.from_pretrained("opensearch-project/opensearch-neural-sparse-encoding-v2-distill")

# set the special tokens and id_to_token transform for post-process
special_token_ids = [tokenizer.vocab[token] for token in tokenizer.special_tokens_map.values()]
get_sparse_vector.special_token_ids = special_token_ids
id_to_token = ["" for i in range(tokenizer.vocab_size)]
for token, _id in tokenizer.vocab.items():
    id_to_token[_id] = token
transform_sparse_vector_to_dict.id_to_token = id_to_token



query = "What's the weather in ny now?"
document = "Currently New York is rainy."

# encode the query & document
feature = tokenizer([query, document], padding=True, truncation=True, return_tensors='pt')
output = model(**feature)[0]
sparse_vector = get_sparse_vector(feature, output)

# get similarity score
sim_score = torch.matmul(sparse_vector[0],sparse_vector[1])
print(sim_score)   # tensor(38.6112, grad_fn=<DotBackward0>)


query_token_weight, document_query_token_weight = transform_sparse_vector_to_dict(sparse_vector)
for token in sorted(query_token_weight, key=lambda x:query_token_weight[x], reverse=True):
    if token in document_query_token_weight:
        print("score in query: %.4f, score in document: %.4f, token: %s"%(query_token_weight[token],document_query_token_weight[token],token))
        

        
# result:
# score in query: 2.7273, score in document: 2.9088, token: york
# score in query: 2.5734, score in document: 0.9208, token: now
# score in query: 2.3895, score in document: 1.7237, token: ny
# score in query: 2.2184, score in document: 1.2368, token: weather
# score in query: 1.8693, score in document: 1.4146, token: current
# score in query: 1.5887, score in document: 0.7450, token: today
# score in query: 1.4704, score in document: 0.9247, token: sunny
# score in query: 1.4374, score in document: 1.9737, token: nyc
# score in query: 1.4347, score in document: 1.6019, token: currently
# score in query: 1.1605, score in document: 0.9794, token: climate
# score in query: 1.0944, score in document: 0.7141, token: upstate
# score in query: 1.0471, score in document: 0.5519, token: forecast
# score in query: 0.9268, score in document: 0.6692, token: verve
# score in query: 0.9126, score in document: 0.4486, token: huh
# score in query: 0.8960, score in document: 0.7706, token: greene
# score in query: 0.8779, score in document: 0.7120, token: picturesque
# score in query: 0.8471, score in document: 0.4183, token: pleasantly
# score in query: 0.8079, score in document: 0.2140, token: windy
# score in query: 0.7537, score in document: 0.4925, token: favorable
# score in query: 0.7519, score in document: 2.1456, token: rain
# score in query: 0.7277, score in document: 0.3818, token: skies
# score in query: 0.6995, score in document: 0.8593, token: lena
# score in query: 0.6895, score in document: 0.2410, token: sunshine
# score in query: 0.6621, score in document: 0.3016, token: johnny
# score in query: 0.6604, score in document: 0.1933, token: skyline
# score in query: 0.6117, score in document: 0.2197, token: sasha
# score in query: 0.5962, score in document: 0.0414, token: vibe
# score in query: 0.5381, score in document: 0.7560, token: hardly
# score in query: 0.4582, score in document: 0.4243, token: prevailing
# score in query: 0.4539, score in document: 0.5073, token: unpredictable
# score in query: 0.4350, score in document: 0.8463, token: presently
# score in query: 0.3674, score in document: 0.2496, token: hail
# score in query: 0.3324, score in document: 0.5506, token: shivered
# score in query: 0.3281, score in document: 0.1964, token: wind
# score in query: 0.3052, score in document: 0.5785, token: rudy
# score in query: 0.2797, score in document: 0.0357, token: looming
# score in query: 0.2712, score in document: 0.0870, token: atmospheric
# score in query: 0.2471, score in document: 0.3490, token: vicky
# score in query: 0.2247, score in document: 0.2383, token: sandy
# score in query: 0.2154, score in document: 0.5737, token: crowded
# score in query: 0.1723, score in document: 0.1857, token: chilly
# score in query: 0.1700, score in document: 0.4110, token: blizzard
# score in query: 0.1183, score in document: 0.0613, token: ##cken
# score in query: 0.0923, score in document: 0.6363, token: unrest
# score in query: 0.0624, score in document: 0.2127, token: russ
# score in query: 0.0558, score in document: 0.5542, token: blackout
# score in query: 0.0549, score in document: 0.1589, token: kahn
# score in query: 0.0160, score in document: 0.0566, token: 2020
# score in query: 0.0125, score in document: 0.3753, token: nighttime

The code sample above shows an example of neural sparse search. Although the original query and document share no overlapping surface tokens, the model still produces a good match.
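The pooling step inside `get_sparse_vector` — masking padding positions, max-pooling over the sequence axis, then applying log(1 + ReLU(·)) to compress the weights — can be checked on a toy tensor. The shapes and values below are made up for illustration:

```python
import torch

# Toy logits with shape (batch=1, seq_len=2, vocab=4); the second position
# is padding and is zeroed by the mask before max-pooling over the sequence.
output = torch.tensor([[[1.0, -2.0, 0.5, 3.0],
                        [9.0,  9.0, 9.0, 9.0]]])
attention_mask = torch.tensor([[1, 0]])  # second token is padding

values, _ = torch.max(output * attention_mask.unsqueeze(-1), dim=1)
values = torch.log(1 + torch.relu(values))  # negatives collapse to 0, positives are compressed

print(values)
```

The ReLU guarantees non-negative weights (so most vocabulary dimensions stay at exactly zero), and the log saturation keeps a few dominant tokens from overwhelming the dot-product score.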

Detailed Search Relevance

| Model | Average | Trec Covid | NFCorpus | NQ | HotpotQA | FiQA | ArguAna | Touche | DBPedia | SCIDOCS | FEVER | Climate FEVER | SciFact | Quora |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| opensearch-neural-sparse-encoding-v1 | 0.524 | 0.771 | 0.360 | 0.553 | 0.697 | 0.376 | 0.508 | 0.278 | 0.447 | 0.164 | 0.821 | 0.263 | 0.723 | 0.856 |
| opensearch-neural-sparse-encoding-v2-distill | 0.528 | 0.775 | 0.347 | 0.561 | 0.685 | 0.374 | 0.551 | 0.278 | 0.435 | 0.173 | 0.849 | 0.249 | 0.722 | 0.863 |
| opensearch-neural-sparse-encoding-doc-v1 | 0.490 | 0.707 | 0.352 | 0.521 | 0.677 | 0.344 | 0.461 | 0.294 | 0.412 | 0.154 | 0.743 | 0.202 | 0.716 | 0.788 |
| opensearch-neural-sparse-encoding-doc-v2-distill | 0.504 | 0.690 | 0.343 | 0.528 | 0.675 | 0.357 | 0.496 | 0.287 | 0.418 | 0.166 | 0.818 | 0.224 | 0.715 | 0.841 |
| opensearch-neural-sparse-encoding-doc-v2-mini | 0.497 | 0.709 | 0.336 | 0.510 | 0.666 | 0.338 | 0.480 | 0.285 | 0.407 | 0.164 | 0.812 | 0.216 | 0.699 | 0.837 |
| opensearch-neural-sparse-encoding-doc-v3-distill | 0.517 | 0.724 | 0.345 | 0.544 | 0.694 | 0.356 | 0.520 | 0.294 | 0.424 | 0.163 | 0.845 | 0.239 | 0.708 | 0.863 |
| opensearch-neural-sparse-encoding-doc-v3-gte | 0.546 | 0.734 | 0.360 | 0.582 | 0.716 | 0.407 | 0.520 | 0.389 | 0.455 | 0.167 | 0.860 | 0.312 | 0.725 | 0.873 |

License

This project is licensed under the Apache v2.0 License.

Copyright

Copyright OpenSearch Contributors. See NOTICE for details.
