by cnmoro
This model, *nomic-embed-text-v2-moe-distilled-high-quality*, is a distilled and optimized version of *nomic-embed-text-v2-moe* that produces high-quality text embeddings. It maps texts to dense 768-dimensional vectors that capture their semantics for tasks such as search, classification, and similarity comparison. The distillation process, trained on 23 million data triplets, preserves performance while reducing model complexity, making the result lighter and more efficient. Typical use cases include information retrieval, document analysis, and comparing textual content, where precise, contextualized representations are a major advantage. What sets it apart is its distillation method, which combines the *Model2Vec* approach with training on a large dataset, balancing performance and efficiency.
This Model2Vec model was created with Tokenlearn, using *nomic-embed-text-v2-moe* as the base model.
The output dimension is 768.
The evaluation in the model card was run with this distilled model, not the original.
This was not a simple Model2Vec distillation: the process involved generating embeddings for 23M triplets (MS MARCO) with the original model, then training the Tokenlearn model on them, with the Nomic model as a base.
Load this model with the `model2vec` library:

```python
from model2vec import StaticModel

model = StaticModel.from_pretrained("cnmoro/nomic-embed-text-v2-moe-distilled-high-quality")

# Compute text embeddings
embeddings = model.encode(["Example sentence"])
```
Or with the `sentence-transformers` library:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("cnmoro/nomic-embed-text-v2-moe-distilled-high-quality")

# Compute text embeddings
embeddings = model.encode(["Example sentence"])
```
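Once computed, the 768-dimensional embeddings can be compared with cosine similarity for search or similarity tasks. A minimal sketch using NumPy (the random vectors below are stand-ins for real `model.encode(...)` outputs):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in 768-dimensional vectors; in practice these would come from model.encode(...).
rng = np.random.default_rng(0)
query = rng.standard_normal(768)
doc = rng.standard_normal(768)

# Value lies in [-1, 1]; higher means more semantically similar
score = cosine_similarity(query, doc)
```

Ranking documents by this score against a query embedding is the basic retrieval loop these embeddings are designed for.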