by mixedbread-ai
mxbai-embed-xsmall-v1 is a lightweight embedding model that converts text into dense numerical vectors for semantic search, classification, and clustering. It captures contextual relationships between words rather than treating them in isolation, which makes the resulting vectors useful for applications such as document analysis, content recommendation, and search. Its compact size suits environments where speed and a small footprint matter.
The crispy sentence embedding family from Mixedbread.
🍞 Looking for a simple end-to-end retrieval solution? Meet Omni, our multimodal and multilingual model. Get in touch for access.
This model is an open-source English embedding model developed by Mixedbread. It's built upon sentence-transformers/all-MiniLM-L6-v2 and trained with the AnglE loss and Espresso. Read more details in our blog post.
In a bread loaf:
Our model supports both binary quantization and Matryoshka Representation Learning (MRL), allowing for significant efficiency gains:
These optimizations can lead to substantial reductions in infrastructure costs for cloud computing and vector databases. Read more here.
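To make the efficiency gains concrete, here is a minimal, dependency-free sketch of what the two techniques do to a single vector. It uses a random vector as a stand-in for a real 384-dimensional embedding, and the helper names (`truncate_and_normalize`, `binarize`, `hamming_distance`) are hypothetical. Thresholding at zero for the binary step is an assumption of this sketch, not a statement about how the model was trained.

```python
import math
import random


def truncate_and_normalize(vec, dims):
    """MRL-style truncation: keep the first `dims` components, rescale to unit length."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def binarize(vec):
    """Binary quantization: one bit per component (1 if positive), packed 8 bits per byte."""
    bits = [1 if x > 0 else 0 for x in vec]
    packed = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        chunk = chunk + [0] * (8 - len(chunk))  # pad the final byte if needed
        byte = 0
        for bit in chunk:
            byte = (byte << 1) | bit
        packed.append(byte)
    return bytes(packed)


def hamming_distance(a, b):
    """Number of differing bits between two packed codes of equal length."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


random.seed(0)
full = [random.gauss(0.0, 1.0) for _ in range(384)]  # stand-in, not a real embedding

small = truncate_and_normalize(full, 128)
code = binarize(full)

print(len(full) * 4, "bytes as float32")                     # 1536
print(len(small) * 4, "bytes after truncating to 128 dims")  # 512
print(len(code), "bytes after binary quantization")          # 48
```

Truncated vectors are still compared with cosine similarity, while packed binary codes are typically compared with Hamming distance, which is why the sketch includes both helpers.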
Here are several ways to produce English sentence embeddings using our model.
pip install -U angle-emb
from angle_emb import AnglE
from angle_emb.utils import cosine_similarity
# 1. Specify preferred dimensions
dimensions = 384
# 2. Load model and set pooling strategy to avg
model = AnglE.from_pretrained(
    "mixedbread-ai/mxbai-embed-xsmall-v1",
    pooling_strategy='avg').cuda()
query = 'A man is eating a piece of bread'
docs = [
    query,
    "A man is eating food.",
    "A man is eating pasta.",
    "The girl is carrying a baby.",
    "A man is riding a horse.",
]
# 3. Encode
embeddings = model.encode(docs, embedding_size=dimensions)
for doc, emb in zip(docs[1:], embeddings[1:]):
    print(f'{query} ||| {doc}', cosine_similarity(embeddings[0], emb))
python -m pip install -U sentence-transformers
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim
# 1. Specify preferred dimensions
dimensions = 384
# 2. Load model
model = SentenceTransformer("mixedbread-ai/mxbai-embed-xsmall-v1", truncate_dim=dimensions)
query = 'A man is eating a piece of bread'
docs = [
    query,
    "A man is eating food.",
    "A man is eating pasta.",
    "The girl is carrying a baby.",
    "A man is riding a horse.",
]
# 3. Encode
embeddings = model.encode(docs)
similarities = cos_sim(embeddings[0], embeddings[1:])
print('similarities:', similarities)
pip install -U transformers
from typing import Dict
import torch
import numpy as np
from transformers import AutoModel, AutoTokenizer
from sentence_transformers.util import cos_sim
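def pooling(outputs: torch.Tensor, inputs: Dict) -> np.ndarray:
    # Mean-pool token embeddings, dividing each sequence by its own
    # number of non-padding tokens rather than by the batch-wide total
    mask = inputs["attention_mask"][:, :, None]
    outputs = torch.sum(outputs * mask, dim=1) / torch.sum(mask, dim=1)
    return outputs.detach().cpu().numpy()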
# 1. Load model
model_id = 'mixedbread-ai/mxbai-embed-xsmall-v1'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id).cuda()
query = 'A man is eating a piece of bread'
docs = [
    query,
    "A man is eating food.",
    "A man is eating pasta.",
    "The girl is carrying a baby.",
    "A man is riding a horse.",
]
# 2. Encode
inputs = tokenizer(docs, padding=True, return_tensors='pt')
for k, v in inputs.items():
    inputs[k] = v.cuda()
outputs = model(**inputs).last_hidden_state
embeddings = pooling(outputs, inputs)
# 3. Compute similarity scores
similarities = cos_sim(embeddings[0], embeddings[1:])
print('similarities:', similarities)
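Masked mean pooling, the technique the `pooling` helper is built around, averages a sequence's token vectors over its own non-padding positions. The toy sketch below shows the idea on plain Python lists; the function name `masked_mean` and the numbers are illustrative assumptions, not part of the model's code.

```python
def masked_mean(token_vectors, attention_mask):
    """Average one sequence's token vectors over positions where the mask is 1."""
    dims = len(token_vectors[0])
    count = sum(attention_mask)
    totals = [0.0] * dims
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            for j, x in enumerate(vec):
                totals[j] += x
    # max(..., 1) avoids dividing by zero for an all-padding sequence
    return [t / max(count, 1) for t in totals]


# Two real tokens followed by one padding token whose values must be ignored
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(masked_mean(tokens, mask))  # [2.0, 3.0]
```

The padding vector contributes nothing to the result, which is the property that keeps embeddings independent of how much padding a batch happens to need.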
python -m pip install batched
import uvicorn
import batched
from fastapi import FastAPI
from fastapi.responses import ORJSONResponse
from sentence_transformers import SentenceTransformer
from pydantic import BaseModel
app = FastAPI()
model = SentenceTransformer('mixedbread-ai/mxbai-embed-xsmall-v1')
model.encode = batched.aio.dynamically(model.encode)
class EmbeddingsRequest(BaseModel):
    input: str | list[str]
@app.post("/embeddings")
async def embeddings(request: EmbeddingsRequest):
    return ORJSONResponse({"embeddings": await model.encode(request.input)})
if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
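For completeness, a client for the server above might look like the following sketch, which uses only the standard library. It assumes the server is reachable at `http://localhost:8000`; the helper names `build_request` and `embed` are hypothetical. The request body mirrors the `EmbeddingsRequest` model and the response is read from the `"embeddings"` key that the endpoint returns.

```python
import json
import urllib.request

# Assumed address: the server example binds port 8000 and exposes POST /embeddings.
DEFAULT_URL = "http://localhost:8000/embeddings"


def build_request(texts, url=DEFAULT_URL):
    """Build a POST request whose JSON body matches the EmbeddingsRequest model."""
    body = json.dumps({"input": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, url=DEFAULT_URL, timeout=30):
    """Send the texts to the server and return the list under the "embeddings" key."""
    with urllib.request.urlopen(build_request(texts, url), timeout=timeout) as resp:
        return json.loads(resp.read())["embeddings"]
```

With the server running, `embed(["A man is eating food.", "A man is eating pasta."])` should return one vector per input string. `input` may also be a single string, since the request model accepts both forms.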
Join our Discord community to share your feedback and thoughts. We're here to help and always happy to discuss the exciting field of machine learning!
Apache 2.0
@online{xsmall2024mxbai,
title={Every Byte Matters: Introducing mxbai-embed-xsmall-v1},
author={Sean Lee and Julius Lipp and Rui Huang and Darius Koenig},
year={2024},
url={https://www.mixedbread.ai/blog/mxbai-embed-xsmall-v1},
}