AI Explorer

Find and compare the best artificial intelligence tools for your projects.



gte small

by Xenova

Open source · 18k downloads · 23 likes

1.7 (23 reviews) · Embedding · API & Local

About
About

The "gte small" model is an optimized version of the GTE (General Text Embeddings) model, specifically designed to generate high-quality text embeddings. It converts text into dense numerical vectors, enabling tasks such as semantic search, classification, or document similarity comparison. Thanks to its compatibility with ONNX and Transformers.js, it is particularly well-suited for use in JavaScript environments, including web applications or tools requiring lightweight yet high-performance integration. This model stands out for its ability to strike a strong balance between accuracy and efficiency while remaining accessible for deployment in constrained environments. Its use cases include text analysis, data organization, and enhancing user experience through personalized recommendations.
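The semantic-search use case mentioned above boils down to embedding a query and all candidate documents, then ranking the documents by similarity to the query. A minimal sketch with hypothetical, toy precomputed embeddings (in practice each vector would be a 384-dimensional model output, as shown in the usage section):

```javascript
// Hypothetical precomputed, normalized embeddings (real ones would be 384-dim).
const docs = [
  { text: 'How to reset a password', embedding: [0.9, 0.1, 0.42] },
  { text: 'Best pasta recipes',      embedding: [0.1, 0.95, 0.3] },
];
const queryEmbedding = [0.88, 0.15, 0.45]; // toy embedding of "forgot my login"

// With normalized vectors, cosine similarity reduces to a dot product.
const dot = (a, b) => a.reduce((sum, x, i) => sum + x * b[i], 0);

// Rank documents by similarity to the query, best match first.
const ranked = docs
  .map(d => ({ text: d.text, score: dot(d.embedding, queryEmbedding) }))
  .sort((a, b) => b.score - a.score);

console.log(ranked[0].text); // 'How to reset a password'
```

The vectors here are made up purely for illustration; the ranking logic is the same regardless of embedding dimension.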

Documentation

This is https://huggingface.co/thenlper/gte-small with ONNX weights, making it compatible with Transformers.js.

Usage (Transformers.js)

If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

Bash
npm i @huggingface/transformers

You can then use the model to compute embeddings like this:

Js
import { pipeline, cos_sim } from '@huggingface/transformers';

// Create a feature-extraction pipeline
const extractor = await pipeline('feature-extraction', 'Xenova/gte-small');

// Compute sentence embeddings (mean-pooled and L2-normalized)
const sentences = ['That is a happy person', 'That is a very happy person'];
const output = await extractor(sentences, { pooling: 'mean', normalize: true });
console.log(output);
// Tensor {
//   dims: [ 2, 384 ],
//   type: 'float32',
//   data: Float32Array(768) [ -0.053555335849523544, 0.00843878649175167, ... ],
//   size: 768
// }

// Compute cosine similarity between the two embeddings
console.log(cos_sim(output[0].data, output[1].data));
// 0.9798319649182318

You can convert this Tensor to a nested JavaScript array using .tolist():

Js
console.log(output.tolist());
// [
//   [ -0.053555335849523544, 0.00843878649175167, 0.06234041228890419, ... ],
//   [ -0.049980051815509796, 0.03879701718688011, 0.07510733604431152, ... ]
// ]

By default, an 8-bit quantized version of the model is used, but you can choose to use the full-precision (fp32) version by specifying { dtype: 'fp32' } in the pipeline function:

Js
const extractor = await pipeline('feature-extraction', 'Xenova/gte-small', { 
    dtype: 'fp32'  // Options: "fp32", "fp16", "q8", "q4"
});

Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named onnx).
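A sketch of that conversion step using 🤗 Optimum's export CLI (assuming a pip-based environment; check the Optimum documentation for the current flags and extras):

```shell
# Install Optimum with ONNX export support
pip install "optimum[exporters]"

# Export the PyTorch checkpoint to ONNX; weights land in ./gte-small-onnx
optimum-cli export onnx --model thenlper/gte-small gte-small-onnx/
```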

Capabilities & Tags

transformers.js · onnx · bert · feature-extraction
Specifications

Category: Embedding
Access: API & Local
License: Open Source
Pricing: Open Source
Rating: 1.7
