by Xenova
This model, *paraphrase-multilingual-MiniLM-L12-v2*, generates vector representations (embeddings) of text in multiple languages, so that sentences or documents can be compared by meaning. It is well suited to paraphrase detection and multilingual information retrieval, and offers a lightweight alternative to larger models. Typical use cases include text similarity analysis, document classification, and multilingual content recommendation. Its main appeal is the balance between efficiency and accuracy offered by its compact MiniLM architecture.
This repository provides https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 with ONNX weights for compatibility with Transformers.js.
If you haven't already, you can install the Transformers.js JavaScript library from NPM using:
npm i @huggingface/transformers
Example: Run feature extraction.
import { pipeline } from '@huggingface/transformers';
const extractor = await pipeline('feature-extraction', 'Xenova/paraphrase-multilingual-MiniLM-L12-v2');
const output = await extractor('This is a simple test.');
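Example: Compare sentences by cosine similarity. This is a minimal sketch of one common way to use the embeddings, not part of the library itself. The helper names `cosineSimilarity` and `rankBySimilarity` are hypothetical and operate on plain arrays of numbers. The commented lines at the end assume the extractor is called with the `pooling: 'mean'` and `normalize: true` options, so that each sentence yields a single vector.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// (Hypothetical helper, written for illustration.)
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Assumption: treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank candidate vectors by similarity to a query vector, highest score first.
// Returns objects holding the candidate's original index and its score.
function rankBySimilarity(queryVec, candidateVecs) {
  return candidateVecs
    .map((vec, index) => ({ index, score: cosineSimilarity(queryVec, vec) }))
    .sort((x, y) => y.score - x.score);
}

// Assumed usage with the extractor created above (not executed here):
// const sentences = ['How are you?', 'Comment allez-vous ?', 'The weather is nice.'];
// const output = await extractor(sentences, { pooling: 'mean', normalize: true });
// const [query, ...candidates] = output.tolist();
// console.log(rankBySimilarity(query, candidates));
```

Because the model is multilingual, the same comparison can be applied to sentences written in different languages. With normalized embeddings the cosine similarity reduces to a dot product, but the helper above does not rely on that.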
Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named onnx).