by danielheinz
Open source · 160k downloads · 17 likes
This model, named "e5 base sts en de," is a fine-tuned version of the *multilingual-e5-base* model, optimized for scoring semantic similarity between English and German texts. It has been fine-tuned on German paraphrase and textual similarity corpora, so it can compare the meaning of texts across the two languages. Typical uses include multilingual information retrieval, paraphrase detection, and evaluating textual coherence. It reports scores exceeding 0.9 on benchmark datasets and is well suited to bilingual applications that need a fine-grained measure of how closely two texts are related.
INFO: The model is being continuously updated.
The model is a multilingual-e5-base model fine-tuned for semantic textual similarity (STS).
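As a rough illustration of how a similarity score could be obtained from such a model, the sketch below embeds two texts and compares them with cosine similarity. Several things here are assumptions rather than details from this card: the hub identifier `danielheinz/e5-base-sts-en-de` is inferred from the author and model name, loading through the `sentence-transformers` library is one plausible route, and the helper names `embed` and `similarity` are hypothetical. The base E5 models expect input prefixes such as `query: `; whether this fine-tuned variant needs them is not stated here, so the sketch passes the texts unchanged.

```python
import math
from typing import List, Sequence

# Assumption: hub identifier inferred from the author and model name on this card.
MODEL_ID = "danielheinz/e5-base-sts-en-de"


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embed(texts: List[str], model_id: str = MODEL_ID) -> List[List[float]]:
    """Hypothetical helper: encode texts with sentence-transformers.

    The import is done lazily so the pure-Python helper above can be used
    without the library installed. The first call downloads the model.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    # encode() returns a NumPy array by default; convert to plain lists.
    return model.encode(texts).tolist()


def similarity(text_a: str, text_b: str, model_id: str = MODEL_ID) -> float:
    """Hypothetical helper: cosine similarity between the embeddings of two texts."""
    vec_a, vec_b = embed([text_a, text_b], model_id)
    return cosine_similarity(vec_a, vec_b)
```

Calling `similarity("The weather is nice today.", "Das Wetter ist heute schön.")` would download the model on first use and return a score between -1 and 1, with higher values indicating closer meaning. Reloading the model on every call is wasteful, so a real application would load it once and reuse it.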
The model has been fine-tuned on the German subsets of the following datasets:
The training procedure can be divided into two stages:
The model achieves the following results: