by allenai
Open source
SPECTER is a language model designed to generate scalable vector representations (embeddings) of documents by leveraging the relationships between scientific publications in their citation graph. Unlike traditional models, it produces relevant embeddings without requiring task-specific fine-tuning, making it particularly effective for analyzing academic texts. Its primary use cases include recommending articles, classifying documents, and retrieving information from scientific corpora. What sets it apart is its use of citation context as a training signal, which captures richer semantic relationships between documents and thereby improves embedding quality over conventional methods.
SPECTER is a pre-trained language model for generating document-level embeddings. It is pre-trained on a powerful signal of document-level relatedness: the citation graph. Unlike existing pretrained language models, SPECTER can be applied directly to downstream applications without task-specific fine-tuning.
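A minimal sketch of embedding papers with SPECTER through the Hugging Face `transformers` library, assuming the `allenai/specter` checkpoint: the title and abstract are concatenated with the tokenizer's separator token, and the `[CLS]` token's final hidden state serves as the document embedding. The example papers below are illustrative placeholders.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load the SPECTER tokenizer and encoder (downloads the checkpoint on first use).
tokenizer = AutoTokenizer.from_pretrained("allenai/specter")
model = AutoModel.from_pretrained("allenai/specter")

# Illustrative papers; any title/abstract pairs work.
papers = [
    {"title": "BERT", "abstract": "We introduce a new language representation model."},
    {"title": "Attention Is All You Need", "abstract": "We propose the Transformer architecture."},
]

# Concatenate title and abstract with the separator token.
title_abs = [p["title"] + tokenizer.sep_token + p["abstract"] for p in papers]
inputs = tokenizer(title_abs, padding=True, truncation=True,
                   max_length=512, return_tensors="pt")

with torch.no_grad():
    output = model(**inputs)

# The [CLS] token's hidden state is the document-level embedding.
embeddings = output.last_hidden_state[:, 0, :]
print(embeddings.shape)  # one 768-dimensional vector per paper
```

Cosine similarity between these vectors can then drive recommendation or retrieval over a corpus.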
If you're coming here because you want to embed papers, SPECTER has now been superseded by SPECTER2. Use that instead.
Paper: SPECTER: Document-level Representation Learning using Citation-informed Transformers
Original Repo: Github
Evaluation Benchmark: SciDocs
Authors: Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld