Sentence Transformers

Compute, train, and rerank text embeddings in Python

github.com/huggingface/sentence-transformers | ★ 18.8k | sbert.net

Overview

Sentence Transformers is a Python framework for working with three kinds of models: embedding models (Sentence Transformers), rerankers (Cross Encoders), and sparse encoders. For embeddings, you load a pretrained model, call encode() on your texts, and get back vectors you can compare with a similarity function. It works with over 15,000 pretrained models on Hugging Face, including many from the MTEB leaderboard.

It is aimed at developers building semantic search, retrieval, and text-similarity features, as well as teams that want to train or finetune their own embedding and reranker models. The same library covers both the dense embeddings used to find candidate documents and the cross-encoder rerankers used to score those candidates more precisely.

Within the embeddings category, it is a widely used high-level toolkit for turning text into vectors and reranking results. It also supports sparse encoder models (such as SPLADE) for keyword-style sparse representations, so you can mix dense, sparse, and reranking stages in one pipeline.

What it does

Compute dense text embeddings with a one-line model.encode() call that returns numpy arrays
Score and rank query-passage pairs with CrossEncoder reranker models via predict() or rank()
Generate sparse embeddings with SparseEncoder models like SPLADE, including sparsity stats
Compare embeddings with the built-in model.similarity() helper
Access 15,000+ pretrained models on Hugging Face, covering 100+ languages
Train or finetune your own embedding, reranker, and sparse encoder models
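
To make the sparse-encoder item concrete: a SPLADE-style sparse embedding has one slot per vocabulary term, and almost all slots are zero, so it can be stored as a mapping from term index to weight. The sketch below is plain Python with made-up weights and a hypothetical 10-term vocabulary (real vocabularies are far larger). The helpers sparse_dot and sparsity_ratio are illustrative names, not library functions.

```python
# Toy sparse embeddings: {term index: weight}; only non-zero entries are stored.
# VOCAB_SIZE and all weights are hypothetical values chosen for illustration.
VOCAB_SIZE = 10

doc_vector = {1: 1.2, 4: 0.5, 7: 0.8}
query_vector = {4: 1.0, 7: 0.5, 9: 0.3}


def sparse_dot(a, b):
    """Dot product of two sparse vectors; only shared term indices contribute."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller mapping
    return sum(weight * b[index] for index, weight in a.items() if index in b)


def sparsity_ratio(vector, dim):
    """Fraction of dimensions that are zero."""
    return 1.0 - len(vector) / dim


score = sparse_dot(query_vector, doc_vector)
ratio = sparsity_ratio(doc_vector, VOCAB_SIZE)
print(f"score={score:.2f}, sparsity={ratio:.0%}")
```

Only terms 4 and 7 appear in both vectors, so only they contribute to the score. This is why sparse representations behave like weighted keyword matching.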

Getting started

Install the package, then load a pretrained model and encode some text.

Install

Install from PyPI. Python 3.10+, PyTorch 1.11.0+, and transformers v4.41.0+ are recommended.

pip install -U sentence-transformers

Compute embeddings

Load a Sentence Transformer model and encode a list of sentences into vectors.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# => (3, 384)

Compare similarity

Use the model's similarity helper to compare the embeddings.

similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6660, 0.1046],
#         [0.6660, 1.0000, 0.1411],
#         [0.1046, 0.1411, 1.0000]])
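
By default, model.similarity() scores each pair of embeddings with cosine similarity. For intuition, here is a minimal plain-Python sketch of that computation on made-up 3-dimensional vectors, so it runs without downloading a model. The vectors and the helper names cosine_similarity and similarity_matrix are assumptions for illustration, not real model output or the library's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """All-pairs cosine similarity, returned as a list of rows."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Hypothetical toy embeddings; real ones have hundreds of dimensions.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print([round(value, 4) for value in row])
```

As in the output above, the diagonal is 1 because every vector is identical to itself, and the matrix is symmetric.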

Rerank with a Cross Encoder

Load a reranker model and rank passages for a query without manual sorting.

from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2")

query = "How many people live in Berlin?"
passages = [
    "Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
    "Berlin has a yearly total of about 135 million day visitors, making it one of the most-visited cities in the European Union.",
    "In 2013 around 600,000 Berliners were registered in one of the more than 2,300 sport and fitness clubs.",
]
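ranks = model.rank(query, passages, return_documents=True)

# Each entry holds the passage's index in `passages`, its relevance score, and its text
print("Query:", query)
for rank in ranks:
    print(f"- #{rank['corpus_id']} ({rank['score']:.2f}): {rank['text']}")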
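
rank() returns the passages ordered by relevance score. Each entry carries the passage's original index (corpus_id), its score, and, with return_documents=True, its text. The sketch below mimics only that ordering step in plain Python, using made-up scores in place of model output. The helper rank_by_score and the score values are hypothetical.

```python
def rank_by_score(passages, scores, top_k=None):
    """Pair each passage with its score and sort from most to least relevant."""
    results = [
        {"corpus_id": index, "score": score, "text": passage}
        for index, (passage, score) in enumerate(zip(passages, scores))
    ]
    results.sort(key=lambda item: item["score"], reverse=True)
    return results if top_k is None else results[:top_k]


toy_passages = [
    "Berlin has about 3.5 million inhabitants.",
    "Berlin attracts many day visitors.",
    "Many Berliners belong to sports clubs.",
]
# Hypothetical relevance scores; a real reranker would compute these.
hypothetical_scores = [8.2, -3.5, -1.7]

ranking = rank_by_score(toy_passages, hypothetical_scores)
for entry in ranking:
    print(f"#{entry['corpus_id']} ({entry['score']:.2f}): {entry['text']}")
```

Keeping corpus_id in each entry lets you map a ranked result back to its position in the original list.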

Commands and code are distilled from the project's own documentation — always check the official repo for the latest.

When to use it

Build semantic search over a document collection by embedding the corpus and matching queries by similarity
Rerank an initial set of retrieved candidates with a cross-encoder for more accurate top results in a RAG pipeline
Measure semantic textual similarity or run paraphrase mining across large text sets
Train or finetune a custom embedding or reranker model for a domain-specific dataset
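
The first two use cases are often combined into a retrieve-then-rerank pipeline: a fast embedding comparison narrows the corpus to a few candidates, then a slower, more precise scorer reorders them. The sketch below shows that control flow in plain Python. The vectors are made up, and word_overlap is a deliberately crude, hypothetical stand-in for a cross-encoder. In practice, model.encode() and a CrossEncoder would supply the vectors and scores.

```python
import math
import re


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def retrieve(query_vec, corpus_vecs, top_k):
    """Stage 1: indices of the top_k corpus vectors most similar to the query."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(corpus_vecs)]
    scored.sort(reverse=True)
    return [idx for _, idx in scored[:top_k]]


def rerank(query, candidate_ids, corpus, score_fn):
    """Stage 2: reorder only the retrieved candidates with a pairwise scorer."""
    rescored = [(score_fn(query, corpus[idx]), idx) for idx in candidate_ids]
    rescored.sort(reverse=True)
    return [idx for _, idx in rescored]


def word_overlap(query, passage):
    """Hypothetical stand-in for a cross-encoder: count shared words."""
    tokenize = lambda text: set(re.findall(r"\w+", text.lower()))
    return len(tokenize(query) & tokenize(passage))


corpus = [
    "Berlin is the capital of Germany.",
    "Paris is the capital of France.",
    "Berlin has about 3.5 million inhabitants.",
]
# Made-up embeddings standing in for model.encode() output.
corpus_vecs = [[0.9, 0.1, 0.3], [0.1, 0.9, 0.3], [0.8, 0.0, 0.6]]
query = "How many inhabitants does Berlin have?"
query_vec = [0.9, 0.1, 0.2]

candidates = retrieve(query_vec, corpus_vecs, top_k=2)
reranked = rerank(query, candidates, corpus, word_overlap)
print("retrieved:", candidates)
print("reranked:", reranked)
```

In this toy data the reranker promotes the population passage above the one the embedding stage ranked first. The pairwise scorer runs only on the short candidate list, which keeps the expensive stage cheap.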

How Sentence Transformers compares

Sentence Transformers alongside the other open-source embedding models and inference tools that AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
Sentence Transformers	★ 18.8k	Compute, train, and rerank text embeddings in Python
EmbeddingGemma (Gemma)	★ 5.5k	Google DeepMind's Gemma repo, home to EmbeddingGemma, a 308M multilingual embedding model small enough to run on-device for RAG and semantic search.
Text Embeddings Inference (TEI)	★ 4.9k	Hugging Face's Rust-based server for deploying embedding, reranking, and sequence-classification models with high throughput on GPU or CPU.
Infinity (Embeddings)	★ 2.8k	A high-throughput serving engine for text embeddings, rerankers, CLIP, and ColPali models, exposing an OpenAI-compatible API.
ColPali	★ 2.7k	A vision-language embedding model that indexes whole document page images for retrieval, avoiding the need to parse PDFs into text first.
Model2Vec	★ 2.1k	A tool that distills any sentence transformer into a tiny, fast static embedding model (the Potion models) that runs on CPU without a neural network at inference.
Instructor Embedding	★ 2k	Instruction-tuned text embedding models that let you tailor embeddings to a task by prepending a natural-language instruction.
Qwen3-Embedding	★ 2k	Alibaba's open embedding and reranking models built on the Qwen3 base, available in 0.6B/4B/8B sizes and covering over 100 languages.
