AI/TLDR

AnglE

Train and run BERT- and LLM-based sentence embeddings with angle-optimized objectives

Overview

AnglE (angle_emb) is a Python library for training and running sentence embedding models. It comes from the paper "AnglE: Angle-optimized Text Embeddings" and lets you train BERT- or LLM-based embedding models, then use the same library to run inference over a range of transformer backbones.

It is aimed at machine learning engineers and NLP developers who need text embeddings for tasks like semantic similarity, retrieval, and clustering. You can load ready-made models from Hugging Face (such as WhereIsAI/UAE-Large-V1) or fine-tune your own with the included loss functions.

AnglE covers both ends of the embedding workflow: a training framework with several objectives (AnglE, CoSENT, contrastive, and Espresso/2DMSE losses) and an inference framework that supports BERT-family models, LLM-based models via LoRA, and bidirectional LLMs.

What it does

  • Angle-optimized (AnglE) loss plus CoSENT, contrastive, and Espresso/2DMSE loss options for training embeddings
  • Works with BERT-family backbones (BERT, RoBERTa, ModernBERT) and LLM backbones (LLaMA, Mistral, Qwen)
  • Bidirectional LLM embeddings via BiLLM (apply_billm=True)
  • Loads pretrained models from Hugging Face, including UAE-Large-V1 and pubmed-angle models
  • Single-GPU and multi-GPU training support
  • Configurable pooling strategies (cls, last token) and optional prompts for retrieval tasks
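
Of the objectives listed above, CoSENT is compact enough to sketch directly. The version below is a plain-Python illustration of the ranking idea, not angle_emb's implementation: for every two sentence pairs where the first has the higher gold label, it penalizes the second for having the higher cosine score. The function name cosent_loss and the default scale of 20 are assumptions chosen for illustration.

```python
import math

def cosent_loss(cos_scores, labels, scale=20.0):
    """Illustrative CoSENT-style ranking loss over a batch of sentence pairs.

    cos_scores[i] is the predicted cosine similarity of pair i and labels[i]
    is its gold similarity. `scale` (an assumed value) sharpens the penalty.
    """
    total = 0.0
    for cos_hi, label_hi in zip(cos_scores, labels):
        for cos_lo, label_lo in zip(cos_scores, labels):
            if label_hi > label_lo:
                # Pair "hi" should score above pair "lo"; the term grows
                # quickly when the predicted order is reversed.
                total += math.exp(scale * (cos_lo - cos_hi))
    return math.log1p(total)

# Toy batch: pair 0 is labeled similar (1), pair 1 dissimilar (0).
well_ordered = cosent_loss([0.9, 0.1], labels=[1, 0])
mis_ordered = cosent_loss([0.1, 0.9], labels=[1, 0])
print(well_ordered < mis_ordered)
```

The loss depends only on the ordering of cosine scores relative to the labels, which is why it suits graded similarity data as well as binary labels.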

Getting started

Install the angle-emb package, then load a pretrained model and encode text into vectors. A CUDA-capable GPU is expected for the .cuda() calls below.

Install angle-emb

Install from PyPI with pip (shown below) or uv.

bash
pip install -U angle-emb

Encode text with a BERT-based model

Load a pretrained model, encode documents, and compare them with cosine similarity.

python
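from angle_emb import AnglE
from angle_emb.utils import cosine_similarity

# Load the pretrained checkpoint with CLS pooling and move it to the GPU.
angle = AnglE.from_pretrained('WhereIsAI/UAE-Large-V1', pooling_strategy='cls').cuda()

# Encode each sentence into one embedding vector.
doc_vecs = angle.encode([
    'The weather is great!',
    'The weather is very good!',
    'i am going to bed'
])

# Print the similarity of every unordered pair of sentences.
for i, dv1 in enumerate(doc_vecs):
    for dv2 in doc_vecs[i+1:]:
        print(cosine_similarity(dv1, dv2))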
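
For reference, the score printed above is the cosine of the angle between two embedding vectors. The dependency-free sketch below shows the calculation on toy vectors. The helper cosine is hypothetical and is not angle_emb.utils.cosine_similarity, whose internals may differ.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

print(cosine([1.0, 0.0], [1.0, 0.0]))  # vectors pointing the same way
print(cosine([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

Because the score depends only on direction, scaling either vector leaves it unchanged.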

Use prompts for retrieval

For retrieval tasks, encode the query with a prompt and the documents without one. List available prompts with Prompts.list_prompts().

python
from angle_emb import AnglE, Prompts
from angle_emb.utils import cosine_similarity

# Load the pretrained checkpoint with CLS pooling and move it to the GPU.
angle = AnglE.from_pretrained('WhereIsAI/UAE-Large-V1', pooling_strategy='cls').cuda()

# Encode the query with the retrieval prompt.
qv = angle.encode(['what is the weather?'], to_numpy=True, prompt=Prompts.C)
# Encode the documents without a prompt.
doc_vecs = angle.encode([
    'The weather is great!',
    'it is rainy today.',
    'i am going to bed'
], to_numpy=True)

# Score each document against the query.
for dv in doc_vecs:
    print(cosine_similarity(qv[0], dv))
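
The loop above prints one score per document. A retrieval pipeline would usually sort documents by that score and keep the top few. The sketch below does this on made-up 2-D vectors standing in for the output of angle.encode. rank_by_cosine is a hypothetical helper, not part of angle_emb.

```python
import math

def rank_by_cosine(query_vec, doc_vecs, top_k=None):
    """Return (doc_index, score) pairs sorted by descending cosine similarity."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    scored = [(i, cos(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]  # top_k=None keeps every document

# Toy vectors (assumed data, not real embeddings).
query = [1.0, 0.0]
docs = [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5]]

for index, score in rank_by_cosine(query, docs, top_k=2):
    print(index, round(score, 3))
```

For large collections, an approximate nearest-neighbor index would normally replace this exhaustive scan.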

Commands and code are distilled from the project's own documentation — always check the official repo for the latest.

When to use it

  • Building semantic search or retrieval by embedding queries and documents and ranking by cosine similarity
  • Measuring semantic textual similarity between sentence pairs
  • Fine-tuning a custom embedding model on your own data with AnglE, CoSENT, or contrastive loss
  • Generating embeddings for clustering, deduplication, or RAG pipelines using a pretrained model like UAE-Large-V1

How AnglE compares

AnglE compared with the other open-source embedding model and inference tools that AI/TLDR tracks, ranked by GitHub stars.

Tool | Stars | What it does
Sentence Transformers | ★ 18.8k | The standard Python framework for loading, training, and computing embeddings with sentence and reranking models.
EmbeddingGemma (Gemma) | ★ 5.5k | Google DeepMind's Gemma repo, home to EmbeddingGemma, a 308M multilingual embedding model small enough to run on-device for RAG and semantic search.
Text Embeddings Inference (TEI) | ★ 4.9k | Hugging Face's Rust-based server for deploying embedding, reranking, and sequence-classification models with high throughput on GPU or CPU.
Infinity (Embeddings) | ★ 2.8k | A high-throughput serving engine for text embeddings, rerankers, CLIP, and ColPali models, exposing an OpenAI-compatible API.
ColPali | ★ 2.7k | A vision-language embedding model that indexes whole document page images for retrieval, avoiding the need to parse PDFs into text first.
Model2Vec | ★ 2.1k | A tool that distills any sentence transformer into a tiny, fast static embedding model (the Potion models) that runs on CPU without a neural network at inference.
Instructor Embedding | ★ 2k | Instruction-tuned text embedding models that let you tailor embeddings to a task by prepending a natural-language instruction.
AnglE | ★ 571 | Train and run BERT- and LLM-based sentence embeddings with angle-optimized objectives