PipeANN is a low-latency, billion-scale, and updatable graph-based vector store on SSD.
| Feature | Description |
|---|---|
| Ultra-Low Latency | <1ms for 1 billion vectors (top-10, 90% recall), only 1.14x-2.02x of an in-memory index |
| High Throughput | 20K QPS for 1 billion vectors, outperforming DiskANN and SPANN |
| Efficient Updates | Insert/delete with minimal search interference (1.07x fluctuation) |
| User-Defined Filters | Supports arbitrary filtered ANNS via user-defined Label and Selector |
| Memory Efficient | >10x less memory than in-memory indexes (~40GB for 1B vectors) |
| Easy to Use | Both Python (faiss-like) and C++ interfaces supported |
PipeANN is suitable for both large-scale and memory-constrained scenarios.
| Dataset | Dimension | Memory | Latency | QPS | PipeANN | HNSW | DiskANN |
|---|---|---|---|---|---|---|---|
| 1B (SPACEV) | 100 | 40GB | 2ms | 5K | ✅ | ❌ 1TB | ❌ 6ms |
| 80M (Wiki) | 768 | 10GB | 1.5ms | 5K | ✅ | ❌ 300GB | ❌ 4ms |
| 10M (SIFT) | 128 | 550MB | <1ms | 10K | ✅ | ❌ 4GB | ❌ 3ms |
Recall@10 = 0.99; Samsung PM9A3 SSD; 32-byte PQ-compressed vectors (128 bytes for Wiki).
- Dec 4, 2025: Inner product and filtered ANNS (arbitrary filter) supported
- Oct 14, 2025: RaBitQ (1-bit quantization) supported
- Sep 29, 2025: Python interface released
- Jul 16, 2025: Vector update (insert/delete) supported
| Component | Requirement |
|---|---|
| CPU | x86 or ARM with SIMD (AVX2/AVX512 recommended) |
| DRAM | ~40GB (search) or ~90GB (search+update) per billion vectors |
| SSD | ~700GB for 1B SIFT, ~900GB for 1.4B SPACEV |
- OS: Linux with `io_uring` (fast) or `libaio` (compatible) support; Ubuntu 22.04 recommended
- Compiler: C++17 support required
- Constraints: <2B vectors to avoid integer overflow
```shell
# Ubuntu >= 22.04
sudo apt install make cmake g++ libaio-dev libgoogle-perftools-dev \
    clang-format libmkl-full-dev libeigen3-dev

# For Python interface
pip3 install "pybind11[global]"
```

`libmkl` can be replaced by other BLAS libraries (e.g., OpenBLAS).
```shell
cd third_party/liburing
./configure && make -j
cd ../..
```

For the Python interface:

```shell
python setup.py install   # installs the `pipeann` wheel
```

For the C++ interface:

```shell
bash ./build.sh   # binaries in build/
```

For performance-critical scenarios, we recommend the C++ interface.
```python
from pipeann import IndexPipeANN, Metric

# Create index
idx = IndexPipeANN(data_dim=128, data_type='float32', metric=Metric.L2)
idx.omp_set_num_threads(32)         # number of search/insert/delete threads
idx.set_index_prefix(index_prefix)  # the index is stored to {index_prefix}_disk.index

# Insert vectors in memory (auto-converts to a disk index when >100K vectors)
idx.add(vectors, tags)
# For an SSD index initialized via idx.add, the out-neighbor count is fixed to 64.
# For large-scale datasets (>= 10M vectors), we recommend idx.build for initialization:
# idx.build(data_path, index_prefix)
# idx.load(index_prefix)  # load a pre-built index from disk

# Search using PipeSearch (on-SSD) or best-first search (in-memory)
results = idx.search(queries, topk=10, L=50)

idx.remove(tags)  # remove the vectors with the corresponding tags

# The index should be saved after updates.
idx.save(index_prefix)
```

Run an example (the hard-coded paths should be modified first):
```shell
cd tests_py
# Please modify the hard-coded paths first!
python index_example.py
```

It runs like this:
```
# Insert the first 100K vectors using the in-memory index.
[index.cpp:68:INFO] Getting distance function for metric: l2
Building index with prefix /mnt/nvme/indices/bigann/1M...
# ...
Inserting the first 1M points 100000 to 110000 ...
# Transform the in-memory index to an SSD index.
[pyindex.h:100:INFO] Transform memory index to disk index.
# ...
[pyindex.h:109:INFO] Transform memory index to disk index done.
# Insert the remaining 900K vectors, save, and reload the SSD index.
Inserting the first 1M points 110000 to 120000 ...
# ...
[ssd_index.cpp:206:INFO] SSDIndex loaded successfully.
# The first search in the SIFT1M dataset.
Searching for 10 nearest neighbors with L=10...
Search time: 0.6290 seconds for 10000 queries, throughput: 15897.957218870273 QPS.
Recall@10 with L=10: 0.7397
# ...
Searching for 10 nearest neighbors with L=50...
Search time: 0.8746 seconds for 10000 queries, throughput: 11433.789824882691 QPS.
Recall@10 with L=50: 0.9784
# Insert the second 1M vectors, save, and reload.
Inserting 1M new vectors to the index ...
# ...
[ssd_index.cpp:206:INFO] SSDIndex loaded successfully.
# The second search in the SIFT2M dataset.
Searching for 10 nearest neighbors with L=10...
Search time: 0.6461 seconds for 10000 queries, throughput: 15477.096553625139 QPS.
Recall@10 with L=10: 0.7181
# ...
Searching for 10 nearest neighbors with L=50...
Search time: 0.8907 seconds for 10000 queries, throughput: 11227.508131590563 QPS.
Recall@10 with L=50: 0.9720
```

For search-only scenarios, enable `-DREAD_ONLY_TESTS` and `-DNO_MAPPING` in CMakeLists.txt. This disables updates but achieves higher search performance.
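The Recall@10 numbers above compare the returned neighbor IDs against ground truth. A minimal sketch of how such a recall metric can be computed (this helper is illustrative, not part of PipeANN):

```python
def recall_at_k(results, ground_truth, k=10):
    """Average fraction of the true top-k neighbors present in the returned top-k."""
    hits = sum(len(set(res[:k]) & set(gt[:k]))
               for res, gt in zip(results, ground_truth))
    return hits / (len(results) * k)

# Example: 2 of 3 true neighbors found for the first query, 3 of 3 for the second.
print(recall_at_k([[1, 2, 3], [7, 8, 9]], [[1, 2, 4], [9, 8, 7]], k=3))  # 5/6
```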
For DiskANN users (existing on-disk index):
```shell
# Build in-memory entry point index (~10min for 1B vectors)
export INDEX_PREFIX=/mnt/nvme2/indices/bigann/100m   # on-disk index filename is 100m_disk.index
export DATA_PATH=/mnt/nvme/data/bigann/100M.bbin
build/tests/utils/gen_random_slice uint8 ${DATA_PATH} ${INDEX_PREFIX}_SAMPLE_RATE_0.01 0.01
build/tests/build_memory_index uint8 ${INDEX_PREFIX}_SAMPLE_RATE_0.01_data.bin \
    ${INDEX_PREFIX}_SAMPLE_RATE_0.01_ids.bin ${INDEX_PREFIX}_mem.index 32 64 1.2 $(nproc) l2

# Search with PipeANN
build/tests/search_disk_index uint8 ${INDEX_PREFIX} 1 32 query.bin gt.bin 10 l2 pq 2 10 10 20 30 40
```

Example results:
```
Search parameters: #threads: 1, beamwidth: 32
... some output during index loading ...
[search_disk_index.cpp:216:INFO] Use two ANNS for warming up...
[search_disk_index.cpp:219:INFO] Warming up finished.
   L  I/O Width       QPS   Mean Lat    P99 Lat  Mean Hops   Mean IOs  Recall@10
=========================================================================================
  10         32   1952.03     490.99    3346.00       0.00      22.28      67.11
  20         32   1717.53     547.84    1093.00       0.00      31.11      84.53
  30         32   1538.67     608.31    1231.00       0.00      41.02      91.04
  40         32   1420.46     655.24    1270.00       0.00      52.50      94.23
```
Starting from scratch:
1. Download datasets: SIFT, DEEP1B, SPACEV. If the links are unavailable, you can get the datasets from Big ANN Benchmarks.

SPACEV1B may comprise several sub-files. To concatenate them, save the dataset's NumPy array in bin format (the following Python code may be used).
```python
import struct

import numpy as np

# bin format:
# | 4 bytes: num_vecs | 4 bytes: vector dimension (e.g., 100 for SPACEV) | flattened vectors |
def bin_write(vectors, filename):
    with open(filename, 'wb') as f:
        num_vecs, vector_dim = vectors.shape
        f.write(struct.pack('<i', num_vecs))
        f.write(struct.pack('<i', vector_dim))
        f.write(vectors.tobytes())

def bin_read(filename):
    with open(filename, 'rb') as f:
        num_vecs = struct.unpack('<i', f.read(4))[0]
        vector_dim = struct.unpack('<i', f.read(4))[0]
        data = f.read(num_vecs * vector_dim * 4)  # 4 bytes per float
        vectors = np.frombuffer(data, dtype=np.float32).reshape((num_vecs, vector_dim))
    return vectors
```

The dataset should contain a ground-truth file for its full set.
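For example, sub-files could be concatenated with NumPy and written in the same bin format. A sketch using random stand-in data (real SPACEV sub-files would be loaded first; the file name is made up):

```python
import os
import struct
import tempfile

import numpy as np

# Stand-ins for loaded SPACEV sub-files (the real arrays would be read from disk).
parts = [np.random.rand(3, 4).astype(np.float32),
         np.random.rand(2, 4).astype(np.float32)]
full = np.concatenate(parts, axis=0)  # shape (5, 4)

# Write in bin format: num_vecs, dimension, then the flattened vectors.
path = os.path.join(tempfile.mkdtemp(), 'spacev.bin')
with open(path, 'wb') as f:
    num_vecs, dim = full.shape
    f.write(struct.pack('<i', num_vecs))
    f.write(struct.pack('<i', dim))
    f.write(full.tobytes())
```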
Some datasets also contain ground truth for subsets (e.g., `idx_100M.ivecs` of the SIFT1B dataset).
2. Convert format (if needed):
```shell
# convert .vecs to .bin
build/tests/utils/vecs_to_bin uint8 bigann_base.bvecs bigann.bin    # for int8/uint8 vecs (SIFT)
build/tests/utils/vecs_to_bin float base.fvecs deep.bin             # for float vecs (DEEP)
build/tests/utils/vecs_to_bin int32 idx_1000M.ivecs idx_1000M.ibin  # for int32/uint32 vecs (ground truth)

# Generate 100M subsets (e.g., for SIFT and DEEP).
build/tests/utils/change_pts uint8 bigann.bin 100000000   # bigann.bin -> bigann.bin100000000
mv bigann.bin100000000 bigann_100M.bin
build/tests/utils/change_pts float deep.bin 100000000     # deep.bin -> deep.bin100000000
mv deep.bin100000000 deep_100M.bin

# Calculate ground truth for 100M subsets (SIFT100M example)
# compute_groundtruth <type> <metric> <data> <query> <topk> <output> null null
build/tests/utils/compute_groundtruth uint8 l2 bigann_100M.bin query.bin 1000 100M_gt.bin null null
```

3. Build on-disk index:
```shell
# build_disk_index <type> <data> <prefix> <R> <L> <PQ_bytes> <M_GB> <threads> <metric> <nbr_type>
build/tests/build_disk_index uint8 data.bin index 96 128 32 256 112 l2 pq
```

Parameter explanation:
- `R`: maximum out-neighbors.
- `L`: candidate pool size during build.
- `PQ_bytes`: bytes per PQ-compressed vector (32 recommended; use a larger value if accuracy is low).
- `M`: maximum memory (GB).
- `nbr_type`: `pq` (product quantization, supports updates) or `rabitq` (1-bit quantization, search-only).
Recommended Parameters:
| Dataset | Type | R | L | PQ_bytes | Memory | Threads |
|---|---|---|---|---|---|---|
| 100M subsets | uint8/float/int8 | 96 | 128 | 32 | 256GB | 112 |
| SIFT1B | uint8 | 128 | 200 | 32 | 500GB | 112 |
| SPACEV1B | int8 | 128 | 200 | 32 | 500GB | 112 |
This requires ~5h for 100M-scale datasets, and ~1d for billion-scale datasets.
4. Build in-memory index (optional but recommended):
An in-memory index optimizes entry-point selection. Skip it by setting `mem_L=0` during search.
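Conceptually, the small sampled index is searched first, and its nearest sampled IDs seed the on-disk graph traversal. A brute-force sketch of that idea (function and variable names are ours, not PipeANN's API):

```python
import numpy as np

def entry_points(sample_vecs, sample_ids, query, mem_L=10):
    """Pick mem_L starting nodes for the SSD search from a sampled subset."""
    dists = np.linalg.norm(sample_vecs - query, axis=1)  # L2 distance to each sample
    order = np.argsort(dists)[:mem_L]                    # mem_L closest samples
    return sample_ids[order]                             # their IDs seed the SSD search

# mem_L = 0 would skip this step and fall back to the default entry point.
```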
```shell
build/tests/utils/gen_random_slice uint8 data.bin index_SAMPLE_RATE_0.01 0.01
build/tests/build_memory_index uint8 index_SAMPLE_RATE_0.01_data.bin \
    index_SAMPLE_RATE_0.01_ids.bin index_mem.index 32 64 1.2 $(nproc) l2
```

The output in-memory index resides in three files: `index_mem.index`, `index_mem.index.data`, and `index_mem.index.tags`.
5. Search:
```shell
# search_disk_index <type> <prefix> <threads> <beam_width> <query> <gt> <topk> <metric> <nbr_type> <mode> <mem_L> <Ls...>
build/tests/search_disk_index uint8 index_prefix 1 32 query.bin gt.bin 10 l2 pq 2 10 10 20 30 40
```

Search modes (`mode`):
- `0` (DiskANN): best-first search.
- `1` (Starling): page-reordered search; requires an index reordered with the original Starling code, plus `build/tests/pad_partition` to align the generated partition file.
- `2` (PipeANN): pipelined search (recommended).
- `3` (CoroSearch): coroutine-based inter-query parallel search.
Disable the `-DREAD_ONLY_TESTS` and `-DNO_MAPPING` flags in CMakeLists.txt for update support.
1. Prepare Tags (Optional)
Each vector corresponds to one tag. PipeANN uses the identity mapping (ID -> tag) by default. Use `gen_tags` to generate an explicit mapping (necessary for FreshDiskANN).
```shell
# gen_tags <type> <data> <output_prefix>
build/tests/utils/gen_tags uint8 data.bin index_prefix
```

2. Generate Ground Truths for Updates
Calculating exact ground truth after every insertion step is costly. Instead, we use an approximation: for each interval, select the top-10 vectors from the precomputed top-1000 (or more) of the whole dataset (or a larger subset).
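The selection idea behind `gt_update` can be sketched as follows: from each query's precomputed neighbor IDs (sorted by distance), keep the first top-k that fall inside the currently live ID range. This helper is illustrative only:

```python
import numpy as np

def gt_for_step(top_ids, live_lo, live_hi, topk=10):
    """top_ids: per-query neighbor IDs sorted by distance (e.g., top-1000).
    Keep the first `topk` IDs currently in the index, i.e., in [live_lo, live_hi)."""
    out = []
    for ids in top_ids:
        mask = (ids >= live_lo) & (ids < live_hi)
        out.append(ids[mask][:topk])  # assumes >= topk survivors per query
    return np.stack(out)
```

For insert-only workloads the live range grows as `[0, index_pts + step * batch_pts)`; with deletions it slides forward instead.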
```shell
# gt_update <gt_file> <index_pts> <total_pts> <batch_pts> <topk> <output_dir> <insert_only>
# Example: insert 100M vectors (batch = 1M) into a 100M index.
# truth.bin contains the top-1000 for the 200M dataset.
build/tests/utils/gt_update truth.bin 100000000 200000000 1000000 10 /path/to/gt 1

# Example: insert 100M vectors and delete the original 100M vectors.
build/tests/utils/gt_update truth.bin 100000000 200000000 1000000 10 /path/to/gt 0
```

3. Run Benchmarks
Search-Insert Workload (test_insert_search):
Inserts vectors while concurrently searching.
```shell
# Usage: test_insert_search <type> <data> <L_disk> <step_size> <steps> <ins_thds> <srch_thds> <mode> ...
build/tests/test_insert_search uint8 data_200M.bin 128 1000000 100 10 32 2 \
    index_prefix query.bin /path/to/gt 0 10 4 32 10 20 30 40 50
```

Search-Insert-Delete Workload (overall_performance):
Inserts new vectors and deletes old ones (sliding window).
```shell
# Usage: overall_performance <type> <data> <L_disk> <index> <query> <gt> <recall> <beam> <steps> <Ls...>
build/tests/overall_performance uint8 data_200M.bin 128 index_prefix query.bin \
    /path/to/gt 10 4 100 20 30
```

Notes:
- The index is not crash-consistent after updates; use `final_merge` for consistent snapshots.
- For update workloads, use `search_mode=2` (PipeANN) with `search_beam_width=32` for the best performance.
- The in-memory index is immutable during updates, but still useful for entry-point optimization.
PipeANN supports loading the entire SSD index into DRAM to use as an in-memory baseline (e.g., Vamana).
Search-Only (search_disk_index_mem)
Usage is identical to search_disk_index, but loads the index to memory first.
```shell
# search_disk_index_mem <type> <prefix> <threads> <beam_width> <query> <gt> <topk> <metric> <nbr_type> <mode> <mem_L> <Ls...>
build/tests/search_disk_index_mem uint8 index_prefix 1 32 query.bin gt.bin 10 l2 pq 2 10 10 20 30 40
```

Search-Insert-Delete (overall_perf_mem)
Usage is identical to overall_performance, but operates entirely in memory.
```shell
# overall_perf_mem <type> <data> <L_disk> <index> <query> <gt> <recall> <beam> <steps> <Ls...>
build/tests/overall_perf_mem uint8 data_200M.bin 128 index_prefix query.bin \
    /path/to/gt 10 4 100 20 30
```

PipeANN supports filtered search using post-filtering.
To achieve this, two new classes are introduced:
- The `AbstractLabel` class stores the labels of each vector, used for filtering.
- The `AbstractSelector` class filters the labels given a query label and a target label (as well as the target ID), using its `is_member` function.

We implement some example Labels and Selectors, including an spmat Label and (range-)filtered Selectors. If an arbitrary label or selector is required, you can implement it by deriving from the abstract classes.
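As a rough illustration of post-filtering, the sketch below over-fetches candidates from a brute-force stand-in for the index, then drops those whose label fails the selector predicate. The names mirror, but are not, PipeANN's C++ API:

```python
import numpy as np

def post_filtered_search(data, labels, query, query_label, is_member,
                         topk=10, over_fetch=100):
    """Over-fetch nearest candidates, then keep those whose label passes
    is_member(query_label, target_label, target_id), as a selector would."""
    dists = np.linalg.norm(data - query, axis=1)  # L2 distances (brute force)
    cand = np.argsort(dists)[:over_fetch]         # over-fetched candidates
    keep = [int(i) for i in cand if is_member(query_label, labels[i], i)]
    return keep[:topk]
```

A larger `over_fetch` raises the chance that `topk` results survive the filter, at the cost of a deeper search.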
The labels are stored directly at the end of each record, so in the graph:
- Each record contains `[ Vector | R | R neighbors | labels ]`.
- The total record size is fixed to `max_node_len`, which may be larger than `vector_size` + (1 + R) * `sizeof(uint32_t)`.
- A new metadata field, `label_size`, is introduced to the metadata page.
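The fixed record size above can be sketched numerically (the dimension, `R`, and label size below are made-up example values):

```python
def record_len(dim, dtype_bytes, R, label_size):
    """[ vector | neighbor count | R neighbor IDs | labels ], per the layout above."""
    vector_size = dim * dtype_bytes
    return vector_size + (1 + R) * 4 + label_size  # 4 bytes per uint32

# e.g., 128-dim uint8 vectors, R = 64, 16-byte labels:
print(record_len(128, 1, 64, 16))  # 128 + 65*4 + 16 = 404 bytes
```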
`build_disk_index` and `pipe_search` are extended to support building and searching a filtered graph index. An example for YFCC10M in the NIPS'23 Big ANN benchmark:
1. Build Filtered Index
Two new arguments: `label_type` (e.g., `spmat`) and `label_file`.
```shell
# Build index with labels
# For YFCC10M, the labels should be in a file named base.metadata.10M.spmat
build/tests/build_disk_index uint8 data.bin index 64 96 32 500 112 l2 pq spmat labels.spmat
```

2. Search with Filter
Use `search_disk_index_filtered`. It requires `selector_type` (e.g., `subset`) and `query_label_file`.
```shell
# Search with filter
# For YFCC10M, the query labels should be in a file named query.metadata.public.100K.spmat.
build/tests/search_disk_index_filtered uint8 index 16 32 query.bin gt.bin 10 l2 pq \
    subset query_labels.spmat 0 0 20 50 100 200 300
```

Example result on YFCC10M:
```
   L  I/O Width       QPS  AvgLat(us)    P99 Lat  Mean Hops   Mean IOs  Recall@10
==========================================================================================
  20         32   8836.26     1777.56    3402.00       0.00      49.59      13.87
  50         32   6357.16     2465.66    4110.00       0.00      77.99      21.00
 100         32   4164.85     3758.77    5982.00       0.00     126.23      27.96
 200         32   2423.22     6490.44    9512.00       0.00     223.67      36.13
 300         32   1697.97     9250.71   12881.00       0.00     321.85      41.19
```
```
PipeANN/
├── src/                                   # Core implementation
│   ├── index.cpp                          # In-memory Vamana index
│   ├── ssd_index.cpp                      # On-disk index (search-only)
│   ├── search/                            # Search algorithms
│   │   ├── pipe_search.cpp                # PipeANN search (main algorithm)
│   │   ├── beam_search.cpp                # DiskANN best-first search
│   │   ├── page_search.cpp                # Starling page-based search
│   │   └── coro_search.cpp                # Coroutine-based multi-query search
│   ├── update/                            # Update operations
│   │   ├── direct_insert.cpp              # OdinANN direct insert
│   │   ├── delete_merge.cpp               # Delete and merge logic
│   │   └── dynamic_index.cpp              # Dynamic index wrapper (search-update)
│   └── utils/                             # Utilities
│       ├── distance.cpp                   # Distance computation (L2/IP/cosine)
│       ├── linux_aligned_file_reader.cpp  # io_uring/AIO support
│       └── ...
├── include/                               # Header files
│   ├── index.h                            # In-memory index interface
│   ├── ssd_index.h                        # On-disk index interface
│   ├── dynamic_index.h                    # Dynamic index interface
│   └── filter/                            # Filtered search support
├── tests/                                 # Test programs & benchmarks
│   ├── build_disk_index.cpp               # Build on-disk index
│   ├── build_memory_index.cpp             # Build in-memory index
│   ├── search_disk_index.cpp              # Search benchmark (SSD)
│   ├── search_disk_index_mem.cpp          # Search benchmark (load SSD index to RAM)
│   ├── search_disk_index_filtered.cpp     # Filtered search benchmark
│   ├── test_insert_search.cpp             # Insert-search benchmark
│   ├── overall_performance.cpp            # Insert-delete-search benchmark
│   ├── pad_partition.cpp                  # Pad partition file (for Starling)
│   └── utils/                             # Data utilities
├── tests_py/                              # Python examples
├── pipeann/                               # Python package
├── scripts/                               # Evaluation scripts
└── third_party/                           # Dependencies (liburing)
```

The provided scripts in `scripts/` assume a specific directory structure for datasets and indexes.
Please modify the hard-coded paths in the scripts (or create symlinks) if your environment differs.
```
/mnt/nvme/data/                  # Dataset Directory
├── bigann/
│   ├── 100M.bbin                # SIFT100M dataset
│   ├── 100M_gt.bin              # SIFT100M ground truth
│   ├── truth.bin                # SIFT1B ground truth
│   ├── bigann_200M.bbin         # SIFT200M (for updates)
│   └── bigann_query.bbin        # SIFT query
├── deep/
│   ├── 100M.fbin                # DEEP100M dataset
│   ├── 100M_gt.bin              # DEEP100M ground truth
│   └── queries.fbin             # DEEP query
└── SPACEV1B/
    ├── 100M.bin                 # SPACEV100M dataset
    ├── 100M_gt.bin              # SPACEV100M ground truth
    ├── query.bin                # SPACEV query
    └── truth.bin                # SPACEV1B ground truth

/mnt/nvme2/indices/              # Search-Only Indexes
├── bigann/100m                  # SIFT100M index prefix
├── deep/100M                    # DEEP100M index prefix
└── spacev/100M                  # SPACEV100M index prefix

/mnt/nvme/indices_upd/           # Search-Update Indexes
├── bigann/100M                  # SIFT100M index for updates
├── bigann_gnd_insert/           # GT for insert-search workload
└── bigann_gnd/                  # GT for insert-delete-search workload
```

The scripts are designed to reproduce the figures in our papers.
Note: Before running, ensure your data paths match the directory structure above, or edit the scripts (`eval_f.sh`, `fig*.sh`) to point to your locations.
```
scripts/
├── tests-pipeann/                 # PipeANN (OSDI'25) evaluation
│   ├── hello_world.sh             # Quick functionality test
│   ├── fig11.sh ~ fig18.sh        # Paper figure reproduction
│   ├── plotting.py                # Generate figures
│   └── plotting.ipynb             # Jupyter notebook for plotting
├── tests-odinann/                 # OdinANN (FAST'26) evaluation
│   ├── hello_world.sh             # Quick functionality test
│   ├── fig6.sh ~ fig12.sh         # Paper figure reproduction
│   └── plotting.ipynb             # Jupyter notebook for plotting
├── run_all_pipeann.sh             # Run all PipeANN experiments
└── validate_index_structure.py    # Index validation tool
```
Hello World (verify installation):
```shell
# PipeANN search-only test (~1 min)
bash scripts/tests-pipeann/hello_world.sh

# OdinANN update test (~1 min)
bash scripts/tests-odinann/hello_world.sh
```

Run individual experiments:
- PipeANN (Search-Only):
  - `fig11.sh`: latency vs. recall (100M datasets)
  - `fig12.sh`: throughput vs. recall (100M datasets)
  - `fig13.sh`: latency breakdown
  - `fig14.sh` ~ `fig18.sh`: other evaluations (ablation, scalability, etc.)
- OdinANN (Search-Update):
  - `fig6.sh`: insert-search on SIFT100M (~4 days)
  - `fig7.sh`: insert-search on DEEP100M (~4 days)
  - `fig8.sh`: insert-search on SIFT1B (~8 days)
  - `fig12.sh`: insert-delete-search (~6 days)
Plot results:
```shell
cd scripts/tests-pipeann && python plotting.py
# Or use Jupyter: plotting.ipynb
```

If you use PipeANN in your research, please cite our papers:
```bibtex
@inproceedings{fast26odinann,
  author    = {Hao Guo and Youyou Lu},
  title     = {OdinANN: Direct Insert for Consistently Stable Performance
               in Billion-Scale Graph-Based Vector Search},
  booktitle = {24th USENIX Conference on File and Storage Technologies (FAST 26)},
  year      = {2026},
  address   = {Santa Clara, CA},
  publisher = {USENIX Association}
}

@inproceedings{osdi25pipeann,
  author    = {Hao Guo and Youyou Lu},
  title     = {Achieving Low-Latency Graph-Based Vector Search via
               Aligning Best-First Search Algorithm with SSD},
  booktitle = {19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25)},
  year      = {2025},
  address   = {Boston, MA},
  pages     = {171--186},
  publisher = {USENIX Association}
}
```

PipeANN is based on DiskANN and FreshDiskANN. We sincerely appreciate their excellent work!