How much does ArangoDB's approximate search cost in accuracy and time?
Generated: 2026-04-10 19:23
When deploying a vector search system in production, you rarely compute exact nearest neighbors. Instead, databases like ArangoDB use approximate nearest neighbor (ANN) indexes that trade a small amount of accuracy for significantly faster search at scale.
But how much accuracy do you actually lose? And is the speed gain worth it?
In a companion experiment (Dense vs Token-Level Retrieval), we established exact brute-force baselines for 14 embedding models across two multi-hop QA datasets. This report takes the best-performing dense embedding variants and asks: what happens when we replace exact search with ArangoDB's APPROX_NEAR_COSINE?
For each question, we run both methods side-by-side:

APPROX: an AQL query using APPROX_NEAR_COSINE against the IVF vector index on the ArangoDB server.

Brute force: query_vec @ all_chunks.T (exact cosine similarity via a numpy dot product), computed locally in memory. No network, no approximation.

Both methods receive the same query vector and return results in the same format (top-100 ranked chunk IDs with scores). We then compute retrieval metrics on both and measure the difference (loss). Timing is measured per-query for both methods, and we separately measure the network round-trip time (RTT) to isolate the actual index computation time from network overhead.
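The local exact baseline can be sketched in a few lines of numpy. This is a minimal illustration, not the report's actual harness; the function and array names are mine, and the code normalizes defensively so the dot product is a true cosine similarity:

```python
import numpy as np

def exact_top_k(query_vec: np.ndarray, all_chunks: np.ndarray, k: int = 100):
    """Brute-force cosine top-k over row-vector chunk embeddings."""
    # Normalize so the dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    c = all_chunks / np.linalg.norm(all_chunks, axis=1, keepdims=True)
    scores = c @ q                 # one dot product per chunk
    top = np.argsort(-scores)[:k]  # indices of the k highest-scoring chunks
    return top, scores[top]

# Tiny demo: 5 random chunks, query identical to chunk 3.
rng = np.random.default_rng(0)
chunks = rng.normal(size=(5, 8))
idx, sc = exact_top_k(chunks[3], chunks, k=3)
```

With ~100k chunks this single matrix-vector product is exactly the `query_vec @ all_chunks.T` computation described above.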
Recall@K: what fraction of relevant documents appear in the top K? E.g., need 3 docs, found 2 in the top 100, so Recall = 0.67.

Full Recall@K: did we find all relevant documents in the top K? Binary: 1 or 0. The strictest metric; missing even one means failure.

MRR: 1/rank of the first relevant document found. Higher means the first relevant result appears earlier.

Last-relevant reciprocal rank: 1/rank of the last relevant document, and zero if any are missing. Measures how deep you must look to find everything.
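All four metrics can be computed from a ranked ID list plus the set of gold document IDs. A minimal sketch, with function and key names of my choosing (it assumes each gold ID appears at most once in the ranking):

```python
def retrieval_metrics(ranked_ids, gold_ids, k=100):
    """Recall@k, full recall@k, MRR, and last-relevant reciprocal rank."""
    top = list(ranked_ids)[:k]
    # 1-based ranks at which gold documents were found.
    hit_ranks = [i + 1 for i, d in enumerate(top) if d in gold_ids]
    recall = len(hit_ranks) / len(gold_ids)
    full_recall = 1 if len(hit_ranks) == len(gold_ids) else 0
    mrr = 1.0 / hit_ranks[0] if hit_ranks else 0.0
    # Reciprocal rank of the LAST gold doc; zero if any gold doc is missing.
    last_rr = 1.0 / hit_ranks[-1] if full_recall else 0.0
    return {"recall": recall, "full_recall": full_recall,
            "mrr": mrr, "last_rr": last_rr}

# Example from the text: need 3 docs, found 2 of them.
m = retrieval_metrics(["a", "x", "b", "y"], {"a", "b", "c"}, k=100)
```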
The loss shown in this report is the percentage drop in each metric when using the approximate index compared to exact brute force. E.g., "Recall loss 5%" means the index finds 5% fewer relevant documents than exact search.
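Concretely, the reported loss is the relative drop (exact minus approximate, divided by exact, times 100). A sketch with a hypothetical helper name:

```python
def loss_pct(exact: float, approx: float) -> float:
    """Percentage drop of the approximate metric relative to exact brute force."""
    if exact == 0:
        return 0.0  # even brute force found nothing; there is nothing to lose
    return (exact - approx) / exact * 100.0

# "Recall loss 5%": exact recall 1.0 vs approximate recall 0.95.
loss = loss_pct(1.0, 0.95)
```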
ArangoDB uses IVF (Inverted File) vector indexes powered by FAISS to perform approximate nearest neighbor search. Instead of comparing a query against every document (brute force), it partitions vectors into clusters and only searches the nearest clusters.
This is faster for large collections but introduces approximation error — relevant documents that fall into clusters not searched will be missed. The key parameters that control this speed-accuracy tradeoff are:
nLists: number of clusters (Voronoi cells) to partition the vector space into. FAISS recommends 4×√N to 16×√N. More clusters means smaller buckets, hence faster search but higher risk of missing neighbors.

nProbe: how many clusters to search per query. Higher is slower but more accurate. nProbe=1 is fastest (only the closest cluster is checked); nProbe=nLists is equivalent to brute force.

Network RTT: round-trip latency to the ArangoDB server. Adds a fixed overhead to every query regardless of index configuration. Measured via a persistent HTTP connection.
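An approximate query against one of these indexes looks roughly like the following AQL, here held in a Python string. This is a sketch: the embedding field name emb_small_256 is illustrative (the report's fields range from 256 to 3072 dimensions), and the APPROX_NEAR_COSINE signature should be checked against your ArangoDB version's documentation:

```python
# Illustrative AQL for an approximate top-100 query; field name is assumed.
APPROX_QUERY = """
FOR doc IN musique_chunks
  LET score = APPROX_NEAR_COSINE(doc.emb_small_256, @queryVec)
  SORT score DESC
  LIMIT 100
  RETURN { id: doc._key, score: score }
"""

# Executed via a driver, e.g. python-arango:
#   cursor = db.aql.execute(APPROX_QUERY, bind_vars={"queryVec": query_vec})
```

The SORT/LIMIT pattern is what lets the optimizer route the query through the vector index instead of a full scan.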
The ArangoDB collection musique_chunks stores 101,958 documents, each containing 9 dense embedding fields of different sizes (from 256 to 3072 dimensions, produced by OpenAI text-embedding-3-small and text-embedding-3-large). We create a separate vector index on each embedding field, all with identical IVF parameters (nLists=3,192, nProbe=159). This lets us see how the approximation loss varies with embedding dimensionality while keeping everything else constant.
FAISS recommended nLists range for N=101,958: [1,277 – 5,109]. We use the midpoint (3,192).
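The quoted range follows directly from the 4×√N to 16×√N heuristic; a quick sketch (the rounding choice is mine, so the endpoints may differ by one from other conventions):

```python
import math

N = 101_958                    # documents in musique_chunks
lo = round(4 * math.sqrt(N))   # lower end of the FAISS heuristic
hi = round(16 * math.sqrt(N))  # upper end
mid = (lo + hi) // 2           # the report picks a value near this midpoint
```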
Network RTT measured at ~32ms on persistent connection.
The emb_large_3072 index is excluded: creating it with nLists=6,797 previously caused RocksDB corruption on a primary pod, and it has not been reattempted with the current configuration.
Average query time breakdown. "Index only" = APPROX total minus network RTT (~32ms). "Index vs BF" shows how many times slower the index computation is compared to local brute force (<1 means index is faster, >1 means brute force is faster).
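The two derived columns come from simple arithmetic on the per-query averages; a sketch with placeholder numbers (not measured values from this report):

```python
RTT_MS = 32.0  # measured network round-trip, per the report

def timing_breakdown(approx_total_ms: float, bf_local_ms: float):
    """Split APPROX query time into index compute vs network, compare to BF."""
    index_only = approx_total_ms - RTT_MS  # strip the fixed network overhead
    ratio = index_only / bf_local_ms       # >1: index compute slower than local BF
    return index_only, ratio

# Placeholder inputs, purely illustrative:
index_ms, ratio = timing_breakdown(approx_total_ms=50.0, bf_local_ms=36.0)
```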
How much accuracy does the approximate index lose compared to exact brute force? Select a metric to see the loss percentage across all embedding variants at each K.