TurboVec - an open-source vector index implementing Google Research's TurboQuant algorithm - AiBoss

What is turbovec?

turbovec is an open-source implementation of Google Research's TurboQuant algorithm, written in Rust with accompanying Python bindings. It is a high-performance vector indexing library aimed at RAG workloads. Its quantization strategy is data-independent, so no training phase is needed, and it compresses 10 million float32 vectors from 31GB to approximately 4GB. On ARM and x86 platforms, hand-written SIMD kernels give it faster search than FAISS IndexPQFastScan in most configurations. It also supports filtering during search, persistence, and drop-in replacement of the vector store in mainstream frameworks.
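
The memory figures above can be checked with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the 768-dimensional embedding size is an assumption chosen because it reproduces the quoted 31GB and 4GB at 4 bits, and per-vector norms and other metadata are ignored.

```python
def float32_bytes(n_vectors, dim):
    """Raw storage: 4 bytes per float32 coordinate."""
    return n_vectors * dim * 4


def quantized_bytes(n_vectors, dim, bit_width):
    """Packed storage, rounded up to whole bytes per vector.

    Ignores the separately stored norm and any index metadata.
    """
    return n_vectors * ((dim * bit_width + 7) // 8)


n, dim = 10_000_000, 768  # dim=768 is an assumption, not stated in the article
raw = float32_bytes(n, dim)
packed = quantized_bytes(n, dim, bit_width=4)
print(f"{raw / 1e9:.1f} GB -> {packed / 1e9:.1f} GB ({raw / packed:.0f}x)")
```

Under the same arithmetic, a single 1536-dimensional vector takes 6144 bytes as float32 and 384 bytes at 2 bits per coordinate.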

Main functions of turbovec

Online ingestion: Adding vectors completes indexing automatically, with no training step, parameter tuning, or rebuild as the corpus grows.
Fast SIMD search: Hand-written NEON (ARM) and AVX-512BW (x86) kernels search faster than FAISS IndexPQFastScan.
Filtering during search: Accepts an ID allowlist or a slot bitmask; the filter short-circuits inside the SIMD kernel, so there is no need to over-fetch results and filter them afterwards.
Stable external IDs and deletion: IdMapIndex supports custom uint64 external IDs and O(1) deletion by ID.
Index persistence: write saves the index to disk and load restores it quickly, with no re-encoding.
Plug-and-play framework integration: Official integrations for LangChain, LlamaIndex, Haystack, and Agno let you replace an existing vector store by changing a few import lines.
Purely local operation: No hosted service is required and data never leaves the local machine or VPC, so a fully offline RAG stack is possible.
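
To see why filtering inside the search matters, here is a minimal pure-Python sketch that contrasts filtering after the top-k step with applying an allowlist before scoring. It uses brute-force dot products rather than turbovec's SIMD kernel, and both function names are hypothetical.

```python
import heapq


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def search_post_filter(vectors, query, k, allowed):
    """Score everything, keep the top k, then drop disallowed ids."""
    top = heapq.nlargest(k, range(len(vectors)),
                         key=lambda i: dot(vectors[i], query))
    return [i for i in top if i in allowed]


def search_with_allowlist(vectors, query, k, allowed):
    """Skip disallowed ids before scoring, so up to k allowed hits return."""
    candidates = (i for i in range(len(vectors)) if i in allowed)
    return heapq.nlargest(k, candidates,
                          key=lambda i: dot(vectors[i], query))


vectors = [[float(i), 1.0] for i in range(10)]  # id i scores exactly i
query = [1.0, 0.0]
allowed = {0, 1, 2, 9}

print(search_post_filter(vectors, query, 3, allowed))     # [9]
print(search_with_allowlist(vectors, query, 3, allowed))  # [9, 2, 1]
```

The post-filter variant returns one hit instead of three, which is why systems that filter afterwards have to request far more than k results.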

TurboVec's technical principles

Normalization: The length (norm) of each vector is stripped and stored separately, so what remains is a direction vector on the unit hypersphere.
Random orthogonal rotation: All vectors are rotated by the same random orthogonal matrix, after which each coordinate follows a known Beta distribution that does not depend on the original data.
TQ+ adaptive calibration: When vectors are first added, each coordinate is scaled and shifted using its 5th and 95th percentiles so that the empirical distribution maps onto the standard Beta marginal; later vectors reuse these calibration parameters without retraining.
Lloyd-Max scalar quantization: Because the distribution is known, the optimal bucket boundaries and centroids are precomputed; the resulting distortion is within a factor of about 2.7 of the information-theoretic lower bound.
Bit packing: Each coordinate is compressed to a small integer and the integers are packed tightly. At 2 bits per coordinate, a 1536-dimensional vector shrinks from 6144 bytes to 384 bytes, a 16-fold compression.
Length-renormalized scoring: An extra scaling factor is computed at encoding time and multiplied back in at search time. This removes the systematic underestimation of inner products caused by quantization, making the estimator unbiased and further improving recall.
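
The normalization, rotation, scalar quantization, and bit-packing steps can be sketched in pure Python as follows. This is a toy illustration under stated assumptions, not turbovec's code: the function names are hypothetical, the Beta marginal is approximated by a Gaussian with variance 1/dim, the codebook is the standard 2-bit Lloyd-Max quantizer for a unit normal, and calibration and length renormalization are left out. Input vectors are assumed to be non-zero.

```python
import math
import random

# 2-bit Lloyd-Max quantizer for a standard normal (Gaussian stand-in
# for the Beta marginal; an assumption of this sketch).
BOUNDS = (-0.9816, 0.0, 0.9816)
CENTROIDS = (-1.510, -0.4528, 0.4528, 1.510)


def random_orthogonal(dim, rng):
    """Orthonormalize Gaussian rows (Gram-Schmidt) into a rotation matrix."""
    rows = []
    while len(rows) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for r in rows:
            proj = sum(a * b for a, b in zip(v, r))
            v = [a - proj * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-9:
            rows.append([a / norm for a in v])
    return rows


def encode(vec, rotation):
    """Return (norm, packed 2-bit codes) for one non-zero vector."""
    dim = len(vec)
    norm = math.sqrt(sum(x * x for x in vec))
    unit = [x / norm for x in vec]
    rotated = [sum(r * u for r, u in zip(row, unit)) for row in rotation]
    scale = math.sqrt(dim)  # rescale coordinates to roughly unit variance
    codes = [sum(c * scale > b for b in BOUNDS) for c in rotated]
    packed = bytearray((dim + 3) // 4)  # four 2-bit codes per byte
    for i, code in enumerate(codes):
        packed[i // 4] |= code << (2 * (i % 4))
    return norm, bytes(packed)


def decode(norm, packed, rotation):
    """Approximate reconstruction: centroids, inverse rotation, norm."""
    dim = len(rotation)
    scale = math.sqrt(dim)
    rotated = [CENTROIDS[(packed[i // 4] >> (2 * (i % 4))) & 3] / scale
               for i in range(dim)]
    # The inverse of an orthogonal matrix is its transpose.
    unit = [sum(rotation[j][i] * rotated[j] for j in range(dim))
            for i in range(dim)]
    return [norm * u for u in unit]


rng = random.Random(0)
rotation = random_orthogonal(64, rng)
vec = [rng.uniform(-1.0, 1.0) for _ in range(64)]
norm, packed = encode(vec, rotation)
approx = decode(norm, packed, rotation)
cosine = sum(a * b for a, b in zip(vec, approx)) / (
    math.sqrt(sum(a * a for a in vec)) * math.sqrt(sum(b * b for b in approx)))
print(len(packed), "bytes, cosine similarity", round(cosine, 3))
```

No step depends on the data being indexed: the rotation and the codebook are fixed up front, which is what allows vectors to be added without a training phase.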

How to use turbovec

Install the library: Run pip install turbovec to get the Python bindings.
Create an index: Instantiate TurboQuantIndex(dim=1536, bit_width=4), specifying the dimension and the number of quantization bits.
Add vectors: Call index.add(vectors); vectors can be added in batches, and rotation, calibration, and quantization happen automatically.
Perform a search: Call index.search(query, k=10) to get the top-k similarity scores and indices.
Persist the index: Use index.write("my_index.tv") to save the index and TurboQuantIndex.load to restore it.
External ID management: Switch to IdMapIndex to get add_with_ids and O(1) deletion by ID.
Hybrid search: First filter candidate IDs with SQL, BM25, or a similar system, then pass them in as an allowlist for dense vector ranking.
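
Put together, the basic steps above might look like the sketch below. It uses only the calls named in this article; that the bindings accept NumPy float32 arrays, that search returns scores before indices, and the wrapper function quickstart itself are assumptions to check against the project's README.

```python
def quickstart(path="my_index.tv"):
    """Build, query, save, and reload an index (illustrative sketch)."""
    # Imported here so the sketch can be loaded without the packages installed.
    import numpy as np
    from turbovec import TurboQuantIndex

    rng = np.random.default_rng(0)
    # Random stand-ins for real embeddings.
    vectors = rng.standard_normal((1000, 1536)).astype(np.float32)
    query = rng.standard_normal(1536).astype(np.float32)

    index = TurboQuantIndex(dim=1536, bit_width=4)
    index.add(vectors)  # no training step before adding

    # Assumed return order: similarity scores first, then indices.
    scores, indices = index.search(query, k=10)

    index.write(path)                       # persist to disk
    restored = TurboQuantIndex.load(path)   # reload without re-encoding
    return scores, indices, restored
```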

TurboVec's project address

Project page (PyPI): https://pypi.org/project/turbovec/
GitHub repository: https://github.com/RyanCodrai/turbovec

Comparison of turbovec with similar competing products

Dimension	turbovec	FAISS (IndexPQFastScan)
Quantization training	None required; online ingestion	Requires k-means training of a codebook
Compression ratio	16x (2-bit) / 8x (4-bit)	Similar, but depends on training quality
ARM search speed	10–19% faster than FastScan	Baseline
x86 search speed	Wins at 4-bit; close at 2-bit	Baseline; has the edge at 2-bit with VBMI
Filtering during search	Short-circuits inside the SIMD kernel, with no recall loss	Scores first, then filters, which requires over-fetching
Deployment form	Purely local embedded library	Purely local embedded library
Framework integration	Official support for 4 frameworks, including LangChain	Extensive community support
Low-dimensional recall	On par or ahead after TQ+ calibration	Baseline

Application scenarios of turbovec

Memory-sensitive RAG: Small and medium-sized teams or local deployments that need to index tens of millions of documents in limited memory; 10 million vectors shrink from 31GB to 4GB.
Low-latency online services: Online RAG, recommendation, or search systems with strict latency requirements for vector retrieval, which benefit from the SIMD acceleration.
Privacy-first architecture: Government, enterprise, or financial settings where data cannot be uploaded to third parties or leave the country; turbovec runs purely locally with no hosted service.
Edge and mobile: The ARM kernels are well optimized for running vector retrieval on mobile phones, IoT devices, or embedded hardware.
Hybrid retrieval systems: First coarsely filter candidate IDs with SQL, BM25, or a permission system, then pass them in as an allowlist for dense vector ranking.