COMPLETED Systems

VecDB-WASM

Browser-native vector database in Rust compiled to WebAssembly — HNSW + Flat indices, SIMD128 acceleration for 7 distance metrics, IndexedDB persistence, <2ms search for 10K×128D vectors at >1000 QPS, ~45KB gzipped. v0.3.0 released.

status COMPLETED
type Systems
stack Rust WebAssembly SIMD128 HNSW IndexedDB TypeScript

// DESCRIPTION

The Problem: Browser AI Apps Can't Do Vector Search Without a Server

Retrieval-Augmented Generation (RAG), semantic search, and nearest-neighbor recommendation are now table-stakes features for AI-powered web applications. All three require vector similarity search: given a query vector (a text embedding, an image feature, a user preference vector), find the K most similar vectors in a stored database. The standard solution is a server-side vector database (Pinecone, Weaviate, Chroma), but this creates a hard dependency: every search requires a network round-trip, adding 50–200ms latency and making the application unusable offline.

For browser-native AI applications — extensions, PWAs, local-first apps, privacy-preserving search — the data never leaves the user's machine. A server-side vector database is not just slow; it is architecturally incompatible. What is needed is a vector database that runs entirely inside the browser, persists data locally, and performs searches in under 2ms.

问题:RAG、语义搜索和最近邻推荐已成为AI Web应用的标配,都需要向量相似度搜索。标准解决方案是服务器端向量数据库(Pinecone、Weaviate、Chroma),但这产生硬依赖:每次搜索需要网络往返(50-200ms延迟),离线不可用。对于浏览器原生AI应用——插件、PWA、本地优先应用——数据不离开用户设备,需要完全在浏览器内运行、本地持久化且2ms内完成搜索的向量数据库。

Situation & Task: Rust + WebAssembly + SIMD128

VecDB-WASM is built in Rust and compiled to WebAssembly with SIMD128 instructions enabled. Rust's zero-cost abstractions make it possible to write high-performance HNSW graph operations without garbage collection pauses or memory fragmentation. WebAssembly SIMD128 (the W3C standard for SIMD in browsers) enables 128-bit vector operations across all major browsers (Chrome, Firefox, Safari, Edge) without any special flags or plugins.

The v0.3.0 release ships as an npm package with TypeScript bindings, approximately 45KB gzipped, and a comprehensive test suite. IndexedDB integration enables persistence across browser sessions without a backend.

任务:使用Rust编写并编译为启用SIMD128指令的WebAssembly。Rust零开销抽象实现无GC暂停的高性能HNSW图操作。WebAssembly SIMD128跨所有主流浏览器启用128位向量操作。v0.3.0作为npm包发布,约45KB(gzip),包含TypeScript绑定和IndexedDB持久化。

Technical Approach: HNSW + Flat Indices with SIMD Distance

HNSW Index (Hierarchical Navigable Small World): HNSW is the state-of-the-art approximate nearest neighbor algorithm, used internally by Pinecone, Weaviate, and Faiss. It constructs a multi-layer graph where each node is connected to its nearest neighbors at multiple granularities. Search starts at the top (coarse) layer and descends to the bottom (fine) layer, exploring only a small fraction of vectors while maintaining high recall (>95% at ef=64). The Rust implementation uses wasm-bindgen-compatible data structures that serialize efficiently to/from ArrayBuffer for IndexedDB storage.

Flat Index: For small collections (<1,000 vectors) or exact search requirements, a Flat index performs exhaustive comparison. The flat index is 100% recall by definition and benchmarks show it outperforms HNSW for collections under ~500 vectors due to HNSW's graph traversal overhead.

SIMD128 Distance Computation: All 7 distance metrics are implemented using WebAssembly SIMD128 intrinsics for 4-way f32 or 2-way f64 parallelism. The 7 metrics are: Euclidean (L2), Cosine, Inner Product, Manhattan (L1), Chebyshev (L∞), Hamming (for binary vectors), and Jaccard. SIMD128 provides a 2–4x speedup over scalar implementations for the distance computation hot loop, which dominates both HNSW graph traversal and flat search runtime.

IndexedDB Persistence: The index graph and vector store are serialized using a custom binary format (avoiding JSON overhead for large vector arrays) and stored in IndexedDB via the browser's async storage API. Incremental persistence (only dirty nodes are re-serialized) keeps write latency low even for large indices.

TypeScript Bindings: wasm-bindgen generates TypeScript-compatible bindings with full type safety. The API surface is minimal: `new VecDB(dim, metric)`, `insert(id, vector)`, `search(query, k, ef)`, `save()`, `load()`. Users do not need to know WebAssembly exists.

技术方案:HNSW索引(多层图,>95%召回率@ef=64)+ 平坦索引(精确搜索,<500向量时优于HNSW)。SIMD128距离计算:7种度量(欧氏/余弦/内积/曼哈顿/切比雪夫/汉明/Jaccard)使用WASM SIMD128实现2-4x加速。IndexedDB持久化:自定义二进制格式(非JSON),增量持久化降低写入延迟。TypeScript绑定:wasm-bindgen生成类型安全API(VecDB, insert, search, save, load)。

Results: Sub-2ms Search at 1000+ QPS

All benchmarks measured in Chrome 120 on a MacBook Pro M2:

  • Search latency, 10K vectors, 128 dimensions, HNSW: <2ms (median 1.4ms, p99 1.9ms)
  • Throughput: >1,000 QPS sustained (1,200 QPS median)
  • SIMD128 speedup: 2–4x vs scalar distance computation (3.1x for Euclidean at 128D)
  • Memory footprint: ~6MB for 10K vectors at 128 dimensions (float32) including HNSW graph overhead
  • Package size: ~45KB gzipped
  • Recall: 97.2% at ef=64, 99.1% at ef=128

The combination of <2ms latency and >1,000 QPS makes VecDB-WASM suitable for interactive autocomplete, real-time semantic filtering, and client-side RAG retrieval where user-perceived responsiveness is critical. The 45KB footprint ensures it does not meaningfully impact page load time even on slow connections.

结果(Chrome 120, MacBook Pro M2):10K×128D HNSW搜索延迟<2ms(中位1.4ms, p99 1.9ms);持续吞吐量>1000 QPS(中位1200 QPS);SIMD128加速2-4x(128D欧氏距离3.1x);内存占用~6MB(10K×128D含HNSW图开销);包大小~45KB(gzip);召回率97.2%@ef=64, 99.1%@ef=128。

Impact: Enabling Local-First Vector Search

VecDB-WASM enables a class of browser applications that were previously impossible without a server: offline-capable semantic search, privacy-preserving recommendation (vectors never leave the device), and zero-latency retrieval for browser extensions. The v0.3.0 release with npm packaging makes it immediately usable in any JavaScript/TypeScript project without build system changes.

影响:VecDB-WASM使此前无服务器无法实现的浏览器应用成为可能:离线语义搜索、隐私保护推荐(向量永不离开设备)、零延迟浏览器插件检索。v0.3.0 npm包可直接集成到任意JS/TS项目。

// HIGHLIGHTS

  • <2ms search latency for 10K×128D vectors in-browser (median 1.4ms, p99 1.9ms)
  • >1,000 QPS sustained throughput (1,200 QPS median on M2 MacBook Pro)
  • 2–4x SIMD128 speedup for distance computation across 7 metrics (Euclidean, Cosine, Inner Product, Manhattan, Chebyshev, Hamming, Jaccard)
  • HNSW index: 97.2% recall@ef=64, 99.1%@ef=128 — comparable to server-side Faiss at browser scale
  • ~45KB gzipped package; ~6MB memory for 10K×128D including graph overhead
  • IndexedDB persistence with custom binary serialization — survives browser restarts
  • TypeScript bindings via wasm-bindgen — users need no knowledge of WebAssembly
  • v0.3.0 released as npm package; Flat index for exact search on small collections