Vector search at storage prices

S3 is the source of truth. Nodes are stateless. Vector, full-text, and hybrid search.
Fully open source.

Install the Zeppelin skill for Claude Code.
Use our hosted instance to create namespaces and to store and query data.
Just ask Claude to tell you more about this skill and try everything out.

curl -sL https://zepdb.github.io/install.sh | sh

Storage

S3-Native Storage

Object storage is the source of truth. Scales to petabytes.

Vector Search

IVF-Flat with SQ8, Product Quantization, and hierarchical ANN.

Full-Text

BM25 Full-Text

Inverted indexes with tokenization, stemming, and hybrid ranking.

Filters

Bitmap Pre-Filters

RoaringBitmap indexes for sub-ms attribute filtering.

Consistency

Tunable Consistency

Strong or eventual reads. Choose per query.

Verification

Formally Verified

26 TLA+ specs model-check WAL, compaction, and CAS correctness.

451 tests

26 TLA+ specs

1.2M+ states explored

Apache-2.0

Performance at scale

1M vectors, 768d, cosine

zepdb/zeppelin-benchmark

Vector Search

              Zeppelin    Turbopuffer
Warm p50      ~45ms       ~8ms
Warm p99      ~120ms      ~35ms
Cold          ~800ms      ~343ms

Full-Text Search

              Zeppelin    Turbopuffer
Warm p50      ~20ms       ~11ms
Cold          ~120ms      ~221ms

Pricing calculator

Zeppelin = S3 storage + compute. Configure your setup and compare.

Inputs: number of vectors (e.g. 10M), vector dimensions, metadata per vector, cloud provider, storage class, region, instance type, number of nodes.

Outputs: monthly storage (S3) cost, compute cost, and total.

Assumptions: SQ8 quantization, 20% index overhead, 730 hrs/mo compute.
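The calculator's arithmetic can be approximated in a few lines. This is a rough sketch, not the site's actual formula. It assumes SQ8 stores one byte per dimension, that the 20% index overhead applies to vector bytes plus metadata, and that 1 GB is 10^9 bytes. The function name, its parameters, and the prices in the usage lines are illustrative placeholders; substitute your provider's current rates.

```python
HOURS_PER_MONTH = 730   # from the calculator's stated assumptions
INDEX_OVERHEAD = 0.20   # from the calculator's stated assumptions
BYTES_PER_GB = 10**9    # assumption: decimal gigabytes


def monthly_cost(num_vectors, dimensions, metadata_bytes,
                 storage_price_per_gb, instance_price_per_hour, num_nodes):
    """Return (storage, compute, total) in dollars per month."""
    # Assumption: SQ8 quantization stores each dimension in one byte.
    bytes_per_vector = dimensions * 1 + metadata_bytes
    stored_bytes = num_vectors * bytes_per_vector * (1 + INDEX_OVERHEAD)
    storage = stored_bytes / BYTES_PER_GB * storage_price_per_gb
    compute = instance_price_per_hour * HOURS_PER_MONTH * num_nodes
    return storage, compute, storage + compute


# Hypothetical inputs: 10M vectors, 768 dimensions, 256 bytes of metadata,
# $0.023 per GB-month of storage, one node at $0.10 per hour.
storage, compute, total = monthly_cost(10_000_000, 768, 256, 0.023, 0.10, 1)
print(f"Storage ${storage:.2f}  Compute ${compute:.2f}  Total ${total:.2f}/mo")
```

With inputs like these, compute dominates the bill and storage is a rounding error, which is the point of keeping vectors in object storage.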

Up and running in 60 seconds

Start

docker run -p 3000:3000 \
  -e AWS_ACCESS_KEY_ID=your-key \
  -e AWS_SECRET_ACCESS_KEY=your-secret \
  -e ZEPPELIN_BUCKET=your-bucket \
  -e ZEPPELIN_REGION=us-east-1 \
  ghcr.io/zepdb/zeppelin:latest

Create a namespace

curl -X POST http://localhost:3000/v1/namespaces \
  -H "Content-Type: application/json" \
  -d '{"dimensions": 768}'

# The response includes "name": "<uuid>". Save it: the requests below use it as <namespace-uuid>.

Upsert

curl -X POST http://localhost:3000/v1/namespaces/<namespace-uuid>/vectors \
  -H "Content-Type: application/json" \
  -d '{
    "vectors": [
      {
        "id": "vec-1",
        "values": [0.1, 0.2, 0.3, "..."],
        "attributes": {
          "category": "science",
          "year": 2024
        }
      }
    ]
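  }'

# "..." is a placeholder, not a literal value: send all 768 numbers so the array matches the namespace's dimensions. The same applies to "vector" in the query.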

Query

curl -X POST http://localhost:3000/v1/namespaces/<namespace-uuid>/query \
  -H "Content-Type: application/json" \
  -d '{
    "vector": [0.1, 0.2, 0.3, "..."],
    "top_k": 10,
    "filters": {
      "op": "And",
      "conditions": [
        {"op": "Eq", "field": "category", "value": "science"},
        {"op": "Gte", "field": "year", "value": 2020}
      ]
    },
    "include_attributes": true
  }'
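The same calls can be scripted. The following is a minimal sketch using only the Python standard library. The endpoint paths and JSON field names are copied from the curl examples above. The helper names (`eq`, `gte`, `all_of`, `query_body`, `post_json`, `example`) are hypothetical, and two behaviours are assumed rather than documented: that responses are JSON, and that `filters` may be omitted when there are none.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # the address used in the curl examples


def eq(field, value):
    return {"op": "Eq", "field": field, "value": value}


def gte(field, value):
    return {"op": "Gte", "field": field, "value": value}


def all_of(*conditions):
    return {"op": "And", "conditions": list(conditions)}


def query_body(vector, top_k=10, filters=None, include_attributes=True):
    """Build the JSON body for the query endpoint."""
    body = {"vector": list(vector), "top_k": top_k,
            "include_attributes": include_attributes}
    if filters is not None:  # assumption: the field is optional
        body["filters"] = filters
    return body


def post_json(path, body):
    """POST a JSON body and decode the response (assumed to be JSON)."""
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def example(query_vector):
    """Create a namespace and query it. Needs a running server."""
    namespace = post_json("/v1/namespaces", {"dimensions": len(query_vector)})
    name = namespace["name"]  # the UUID reused in later calls
    filters = all_of(eq("category", "science"), gte("year", 2020))
    return post_json(f"/v1/namespaces/{name}/query",
                     query_body(query_vector, filters=filters))
```

Building filters from small functions keeps the nested `op`/`conditions` structure out of hand-written JSON strings, where a misplaced brace is easy to miss.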

Architecture

Stateless nodes. S3 is the single source of truth.

Write Path

Append-only WAL fragments to S3
OCC via S3 ETags on central Manifest
Monotonic fencing tokens prevent zombie writers
Deterministic sequence numbers (no clock skew)
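The write path above can be illustrated with a compare-and-swap loop against a conditional write. This sketch does not reflect Zeppelin's actual code or manifest layout. `FakeObjectStore` is an in-memory stand-in for a bucket, and the manifest fields (`fencing_token`, `next_seq`, `fragments`) and the `manifest.json` key are hypothetical names chosen for illustration.

```python
import hashlib
import json


class PreconditionFailed(Exception):
    """Raised when the caller's ETag no longer matches the stored object."""


class FakeObjectStore:
    """In-memory stand-in for a bucket that supports conditional writes."""

    def __init__(self):
        self._objects = {}

    @staticmethod
    def _etag(data):
        # Any opaque version string works; a content hash is used here.
        return hashlib.sha256(data).hexdigest()

    def get(self, key):
        data = self._objects[key]
        return data, self._etag(data)

    def put_if_match(self, key, data, expected_etag):
        """Write only if the object is unchanged (None means 'must not exist')."""
        current = self._objects.get(key)
        current_etag = self._etag(current) if current is not None else None
        if current_etag != expected_etag:
            raise PreconditionFailed(key)
        self._objects[key] = data


def commit_fragment(store, fragment_key, writer_token, max_retries=5):
    """Register an already-uploaded WAL fragment in the manifest."""
    for _ in range(max_retries):
        raw, etag = store.get("manifest.json")
        manifest = json.loads(raw)
        # Fencing: a writer with an older token than the manifest has seen
        # is a zombie and must stop.
        if writer_token < manifest["fencing_token"]:
            raise PermissionError("stale fencing token")
        # The sequence number comes from the manifest, not a wall clock.
        seq = manifest["next_seq"]
        manifest["fragments"].append({"seq": seq, "key": fragment_key})
        manifest["next_seq"] = seq + 1
        manifest["fencing_token"] = writer_token
        try:
            store.put_if_match("manifest.json",
                               json.dumps(manifest).encode("utf-8"), etag)
            return seq
        except PreconditionFailed:
            continue  # another writer won the race; re-read and retry
    raise RuntimeError("too many conflicting writers")


store = FakeObjectStore()
initial = {"fencing_token": 0, "next_seq": 0, "fragments": []}
store.put_if_match("manifest.json", json.dumps(initial).encode("utf-8"), None)
print(commit_fragment(store, "wal/0001", writer_token=1))
```

Because the fragment object is written before the manifest update, a writer that loses the race leaves behind at most an unreferenced fragment, never a corrupted manifest.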

Read Path

Merges fresh WAL data + optimized segments
Strong: parallel scan of WAL + indexed segments
Eventual: fast-path over segments only
Singleflight coalesces concurrent manifest fetches
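The difference between the two read modes can be shown with a toy search. This is a simplified sketch under stated assumptions: it scans everything by brute force and sequentially, whereas the list above describes a parallel scan over indexed segments, and it assumes a newer WAL entry for an id replaces the compacted version. The `search` function and its data shapes are hypothetical.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, segments, wal, top_k, consistency="strong"):
    """Return the ids of the top_k most similar vectors.

    segments: {id: vector} for compacted data.
    wal: list of (id, vector) pairs in write order, not yet compacted.
    """
    candidates = dict(segments)
    if consistency == "strong":
        # Replay uncompacted writes so the newest version of each id wins.
        for vec_id, vector in wal:
            candidates[vec_id] = vector
    # Eventual reads skip the WAL and see only compacted segments.
    scored = [(cosine(query, v), vec_id) for vec_id, v in candidates.items()]
    scored.sort(reverse=True)
    return [vec_id for _, vec_id in scored[:top_k]]


segments = {"a": [0.6, 0.8], "b": [0.0, 1.0]}
wal = [("c", [1.0, 0.0])]  # written after the last compaction
print(search([1.0, 0.0], segments, wal, 1, "eventual"))
print(search([1.0, 0.0], segments, wal, 1, "strong"))
```

The eventual read misses `c` until compaction folds it into a segment; the strong read pays for the extra WAL scan and sees it immediately.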

Compaction

IVF-Flat / Hierarchical ANN with k-means + quantization
BM25 inverted indexes for full-text search
RoaringBitmap indexes for sub-ms attribute filtering
Runs in background, never blocks writes
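To make the first item concrete, here is a toy IVF-Flat index with 8-bit scalar quantization in pure Python. It is a sketch of the general technique, not Zeppelin's implementation. It assumes squared Euclidean distance, k-means seeded from the first k vectors so the result is deterministic, and a single global value range for quantization. All function names are hypothetical.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, v):
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i], v))


def kmeans(vectors, k, iterations=10):
    centroids = [list(v) for v in vectors[:k]]  # deterministic seed for the sketch
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(centroids, v)].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster empties out
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def build_ivf(items, k):
    """items: {id: vector}. Returns (centroids, inverted_lists)."""
    centroids = kmeans(list(items.values()), k)
    lists = [[] for _ in range(k)]
    for vec_id, v in items.items():
        lists[nearest(centroids, v)].append((vec_id, v))
    return centroids, lists


def ivf_search(query, centroids, lists, top_k, nprobe=1):
    # Probe only the nprobe closest clusters instead of scanning everything.
    order = sorted(range(len(centroids)), key=lambda i: sq_dist(centroids[i], query))
    hits = [(sq_dist(v, query), vec_id)
            for i in order[:nprobe] for vec_id, v in lists[i]]
    return [vec_id for _, vec_id in sorted(hits)[:top_k]]


def sq8_encode(v, lo, hi):
    """Map each float in [lo, hi] to one byte (SQ8)."""
    scale = (hi - lo) / 255 or 1.0
    return bytes(min(255, max(0, round((x - lo) / scale))) for x in v)


def sq8_decode(code, lo, hi):
    scale = (hi - lo) / 255 or 1.0
    return [lo + c * scale for c in code]


items = {"a": [0.0, 0.0], "b": [10.0, 10.0], "c": [0.1, 0.0], "d": [10.0, 10.1]}
centroids, lists = build_ivf(items, k=2)
print(ivf_search([0.09, 0.0], centroids, lists, top_k=1))
print(len(sq8_encode([0.0, 0.25, 1.0], 0.0, 1.0)), "bytes for 3 dimensions")
```

Raising `nprobe` trades latency for recall: more clusters are scanned, so fewer true neighbours near a cluster boundary are missed.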

Benefits

Instant elasticity: stateless nodes, no resharding, 1→100 nodes in seconds
Zero-state recovery: no replication lag, Manifest is source of truth
Storage economics: S3 rates vs. NVMe; scale compute independently
Decoupled read/write: WAL available immediately; indexing is background

Tradeoffs

S3 latency floor: baseline 20-50ms; caching helps but not sub-5ms
Consistency lag: ManifestCache TTL ~500ms across nodes
Compaction health: latency degrades if WAL fragments grow unchecked
Cold-start: first queries populate LRU cache from S3
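The consistency-lag tradeoff follows from caching the manifest with a time-to-live. The sketch below shows one way such a cache could behave. It is illustrative only: the class shape, the `strong` flag, and the idea that strong reads bypass the cache are assumptions, not a description of Zeppelin's `ManifestCache`.

```python
import time


class ManifestCache:
    """Cache one value and refetch it once the TTL has elapsed."""

    def __init__(self, fetch, ttl_seconds=0.5, clock=time.monotonic):
        self._fetch = fetch          # callable that reads the manifest from storage
        self._ttl = ttl_seconds      # ~500ms, per the tradeoff listed above
        self._clock = clock          # injectable so tests need not sleep
        self._value = None
        self._expires_at = float("-inf")

    def get(self, strong=False):
        now = self._clock()
        # Assumption: strong reads always refetch; eventual reads may be
        # served a manifest that is up to one TTL old.
        if strong or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


# Simulated clock and manifest version, to show the staleness window.
now = [0.0]
version = [1]
cache = ManifestCache(lambda: version[0], clock=lambda: now[0])
first = cache.get()
version[0] = 2            # another node commits a write
now[0] = 0.4
print(first, cache.get(), cache.get(strong=True))
```

A node serving eventual reads can therefore lag a write by up to one TTL, which is the price of not fetching the manifest from S3 on every query.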

About

Anup Ghatage

Founder, Zeppelin

14 years in databases and distributed storage
Principal SWE, Salesforce
Apache BookKeeper committer
SAP HANA · Cisco WebEx · Sybase ASE
7 patents · 10+ Hackathon wins
Carnegie Mellon · University of Pune (CS)

Your vectors belong in your object storage, not someone else's proprietary system. Search should cost what storage costs.

I've spent 14 years inside distributed storage systems: Sybase ASE internals in C/C++, Cisco WebEx, SAP HANA, then building Hyperforce at Salesforce. One lesson kept coming back.

In an agent-first future, agents won't care about latency the way web apps do. They'll care about cost and simplicity. If your vectors live in S3, you pay S3 prices. Nodes become stateless. You scale by adding buckets, not by babysitting clusters.

On February 10th, 2026, I started building Zeppelin at the Claude Code Hackathon. The seed fund was $500 in Claude credits. I directed the architecture. Every line of Rust was written by Claude Opus 4.6. One human, one AI, no team, no investors.

The result: 451 tests, 26 TLA+ specs, 1.2M+ states explored, and an engine released under Apache-2.0.

Zeppelin is open source because infrastructure should be. No vendor lock-in, no opaque pricing, no proprietary formats. Your data stays in your buckets.

Smarter architecture beats faster clusters.