Vector intelligence · on demand

Data Brokerage.
Sanitized. Indexed.
AI Ready.p99 < 8ms

We vectorize high-signal corpora — filings, documentation, directories — and ship them as precision-built indexes. Download a self-hosted HNSW index, or query our hosted vector store over a single API.

1.4B+ vectors indexed 99.6% recall @10 HNSW graph index

Live catalog

Datasets, pre-vectorized
and search-ready.

Each dataset is cleaned, chunked, embedded and indexed. Plug it into your RAG stack the moment you get access — no pipeline to build.

Indexed 001

Newsletter Intelligence

scribem8.com

Curated newsletter and editorial corpus — semantic search across writers, topics and trends.

12.8M
chunks
1536
dims
cosine
metric
Indexed 002

OEM Documentation

oedocs.com

Original equipment manuals, spec sheets and service docs — vectorized for technical retrieval at scale.

34.1M
chunks
1536
dims
cosine
metric
Indexed 003

AI Agents Directory

638labs.com

A live registry of production AI agents — searchable by capability, interface and deployment surface.

2.3M
chunks
1024
dims
dot
metric
The pipeline

From raw corpus to a query you can trust.

01 / VECTORIZE

Vectorize

Source data is cleaned, deduplicated and chunked, then embedded with domain-tuned models into dense vectors.

02 / INDEX

Index

Vectors are assembled into a hierarchical navigable small-world (HNSW) graph, tuned for recall and latency.

03 / QUERY

Query

Hit it from your stack — self-hosted or via our API. Approximate nearest-neighbor search returns in single-digit milliseconds.

Why vectorized

The hard part is already done.

Skip the pipeline

No scraping, cleaning, chunking, embedding or index-tuning. We ship the finished artifact, versioned and ready.

Built for latency

HNSW graphs tuned per dataset for high recall at single-digit-millisecond search — at billions of vectors.

Your stack, your rules

Self-host the index on MongoDB, pgvector or your engine of choice — or let us host it. Same data, your call.

Freshness built in

Datasets are re-embedded and re-indexed on a schedule. Pull a new version or stream deltas — never stale.

Two ways to consume

Own the index, or query the endpoint.

Mode 01 — Self-host

Prebuilt HNSW indexes

Download a ready-built index plus backing-DB bundle. Drop it next to your app and search locally — your data never leaves your perimeter.

  • Versioned index artifacts + MongoDB / pgvector bundles
  • Air-gapped friendly — runs fully offline
  • Perpetual or subscription licensing
# pull a versioned index bundle
$ nd pull oedocs/hnsw@v3 --db mongodb
→ 34.1M vectors · 1536d · ready
Mode 02 — Hosted

Vector store + API

Skip the infra entirely. Query our managed vector store over a single authenticated endpoint and scale on demand.

  • One REST/gRPC endpoint, SDKs included
  • Usage-based pricing, no index to maintain
  • Always on the latest dataset version
// nearest-neighbor query
POST /v1/datasets/oedocs/search
{ "query": embed(q), "k": 10 }
→ 200 OK · 7ms · top-10 matches
Trust & security

Provenance you can audit.
Access you control.

Every dataset carries its lineage. Every query is governed. Self-host for full data residency, or run on isolated, encrypted infrastructure.

SOC 2 TYPE II GDPR SSO / SAML AT-REST AES-256 VPC PEERING

Data residency by design

Self-hosted indexes never leave your environment. Hosted runs in isolated, region-pinned tenancy.

Auditable lineage

Source, embedding model, version and checksum are attached to every dataset and every result.

Scoped, revocable access

Per-dataset API keys, SSO, role-based access and full query logging for compliance.

Request access

Put precision data behind your AI.

Tell us what you're building. We'll set you up with a dataset, an index, or an API key — and a short demo to get you querying fast.

No spam. We reply within one business day.
Request received — we'll be in touch shortly.