Vector Search

pgvector-backed embedding store, wired to your BYOS LLM provider. Build RAG, semantic search, recommendation, or any embeddings workflow without piecing together a separate vector DB and an embeddings API.

Model

Vector columns live next to your regular columns. Index them like any other column. Search them with cosine, L2, or inner product.

-- .atelier/migrations/20260620090000_add_docs_embedding.sql
CREATE TABLE docs (
  id          uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  title       text NOT NULL,
  body        text NOT NULL,
  embedding   vector(1536),
  created_at  timestamptz NOT NULL DEFAULT now()
);
 
CREATE INDEX docs_embedding_idx
  ON docs USING hnsw (embedding vector_cosine_ops);
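
For intuition about the three distance measures, here is a minimal plain-TypeScript sketch. The function names are illustrative and not part of the Atelier SDK; in practice pgvector computes these inside Postgres with the `<->` (L2), `<=>` (cosine distance), and `<#>` (negative inner product) operators.

```typescript
// Illustrative only: pgvector does this work in the database.
// None of these helpers exist in the Atelier SDK.

function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// L2 (Euclidean) distance, pgvector operator <->
function l2Distance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

// Cosine distance = 1 - cosine similarity, pgvector operator <=>
// (undefined for zero-length vectors: the division yields NaN)
function cosineDistance(a: number[], b: number[]): number {
  return 1 - dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// pgvector's <#> returns the NEGATIVE inner product, so smaller still means closer
function negativeInnerProduct(a: number[], b: number[]): number {
  return -dot(a, b);
}

console.log(cosineDistance([1, 0], [0, 1])); // orthogonal vectors
console.log(cosineDistance([1, 0], [2, 0])); // same direction, different length
console.log(l2Distance([1, 0], [2, 0]));
```

Cosine distance ignores vector length, which is why it pairs with `vector_cosine_ops` in the index above. Whichever measure you query with should match the operator class the index was built with.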

Embed + write

The SDK calls your BYOS provider to compute the embedding, then writes the row. Tokens never leave your device.

import { atelier } from '@atelier/sdk';
 
const embedding = await atelier.embed({
  model: 'text-embedding-3-small',
  input: doc.body,
});
 
await atelier.from('docs').insert({
  title: doc.title,
  body: doc.body,
  embedding,
});

Search

const queryEmbedding = await atelier.embed({
  model: 'text-embedding-3-small',
  input: 'how do I reset my password?',
});
 
const matches = await atelier.from('docs')
  .select('id, title, body')
  .similar('embedding', queryEmbedding, {
    operator: 'cosine',
    limit: 5,
  });

For filtered search (pre-filter, then ANN), compose with regular eq/in conditions. This example assumes docs has a team_id column, which the migration above does not create:

const matches = await atelier.from('docs')
  .select('id, title')
  .eq('team_id', team.id)
  .similar('embedding', queryEmbedding, { limit: 5 });

RAG pattern

The common shape: retrieve top-N relevant docs, hand them to the model as context.

const queryEmbedding = await atelier.embed({
  model: 'text-embedding-3-small', // same model that embedded the stored docs
  input: question,
});
 
const docs = await atelier.from('docs')
  .select('title, body')
  .similar('embedding', queryEmbedding, { limit: 5 });
 
const answer = await atelier.llm.generateText({
  model: 'claude-4-7-sonnet',
  system: 'Answer using only the provided context.',
  prompt: [
    'Context:',
    ...docs.map((d) => `# ${d.title}\n${d.body}`),
    '',
    `Question: ${question}`,
  ].join('\n'),
});
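
Retrieved bodies can be long, so it may help to cap how much context goes into the prompt. The following is a minimal sketch, assuming a simple character budget as a stand-in for a real token count. `buildContext` and `RetrievedDoc` are hypothetical names, not part of the Atelier SDK.

```typescript
// Hypothetical helper, not part of the Atelier SDK.
interface RetrievedDoc {
  title: string;
  body: string;
}

// Joins retrieved docs into one context string without exceeding maxChars.
// Docs are assumed to arrive most-relevant first, so the least relevant are dropped.
function buildContext(docs: RetrievedDoc[], maxChars: number): string {
  const sections: string[] = [];
  let used = 0;
  for (const d of docs) {
    const section = `# ${d.title}\n${d.body}`;
    // +1 covers the newline that join('\n') places between sections
    const cost = section.length + (sections.length > 0 ? 1 : 0);
    if (used + cost > maxChars) break;
    sections.push(section);
    used += cost;
  }
  return sections.join('\n');
}
```

The result could replace the `docs.map(...)` lines in the prompt array. Characters are only a rough proxy; a tokenizer for your chosen model would give a tighter bound.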

Inside Functions

Inside a function, use ctx.llm.embed and ctx.db.query for server-side embedding workflows (chunking, batched ingest, background re-embedding):

// .atelier/functions/reindex-docs.queue.ts
export const queue = { concurrency: 2, retries: 3 };
 
export default async function reindex(job, ctx) {
  const { docId } = job.payload;
  const { rows: [doc] } = await ctx.db.query(
    'SELECT body FROM docs WHERE id = $1', [docId],
  );
  // The row may have been deleted between enqueue and run; nothing to re-embed.
  if (!doc) return;
  const embedding = await ctx.llm.embed({
    model: 'text-embedding-3-small',
    input: doc.body,
  });
  await ctx.db.query(
    'UPDATE docs SET embedding = $1 WHERE id = $2',
    [embedding, docId],
  );
}
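
Chunking is usually the first step of an ingest job, since long documents embed poorly as a single vector. Below is a minimal sketch of fixed-size chunking with overlap. `chunkText` is a hypothetical helper, and the character-based sizes are an assumption; splitting on sentences or tokens is often a better fit.

```typescript
// Hypothetical helper, not part of the Atelier SDK.
// Splits text into chunks of at most `size` characters, where each chunk
// repeats the last `overlap` characters of the previous one.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0) throw new Error('size must be positive');
  if (overlap < 0 || overlap >= size) throw new Error('overlap must be in [0, size)');
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this chunk reached the end of the text
  }
  return chunks;
}

console.log(chunkText('abcdefghij', 4, 1));
```

Each chunk would then be embedded and stored as its own row, typically with a reference back to the parent document.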

Indexes

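Use HNSW for most workloads: it gives a better speed-recall tradeoff at query time, at the cost of slower index builds and more memory. Use IVFFlat for very large tables where build time and memory matter more. IVFFlat builds faster, but it should be created after the table has data, and its recall depends on how lists and probes are tuned.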

-- HNSW (default recommendation)
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);
 
-- IVFFlat
CREATE INDEX ON docs USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
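
The `lists = 100` value above suits a table of roughly 100,000 rows. The pgvector documentation suggests starting points of rows / 1000 for up to 1M rows and sqrt(rows) beyond that, with query-time `ivfflat.probes` near sqrt(lists). A minimal sketch of that rule of thumb follows; `suggestIvfflatParams` is a hypothetical helper, and the rounding choices are assumptions.

```typescript
// Hypothetical helper encoding pgvector's suggested starting points.
// Treat the output as a first guess to benchmark, not a tuned setting.
function suggestIvfflatParams(rowCount: number): { lists: number; probes: number } {
  if (rowCount < 0) throw new Error('rowCount must be non-negative');
  const lists = rowCount <= 1000000
    ? Math.max(1, Math.round(rowCount / 1000)) // rows / 1000 up to 1M rows
    : Math.round(Math.sqrt(rowCount));         // sqrt(rows) above 1M rows
  const probes = Math.max(1, Math.round(Math.sqrt(lists))); // sqrt(lists)
  return { lists, probes };
}

console.log(suggestIvfflatParams(100000));
```

More probes raise recall and slow queries. HNSW has an analogous query-time setting, `hnsw.ef_search`.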

Cost note

Embeddings count against your BYOS provider quota, not ours. Storage and indexing are part of Base’s database limits. See pricing.