AgentZero: Architecture Deep Dive

Version: 0.1.0
Last Updated: May 2026
Target Audience: Onboarding engineers, maintainers

end

›System Overview
›Core Layers
›Key Flows
›Data Model
›External Integrations
›Security & Access Control
›Deployment & Environment
›Appendix: Design Patterns & Trade-offs

end

System Overview

Tech Stack

Frontend & Framework

›Next.js 16.2 (Vercel's React meta-framework with Turbopack)
›React 19 (Latest features: useActionState, async Server Components)
›TypeScript 5 (Full strict mode, no implicit any)
›Tailwind CSS 4 + Shadcn UI (component library)

Backend & Authentication

›Next.js Server Actions (direct RPC from React components, no /api layer needed)
›Auth.js v5 (Credentials provider, JWT sessions, Argon2 password hashing)
›Supabase (PostgreSQL + Auth + Vectors + Realtime)

AI/ML Capabilities

›Vercel AI SDK 6 (streamText ReAct loop, tool() definitions, native AsyncIterable streaming)
›DeepSeek Chat (primary LLM; the only model shipped in v1)
›Embedding Models (pluggable: OpenAI text-embedding-3-small, Cohere embed-english-v3.0, NVIDIA NIMs)
›pgvector (PostgreSQL vector search)

Billing & Commercial

›Lemon Squeezy (subscription management, webhooks, payment processing)
›Custom Credits System (in-database rate limiting, soft limits enforced)

end

Project Goals

Phase 1 (v0.1 — Current)

›Multi-tenant SaaS foundation
›Agentic AI loop with ReAct reasoning
›Document RAG pipeline (organization-scoped knowledge base)
›Lightweight web search integration (Tavily)
›Founding/Pro tier billing with credit metering

Future Phases

›Multi-model support (Anthropic Claude, OpenAI, Cohere as alternatives)
›Advanced tool ecosystem (email, Slack, databases, APIs)
›Prompt management and versioning
›Usage analytics and per-agent monitoring
›API for third-party integrations

end

Core Layers

1. Presentation Layer

Location: app/, components/

The presentation layer is purely React 19 Client Components + Server Components. No traditional API routes for UI concerns.

Page Structure (Next.js App Router)

app/
  page.tsx                    # Landing page
  login/page.tsx              # Login form (Server Component)
  signup/page.tsx             # Registration form (Server Component)
  dashboard/
    page.tsx                  # Main dashboard HUD
    agents/
      page.tsx                # Agent list & kanban board
      [agentId]/page.tsx      # Single agent detail
      new/page.tsx            # Agent creation form
    knowledge/page.tsx        # Document upload & management
    billing/page.tsx          # Subscription & credits view
    run/page.tsx              # Quick-launch agent executor

Component Patterns

Server Components (the App Router default; the "use server" directive marks Server Actions, not components):

›Handle auth checks via auth() from @/auth
›Fetch data directly from Supabase via adminClient
›Scope every query to the user's orgId (RLS + application-layer defense in depth)
›Return JSX or discriminated union types for error handling

Client Components (interactive features):

›Use hooks like useActionState (React 19) to bind to Server Actions
›Stream events from streamAgentAction via for await (event of stream)
›Render tool execution logs, markdown responses, real-time status
›Zero AI keys in browser code (all injected server-side via proxy layer)

Key Components

| Component | Location | Purpose |
|-----------|----------|---------|
| QuickLaunchForm | dashboard/run/_components/ | Streaming agent executor with tool logs |
| AgentFleet | dashboard/_components/ | Agent grid view |
| UploadKnowledge | components/dashboard/ | Drag-drop PDF/text upload |
| VitalsStrip | dashboard/_components/ | Credit usage + subscription status |

end

2. Server Actions Layer

Location: lib/actions/

Server Actions are the single source of truth for all data mutations and async operations. They are stateless functions that run on-demand when called from the browser or from other Server Actions.

"use server"

export async function streamAgentAction(prompt: string): Promise<AsyncIterable<StreamEvent>> {
  const session = await auth();               // Auth guard (no DB hit if session exists)
  const parsed = runAgentSchema.safeParse(...); // Zod validation
  // ... fetch data, run AI logic
  return asyncGenerator;                      // React 19 serializes AsyncIterable natively
}

Core Action Files

| File | Responsibility |
|------|-----------------|
| auth-actions.ts | Login/signup flows, password hashing (Argon2), session refresh |
| agent-actions.ts | Streaming agent execution (streamText ReAct loop), RAG context injection, credit deduction |
| agent-crud-actions.ts | List agents, create agents, search memos (knowledge base semantic search) |
| conversation-actions.ts | Create conversations, save messages, fetch history |
| billing-actions.ts | Lemon Squeezy checkout URL generation |
| credit-actions.ts | Check balance, deduct optimistically, rollback on error |
| document-actions.ts | Upload, extract (unpdf), chunk, embed, and persist documents |

Key Patterns

›Zod Input Validation: Every action validates inputs with z.object() and .safeParse(). No silent coercion. Errors are returned as user-friendly messages.
›Discriminated Unions for Errors: Actions return { ok: true; ... } | { ok: false; error: string } to make error handling explicit on the client.
›React Cache for Deduplication: Functions like getUserByEmail() are wrapped in cache() to deduplicate identical queries within a single request. Auth handshake + Server Action in the same request → exactly one DB hit.
›Async Per-Request Auth: await auth() uses Next.js 16's async boundary. The JWT contains userId and orgId baked in, so authenticated requests need zero DB lookups (trust the JWT).
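
The validation and error-union patterns above can be sketched without any dependencies. This is a minimal illustration, not the project's code: parseRunAgentInput is a hand-written stand-in for a Zod schema's safeParse(), and the RunAgentInput fields and error messages are assumptions.

```typescript
// Result shape described above: `ok` is the discriminant.
type ActionResult<T> = { ok: true; data: T } | { ok: false; error: string };

// Hypothetical input shape; the real runAgentSchema may differ.
interface RunAgentInput {
  prompt: string;
  agentId: string;
}

// Stand-in for schema.safeParse(): rejects bad input, never coerces it.
function parseRunAgentInput(raw: unknown): ActionResult<RunAgentInput> {
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, error: "Request must be an object." };
  }
  const { prompt, agentId } = raw as Record<string, unknown>;
  if (typeof prompt !== "string" || prompt.trim().length === 0) {
    return { ok: false, error: "Prompt must not be empty." };
  }
  if (typeof agentId !== "string" || agentId.length === 0) {
    return { ok: false, error: "Agent ID is required." };
  }
  return { ok: true, data: { prompt, agentId } };
}

// The caller must narrow on `ok` before it can read `data` or `error`.
function describeResult(result: ActionResult<RunAgentInput>): string {
  return result.ok ? `run ${result.data.agentId}` : `rejected: ${result.error}`;
}
```

Because the compiler refuses access to result.data until ok has been checked, a client component cannot accidentally render a failed action as a success.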

end

3. Data Layer

Location: lib/supabase/

All database access goes through Supabase client instances:

// lib/supabase/admin.ts
export const adminClient = createClient<Database>(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SECRET_KEY!  // Service role: bypasses RLS
);

Query Patterns

Server Actions use the adminClient (service-role key) to bypass RLS because:

›Auth is checked at the Server Action boundary (await auth())
›We filter by orgId / userId in every query (application-layer security)
›RLS policies are an additional defense layer, not the primary one

Example:

const { data: agent } = await adminClient
  .from("agents")
  .select("id, instructions")
  .eq("id", agentId)
  .eq("organisation_id", session.user.orgId)  // Application layer check
  .single();

Caching Strategy

›fetchRagContext() in agent-actions.ts uses 'use cache' (Next.js 16 directive)
›Cache key is derived from (query, orgId, agentId) — per-agent/per-org granularity
›60-second revalidation for RAG embeddings (cost optimization for vector lookups)
›Cached entries are shared across requests until they are revalidated; cacheLife() controls the revalidation policy
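
To make the cache-key granularity concrete, here is a small in-memory sketch of a cache keyed by (query, orgId, agentId) with a 60-second lifetime. It only models the behaviour described above; it is not the Next.js 'use cache' API, and TtlCache, RagKey, and the injectable clock are hypothetical names introduced for illustration.

```typescript
// Hypothetical key shape mirroring the (query, orgId, agentId) tuple.
interface RagKey {
  query: string;
  orgId: string;
  agentId: string | null;
}

class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // A JSON array keeps the three fields unambiguous in the key string.
  private keyOf(k: RagKey): string {
    return JSON.stringify([k.orgId, k.agentId, k.query]);
  }

  get(k: RagKey): V | undefined {
    const hit = this.entries.get(this.keyOf(k));
    if (!hit || hit.expiresAt <= this.now()) return undefined;
    return hit.value;
  }

  set(k: RagKey, value: V): void {
    this.entries.set(this.keyOf(k), { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Including orgId and agentId in the key is what keeps one tenant's retrieved chunks from being served to another tenant that happens to send the same query text.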

end

4. AI Layer

Location: lib/ai/, features/tools/

The AI layer orchestrates the agentic loop: model invocation → tool detection → tool execution → response streaming.

4.1 Model Registry & Provider Factory

lib/ai/model-registry.ts: Metadata for available models

export const MODEL_REGISTRY: ModelMetadata[] = [
  {
    id: "deepseek-chat",
    label: "DeepSeek Chat",
    provider: "deepseek",
  },
];

lib/ai/provider-factory.ts: Returns a LanguageModel instance for a given model ID

export function getModel(modelId?: string): LanguageModel {
  return deepseek(modelId ?? "deepseek-chat");
}

Design Decision: V1 ships DeepSeek only. Multi-provider support (Claude, GPT-4, etc.) will be added post-launch based on user demand. The factory pattern makes it trivial to add providers later.
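
One way the factory could grow to several providers is a lookup from registry metadata to a provider constructor. The sketch below is an assumption about a possible future shape, not the current implementation: LanguageModelStub stands in for the AI SDK's LanguageModel type, PROVIDERS is a hypothetical map, and unlike the v1 getModel() it rejects unknown model IDs.

```typescript
// Stub standing in for the AI SDK's LanguageModel type.
interface LanguageModelStub {
  provider: string;
  modelId: string;
}

type ProviderFn = (modelId: string) => LanguageModelStub;

// Hypothetical provider map; only "deepseek" exists in v1.
const PROVIDERS: Record<string, ProviderFn | undefined> = {
  deepseek: (modelId) => ({ provider: "deepseek", modelId }),
};

const MODEL_REGISTRY = [
  { id: "deepseek-chat", label: "DeepSeek Chat", provider: "deepseek" },
];

// Resolve the model's metadata first, then delegate to its provider.
function getModel(modelId: string = "deepseek-chat"): LanguageModelStub {
  const meta = MODEL_REGISTRY.find((m) => m.id === modelId);
  if (!meta) throw new Error(`Unknown model: ${modelId}`);
  const create = PROVIDERS[meta.provider];
  if (!create) throw new Error(`No provider registered for: ${meta.provider}`);
  return create(meta.id);
}
```

Under this shape, adding a provider would mean one new PROVIDERS entry plus its MODEL_REGISTRY rows, with no change to callers of getModel().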

4.2 Tool System

Tool Registry (lib/ai/tool-registry.ts)

A metadata registry for UI display:

export const TOOL_REGISTRY: ToolMetadata[] = [
  {
    id: "web_search",
    label: "Web Search",
    description: "Search the web using Tavily and return relevant results.",
    icon: Globe,
  },
];

Tool Definitions (features/tools/)

| Tool | File | Purpose |
|------|------|---------|
| webSearchTool | web-search.ts | Real-time web search via Tavily API |
| knowledgeSearchTool | knowledge-search.ts | Semantic search over org's uploaded documents (invisible RAG, not user-toggleable) |
| emailAutomateTool | email-automate.ts | Send transactional email via Resend |
| dbReadTool | db-read-write.ts | Read rows from a Supabase table the agent has access to |
| dbWriteTool | db-read-write.ts | Insert/upsert a single row into a Supabase table |

Each tool is an instance of Vercel AI SDK's tool() function:

export const webSearchTool = tool({
  description: "...",
  inputSchema: z.object({ query: z.string() }),
  outputSchema: z.object({ results: z.array(...) }),
  execute: async (input) => {
    // Fetch from Tavily, validate, return
  },
});

Key Design: knowledgeSearchTool is NOT in the UI registry. It's invisible RAG plumbing that the model can invoke mid-reasoning loop to pull more context from the org's knowledge base — active retrieval in addition to pre-stream injection.

4.3 Streaming Tool Loop

The agent loop is implemented directly with streamText() from the AI SDK, not the higher-level ToolLoopAgent class. streamText() opens a ReAct loop (generate → tool call → tool result → continue) and exposes the live event stream as result.fullStream — which we wrap in an AsyncGenerator and return from a Server Action. React 19 Flight serialises AsyncIterable natively, so the client iterates events with a plain for await.

Streaming Invocation (agent-actions.ts)

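const result = streamText({
  model: getModel(modelId),
  system: BASE_INSTRUCTIONS + agentInstructions,
  messages: [{ role: "user", content: prompt }],
  tools: { webSearchTool, knowledgeSearchTool, ... },
  stopWhen: stepCountIs(10),  // Max 10 reasoning steps per run
});

// result.fullStream carries text deltas plus tool-call / tool-result parts;
// result.textStream would drop the tool events the UI renders.
// Part field names (text, input, output) should be checked against the installed AI SDK version.
const eventStream = (async function* (): AsyncGenerator<StreamEvent> {
  try {
    for await (const part of result.fullStream) {
      if (part.type === "text-delta") {
        yield { type: "text-delta", delta: part.text };
      } else if (part.type === "tool-call") {
        yield { type: "tool-call", toolName: part.toolName, toolCallId: part.toolCallId, input: part.input };
      } else if (part.type === "tool-result") {
        yield { type: "tool-result", toolName: part.toolName, toolCallId: part.toolCallId, output: part.output };
      }
    }
    yield { type: "done" };
  } catch (err) {
    // Surface failures as an event so the client can render them
    yield { type: "error", message: err instanceof Error ? err.message : "Agent run failed" };
  }
})();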

Event Stream Format

The wire format for streaming is a discriminated union of events:

export type StreamEvent =
  | { type: "text-delta";  delta: string }                        // LLM output chunks
  | { type: "tool-call";   toolName: string; toolCallId: string; input: unknown }
  | { type: "tool-result"; toolName: string; toolCallId: string; output: unknown }
  | { type: "done" }
  | { type: "error";       message: string };

The client-side streaming hook (useAgentStream) iterates this generator in real time, rendering tool logs and response text as they arrive.
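
On the client, consuming this union usually comes down to folding each event into UI state. The reducer below is a sketch of that idea under stated assumptions: RunState, ToolLog, and applyEvent are hypothetical names, and the real hook may track state differently.

```typescript
type StreamEvent =
  | { type: "text-delta"; delta: string }
  | { type: "tool-call"; toolName: string; toolCallId: string; input: unknown }
  | { type: "tool-result"; toolName: string; toolCallId: string; output: unknown }
  | { type: "done" }
  | { type: "error"; message: string };

interface ToolLog {
  toolName: string;
  status: "running" | "finished";
  output?: unknown;
}

interface RunState {
  text: string;
  tools: Record<string, ToolLog | undefined>; // keyed by toolCallId
  status: "streaming" | "done" | "error";
  error?: string;
}

const initialState: RunState = { text: "", tools: {}, status: "streaming" };

// Pure reducer: each event produces the next state without mutating the old one.
function applyEvent(state: RunState, event: StreamEvent): RunState {
  switch (event.type) {
    case "text-delta":
      return { ...state, text: state.text + event.delta };
    case "tool-call":
      return {
        ...state,
        tools: { ...state.tools, [event.toolCallId]: { toolName: event.toolName, status: "running" } },
      };
    case "tool-result":
      return {
        ...state,
        tools: {
          ...state.tools,
          [event.toolCallId]: { toolName: event.toolName, status: "finished", output: event.output },
        },
      };
    case "done":
      return { ...state, status: "done" };
    case "error":
      return { ...state, status: "error", error: event.message };
  }
}
```

Inside a for await loop, the hook would call applyEvent for every event and store the result in React state; matching tool-result to tool-call by toolCallId is what lets a spinner turn into an output panel.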

4.4 Embeddings & RAG Pipeline

lib/ai/embeddings.ts: Provider-agnostic embedding generation and semantic search

Supported Providers

| Provider | Model | Dimensions | Environment Variables |
|----------|-------|-----------|----------------------|
| OpenAI | text-embedding-3-small | 1536 | EMBEDDING_PROVIDER=openai |
| Cohere | embed-english-v3.0 | 1024 | EMBEDDING_PROVIDER=cohere + COHERE_API_KEY |
| NVIDIA NIMs | nv-embed-qa-mistral-7b-v3 | 1024 | EMBEDDING_PROVIDER=nvidia + NVIDIA_NIMS_BASE_URL + NVIDIA_NIMS_API_KEY |

⚠️ Critical: Switching providers on an existing database requires:

›Re-embedding all document chunks with the new provider
›Running the Supabase migration to alter the vector column dimension
›No mixing of dimensions in the same table
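
A cheap guard can catch a provider/column mismatch before any row is written. The following is a sketch only: assertEmbeddingFits is a hypothetical helper, and the dimensions are copied from the provider table above.

```typescript
// Dimensions as listed in the provider table; keys mirror EMBEDDING_PROVIDER values.
const EMBEDDING_DIMENSIONS = { openai: 1536, cohere: 1024, nvidia: 1024 } as const;
type EmbeddingProvider = keyof typeof EMBEDDING_DIMENSIONS;

// Hypothetical guard: throws before a write the vector column would reject.
function assertEmbeddingFits(
  provider: EmbeddingProvider,
  columnDimension: number,
  embedding: readonly number[],
): void {
  const expected = EMBEDDING_DIMENSIONS[provider];
  if (expected !== columnDimension) {
    throw new Error(
      `Provider ${provider} emits ${expected}-dim vectors but the column is vector(${columnDimension}); re-embed and migrate first.`,
    );
  }
  if (embedding.length !== expected) {
    throw new Error(`Expected ${expected} values, received ${embedding.length}.`);
  }
}
```

Calling such a guard at startup, or before storeChunks(), would turn a silent configuration drift into an immediate, descriptive failure.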

Functions

›generateEmbedding(text) — text → vector
›storeChunks(document_id, chunks, embeddings) — vectors → document_chunks table
›semanticSearch(query_embedding, orgId, client) — ranked chunks via match_chunks() RPC
›semanticSearchForAgent(query_embedding, agentId, client) — agent-scoped variant via match_agent_chunks() RPC

RAG Context Injection (fetchRagContext in agent-actions.ts)

Before the streaming loop starts:

›Embed the user's prompt
›Query document_chunks for semantic matches (similarity > threshold)
›Prepend the top chunks as context to the system prompt
›Results are cached for 60 seconds (cost optimization)
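
The injection steps above can be sketched as a pure function that filters, ranks, and prepends chunks. The threshold and topK defaults, the RetrievedChunk shape, and the context header text are all assumptions made for illustration; the source does not state the real values.

```typescript
interface RetrievedChunk {
  content: string;
  similarity: number; // higher means closer to the prompt embedding
}

function buildSystemPrompt(
  baseInstructions: string,
  agentInstructions: string,
  chunks: RetrievedChunk[],
  threshold: number = 0.75, // hypothetical cut-off
  topK: number = 3, // hypothetical number of chunks to keep
): string {
  const context = chunks
    .filter((c) => c.similarity > threshold) // keep matches above the threshold
    .sort((a, b) => b.similarity - a.similarity) // best match first
    .slice(0, topK)
    .map((c, i) => `[${i + 1}] ${c.content}`);

  const instructions = baseInstructions + agentInstructions;
  if (context.length === 0) return instructions; // nothing relevant: no context block
  return `Context from the knowledge base:\n${context.join("\n")}\n\n${instructions}`;
}
```

Returning the bare instructions when no chunk clears the threshold avoids padding the prompt with weak matches that the model might treat as authoritative.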

end

Key Flows

1. User Authentication Flow

┌─────────────────────────────────────────────────┐
│ User submits email + password on /login         │
└──────────────┬──────────────────────────────────┘
               │
               ▼
        ┌──────────────────┐
        │ loginAction()    │         (Server Action)
        │ (auth-actions)   │
        └─────────┬────────┘
                  │
                  ├─→ Zod validation (email, password)
                  ├─→ headers() captures request context
                  ├─→ getUserByEmail() [cached]
                  │   └─→ Supabase adminClient.from("users").select(...)
                  │
                  ├─→ signIn("credentials") [Auth.js]
                  │   └─→ authorize() callback
                  │       ├─→ getUserByEmail() [already cached]
                  │       ├─→ argon2.verify(user.password_hash, plaintext)
                  │       └─→ Return { id, email, orgId } on success
                  │
                  ├─→ jwt callback (Auth.js)
                  │   ├─→ token.id = user.id
                  │   └─→ token.orgId = user.orgId
                  │
                  └─→ Redirect to /dashboard on success

          React cache deduplication
          ════════════════════════════════════════
          loginAction() + authorize() both call
          getUserByEmail(email). React's cache()
          ensures exactly ONE Supabase query.

Session Persistence

›JWT Strategy: Token contains id + orgId, no session DB
›Every Request: await auth() decodes JWT in ~1ms (no DB)
›Session Updates: auth.update({ orgId: newOrgId }) triggers jwt callback with trigger: "update"
›No Silent Token Refresh: Expired JWTs return null; client must re-login
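
The jwt callback behaviour described above (copy id and orgId on sign-in, overwrite orgId on an update trigger) can be modelled as a plain function. This is a simplified sketch: AppToken and JwtArgs are reduced, hypothetical versions of the Auth.js callback types, and a real callback would also verify that the user actually belongs to the new org before trusting the update payload.

```typescript
interface AppToken {
  id?: string;
  orgId?: string;
}

// Reduced, hypothetical version of the arguments Auth.js passes to the jwt callback.
interface JwtArgs {
  token: AppToken;
  user?: { id: string; orgId: string }; // present only on sign-in
  trigger?: "signIn" | "signUp" | "update";
  session?: { orgId?: string }; // payload of a session update call
}

function jwtCallback({ token, user, trigger, session }: JwtArgs): AppToken {
  // Sign-in: bake the identifiers into the token once.
  if (user) return { ...token, id: user.id, orgId: user.orgId };
  // Explicit update: replace orgId, keep everything else.
  if (trigger === "update" && session?.orgId) return { ...token, orgId: session.orgId };
  // Every other request: the token passes through unchanged (no DB lookup).
  return token;
}
```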

end

2. Agent Creation & Configuration

┌─────────────────────────────────────────────────┐
│ User clicks "New Agent" → /dashboard/agents/new │
└──────────────┬──────────────────────────────────┘
               │
               ▼
        ┌──────────────────────────┐
        │ createAgent()            │    (Server Action)
        │ (agent-crud-actions.ts)  │
        └─────────┬────────────────┘
                  │
                  ├─→ await auth() → session.user.orgId
                  ├─→ Generate auto-name if not provided: "Agent N"
                  ├─→ INSERT into agents table
                  │   {
                  │     name, instructions, organisation_id
                  │   }
                  │
                  └─→ Return { success: true; id } | { success: false; error }
                      (discriminated union)

Agent Scoping Rules
═══════════════════════════════════════════════════════
Every agent query filters by organisation_id:

  SELECT * FROM agents
  WHERE id = agentId AND organisation_id = session.user.orgId

This prevents cross-org data leakage even if an attacker
forges an agentId in the client. RLS + app-layer defense.

Agent Instructions

When creating an agent, users can provide custom system prompt instructions. These are stored in the agents.instructions column and merged into the system prompt at runtime:

Base instructions + Agent instructions → LLM system prompt

end

3. Conversation Lifecycle

┌──────────────────────────────────────────────────────┐
│ User clicks "Start Conversation" on agent detail     │
└────────────┬─────────────────────────────────────────┘
             │
             ▼
      ┌──────────────────────────┐
      │ createConversation()     │      (Server Action)
      │ (conversation-actions)   │
      └────────┬─────────────────┘
               │
               ├─→ Verify agent belongs to org
               ├─→ INSERT into conversations
               │   {
               │     agent_id, organisation_id, title
               │   }
               │
               └─→ Return { ok: true; conversationId }

                              │
                              ▼

          ┌──────────────────────────────┐
          │ User types prompt + submits   │
          │ Calls streamAgentAction()     │      (Server Action)
          │ (agent-actions.ts)            │
          └────────┬─────────────────────┘
                   │
                   ├─→ await auth() + orgId check
                   ├─→ Fetch agent.instructions
                   ├─→ Check credits balance (deductCredits)
                   │   └─→ If insufficient: return error stream
                   │
                   ├─→ Fetch RAG context (fetchRagContext)
                   │   ├─→ Generate embedding for prompt
                   │   ├─→ semanticSearch (org-scoped or agent-scoped)
                   │   └─→ Prepend chunks to system prompt
                   │
                   ├─→ streamText({
                   │     model, system, messages, tools,
                   │     stopWhen: stepCountIs(10)
                   │   })
                   │   │
                   │   └─→ streamText opens the ReAct loop:
                   │       ┌──────────────────────────┐
                   │       │ REACT LOOP               │
                   │       ├──────────────────────────┤
                   │       │ 1. Model generates text  │
                   │       │ 2. Check for tool calls  │
                   │       │ 3. Execute tool          │
                   │       │ 4. Feed result back      │
                   │       │ Repeat until done        │
                   │       └──────────────────────────┘
                   │
                   ├─→ Wrap response in AsyncGenerator<StreamEvent>
                   │   React 19 serializes natively
                   │
                   └─→ Client-side useAgentStream hook
                       ├─→ for await (event of stream)
                       ├─→ Render text-delta → <Markdown />
                       ├─→ Render tool-call → show spinner
                       ├─→ Render tool-result → show output
                       └─→ On "done", saveMessage() twice
                           (once user msg, once assistant msg)

          Credit Lifecycle
          ═════════════════════════════════════════════
          1. PRE-STREAM:     deductCredits() (optimistic)
          2. STREAM:         streamText runs the ReAct loop
          3. ERROR/TIMEOUT:  rollbackCredits() (within catch)
          4. SUCCESS:        No rollback, credits stay deducted

end

4. Document Upload & Knowledge Base Indexing

┌──────────────────────────────────────────┐
│ User uploads PDF → /dashboard/knowledge  │
└────────┬─────────────────────────────────┘
         │
         ▼
  ┌────────────────────────┐
  │ UploadKnowledge        │    (Client Component)
  │ (drag-drop, multipart) │
  └────────┬───────────────┘
           │
           ├─→ formData.append("file", file)
           ├─→ formData.append("agentId", agentId)
           │
           ▼
  ┌───────────────────────────────┐
  │ uploadDocument()              │    (Server Action)
  │ (document-actions.ts)         │
  └────────┬──────────────────────┘
           │
           ├─→ await auth() + orgId
           ├─→ Validate file (PDF/TXT, max 10 MB)
           ├─→ INSERT into documents
           │   {
           │     organisation_id, agent_id, filename, content, status
           │   }
           │
           └─→ Trigger embedding pipeline

              (Supabase Functions trigger or direct call)
              ════════════════════════════════════════
              supabase/functions/embed-memo/index.ts

              1. Fetch document content
              2. Split into chunks (500 token window, 50 overlap)
              3. For each chunk:
                 ├─→ generateEmbedding(chunk)
                 └─→ INSERT into document_chunks
                     {
                       document_id, content, embedding, created_at
                     }
              4. Update documents.status = 'ready' (once, after all chunks are stored)
              5. Generate memo_summaries row (agent-visible metadata)
                 └─→ For knowledgeSearchTool queries
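
The chunking step (500-token window, 50-token overlap) is a sliding window whose stride is window minus overlap. The sketch below assumes whitespace-separated words as a rough proxy for model tokens; chunkTokens and tokenize are hypothetical names, and the real pipeline may use a proper tokenizer.

```typescript
// Rough proxy for tokens: split on whitespace. A real tokenizer would differ.
function tokenize(text: string): string[] {
  return text.split(/\s+/).filter((t) => t.length > 0);
}

// Sliding window: each chunk starts (windowSize - overlap) tokens after the previous one.
function chunkTokens(tokens: string[], windowSize: number = 500, overlap: number = 50): string[][] {
  if (windowSize <= 0 || overlap < 0 || overlap >= windowSize) {
    throw new Error("overlap must be non-negative and smaller than windowSize");
  }
  const step = windowSize - overlap;
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + windowSize));
    // Stop once a window has reached the end, to avoid a trailing all-overlap chunk.
    if (start + windowSize >= tokens.length) break;
  }
  return chunks;
}
```

With the defaults, a 1,000-token document yields windows starting at tokens 0, 450, and 900, so a sentence that straddles a boundary appears whole in at least one chunk.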

end

5. Billing & Credits Flow

Subscription Purchase
═════════════════════════════════════════════════════
User clicks "Upgrade to Pro" → createCheckoutForTier("PRO")
  ├─→ Validate auth + orgId
  ├─→ createLemonSqueezyCheckout()
  │   └─→ API call to LS: POST /checkouts
  │       {
  │         data: {
  │           relationships: { variant: { data: { id: PRO_VARIANT_ID } } },
  │           attributes: {
  │             checkout_data: {
  │               custom_data: { user_id: session.user.id }
  │             }
  │           }
  │         }
  │       }
  │
  └─→ Return { ok: true; url } → redirect browser to LS checkout

User completes payment → LS sends webhook to /api/webhooks/lemonsqueezy
  │
  ├─→ Signature verification (HMAC-SHA256)
  ├─→ Extract meta.event_name + custom_data.user_id
  ├─→ claimWebhookEvent() → idempotency guard (webhook_events.body_hash)
  │
  ├─→ dispatch(event_name, payload)
  │
  ├─→ For subscription_payment_success:
  │   ├─→ upsertSubscription() → INSERT/UPDATE subscriptions table
  │   ├─→ grantProCreditsForInvoice()
  │   │   └─→ adminClient.rpc("grant_user_credits", { p_user_id, p_amount })
  │   │       (stored procedure in Postgres)
  │   │
  │   └─→ user_credits.credits_remaining += amount
  │
  └─→ Return 200 OK

Credit Metering
═════════════════════════════════════════════════════
Before streaming agent:
  1. checkUserCredits(userId)
     └─→ SELECT credits_remaining FROM user_credits WHERE user_id = userId
  2. If credits_remaining <= 0 → return 402 Payment Required
  3. deductCredits(userId, 1) 
     └─→ RPC: UPDATE user_credits SET credits_remaining -= 1, credits_used += 1

If agent run fails:
  4. rollbackCredits(userId, 1)
     └─→ RPC: UPDATE user_credits SET credits_remaining += 1, credits_used -= 1

If agent run succeeds:
  4. Credits stay deducted (no rollback)

Founding Tier (One-Time Purchase)
═════════════════════════════════════════════════════
Similar to Pro, but:
- order_created webhook triggers handleFoundingOrder()
- grant_founding_credits() atomically inserts founding_grants row + grants credits
- order_refunded webhook triggers revoke_founding_credits()
  └─→ If credits_remaining < refund_amount: flag needs_manual_review
      (ops team reconciles manually)

end

Data Model

Schema Overview

The database schema implements a multi-tenant, permission-based hierarchy:

organisations
  ├── users (1:many)
  ├── agents (1:many)
  │   ├── conversations (1:many)
  │   │   └── conversation_messages (1:many)
  │   ├── documents (1:many)          [agent_id nullable for org-level docs]
  │   │   └── document_chunks (1:many) [pgvector embeddings]
  │   └── memo_summaries (1:many)
  │
  ├── subscriptions (1:many) [user_id → users]
  ├── founding_grants (1:many) [user_id → users]
  ├── user_credits (1:1) [user_id → users]
  │
  └── webhook_events (audit log for idempotency)

Key Tables

organisations

Root of the multi-tenant tree. Every other entity belongs to exactly one org.

| Column | Type | Notes |
|--------|------|-------|
| id | UUID | Primary key |
| name | TEXT | Display name (e.g., "Acme Inc.") |
| slug | TEXT | URL-safe identifier (unique) |
| created_at | TIMESTAMPTZ | Audit |

users

Authentication identity + org membership.

| Column | Type | Notes |
|--------|------|-------|
| id | UUID | Primary key |
| email | TEXT | Unique login identifier |
| name | TEXT | Display name (optional) |
| organisation_id | UUID | FK → organisations.id |
| password_hash | TEXT | Argon2 hash (never plaintext) |
| created_at | TIMESTAMPTZ | Audit |

agents

AI agent definitions, scoped to an org (not a user, enabling team sharing).

| Column | Type | Notes |
|--------|------|-------|
| id | UUID | Primary key |
| name | TEXT | Display name |
| instructions | TEXT | Custom system prompt (optional) |
| organisation_id | UUID | FK → organisations.id |
| created_at | TIMESTAMPTZ | Audit |

conversations & conversation_messages

Multi-turn conversation threads, scoped to agents.

conversations

| Column | Type | Notes |
|--------|------|-------|
| id | UUID | Primary key |
| agent_id | UUID | FK → agents.id |
| organisation_id | UUID | FK → organisations.id |
| title | TEXT | Conversation name |
| created_at | TIMESTAMPTZ | Audit |
| updated_at | TIMESTAMPTZ | Last activity |

conversation_messages

| Column | Type | Notes |
|--------|------|-------|
| id | UUID | Primary key |
| conversation_id | UUID | FK → conversations.id |
| role | TEXT | 'user' \| 'assistant' |
| content | TEXT | Message text |
| created_at | TIMESTAMPTZ | Audit |

documents & document_chunks

RAG knowledge base: documents are split into chunks with embeddings.

documents

| Column | Type | Notes |
|--------|------|-------|
| id | UUID | Primary key |
| organisation_id | UUID | FK → organisations.id |
| agent_id | UUID | FK → agents.id (nullable for org-level docs) |
| filename | TEXT | Original filename |
| content | TEXT | Full extracted text |
| status | TEXT | 'pending' \| 'ready' \| 'failed' |
| created_at | TIMESTAMPTZ | Audit |

document_chunks

| Column | Type | Notes |
|--------|------|-------|
| id | UUID | Primary key |
| document_id | UUID | FK → documents.id |
| content | TEXT | Chunk text (500 token window) |
| embedding | vector(1536) | Output dimension depends on EMBEDDING_PROVIDER |
| created_at | TIMESTAMPTZ | Audit |

Index: idx_document_chunks_embedding (pgvector HNSW or IVFFlat for fast similarity search)

memo_summaries (Agent-Visible Knowledge Metadata)

Lightweight table for agent reasoning: stores summaries + metadata for knowledge search results.

| Column | Type | Notes |
|--------|------|-------|
| id | UUID | Primary key |
| agent_id | UUID | FK → agents.id |
| document_id | UUID | FK → documents.id |
| title | TEXT | Generated title |
| summary | TEXT | Condensed content |
| tags | TEXT[] | Categorization |
| embedding | vector(1536) | Embedding of summary |
| created_at | TIMESTAMPTZ | Audit |

user_credits

Credit metering: tracks remaining balance and cumulative usage per user.

| Column | Type | Notes |
|--------|------|-------|
| user_id | UUID | PK/FK → users.id |
| credits_remaining | INT | Available balance |
| credits_used | INT | Cumulative usage (audit) |
| updated_at | TIMESTAMPTZ | Audit |

subscriptions

Lemon Squeezy subscription tracking. May have multiple rows per user (lifecycle changes).

| Column | Type | Notes |
|--------|------|-------|
| id | UUID | Primary key |
| user_id | UUID | FK → users.id |
| ls_subscription_id | TEXT | LS subscription ID (unique) |
| ls_customer_id | BIGINT | LS customer ID |
| ls_variant_id | BIGINT | LS variant ID |
| tier | TEXT | 'pro' \| 'founding' |
| status | TEXT | 'on_trial' \| 'active' \| 'paused' \| 'cancelled' \| 'expired' \| ... |
| renews_at | TIMESTAMPTZ | Next renewal date |
| needs_manual_review | BOOLEAN | Refund issue flagged for ops |
| review_reason | TEXT | Why manual review needed |
| created_at | TIMESTAMPTZ | Audit |
| updated_at | TIMESTAMPTZ | Audit |

founding_grants

One-time founding purchase audit trail. Separate from subscriptions.

| Column | Type | Notes |
|--------|------|-------|
| id | UUID | Primary key |
| user_id | UUID | FK → users.id |
| ls_order_id | TEXT | LS order ID (unique) |
| amount | INT | Credits granted |
| granted_at | TIMESTAMPTZ | Audit |
| revoked_at | TIMESTAMPTZ | Refund timestamp (null if active) |
| needs_manual_review | BOOLEAN | Partial-refund reconciliation needed |
| review_reason | TEXT | Why manual review needed |

webhook_events

Idempotency log: prevents processing the same LS webhook twice.

| Column | Type | Notes |
|--------|------|-------|
| id | BIGSERIAL | Primary key |
| body_hash | TEXT | SHA-256 of raw request body (unique) |
| event_name | TEXT | e.g., 'subscription_payment_success' |
| ls_resource_id | TEXT | LS subscription/order ID |
| raw_payload | JSONB | Full webhook payload |
| received_at | TIMESTAMPTZ | Arrival time |
| processed_at | TIMESTAMPTZ | Completion time |
| error | TEXT | Error message if processing failed |

Row Level Security (RLS) Policies

All tables have RLS enabled. The adminClient (service-role key) bypasses RLS automatically.

Application-Layer Defense

Since all queries use the service-role client, RLS is a secondary defense layer. The primary protection is application-level:

// Every query filters by orgId
const { data: agent } = await adminClient
  .from("agents")
  .select("*")
  .eq("organisation_id", session.user.orgId)  // Application check
  .eq("id", agentId);

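This prevents cross-org leakage even if an attacker forges an agentId or calls a Server Action with another org's identifiers: the orgId always comes from the verified session, never from client input.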

end

External Integrations

Supabase

Role: PostgreSQL database, auth (optional—we use Auth.js), vector search, Realtime (future).

Opaque Key Pattern (2026)

Modern Supabase projects use opaque keys instead of old ANON_KEY / SERVICE_ROLE_KEY:

| Key Type | Format | Usage |
|----------|--------|-------|
| Publishable | sb_publishable_<redacted> | Client-side Supabase JS SDK (not used here; we use Server Actions) |
| Secret | sb_secret_<redacted> | Server-side adminClient (bypasses RLS) |

Real key values belong in environment variables only and must never be committed to documentation or source control.

Setup

›Create a Supabase project at https://supabase.com/dashboard
›Run migrations (stored in supabase/migrations/):
```
npx supabase db push
```
›Set env vars: NEXT_PUBLIC_SUPABASE_URL, SUPABASE_SECRET_KEY

Migrations

| Migration | Purpose |
|-----------|---------|
| 20260325000000_multi_tenant_schema.sql | Base: organisations, users, agents |
| 20260411000000_create_documents.sql | documents + document_chunks tables (RAG baseline) |
| 20260412000000_agent_scoped_documents.sql | Adds agent_id to documents + match_agent_chunks() RPC |
| 20260418000000_lemonsqueezy_billing.sql | Subscriptions + webhook_events + grant_user_credits() RPC |
| 20260418000001_founding_grants.sql | founding_grants + grant_founding_credits() + revoke_founding_credits() RPCs |
| 20260425000000_agent_conversations.sql | Conversations + message history |
| 20260425000001_memo_summaries.sql | Memory store + match_memo_summaries() RPC |
| 20260516000000_waitlist.sql | Waitlist signups |

end

Lemon Squeezy (Billing)

Role: Payment processing, subscription management, webhook notifications.

Setup

›Create account at https://lemonsqueezy.com
›Create a Store + Product + Variants (Pro tier, Founding tier)

›Set env vars:

LEMONSQUEEZY_API_KEY=<api_key>
LEMONSQUEEZY_WEBHOOK_SECRET=<webhook_signing_secret>
LEMONSQUEEZY_STORE_ID=<numeric_id>
LEMONSQUEEZY_PRO_VARIANT_ID=<numeric_id>
LEMONSQUEEZY_FOUNDING_VARIANT_ID=<numeric_id>

Webhook Integration

Lemon Squeezy sends webhooks to /api/webhooks/lemonsqueezy when:

›User purchases a subscription (subscription_payment_success)
›User purchases founding tier (order_created)
›Refund is processed (subscription_payment_refunded, order_refunded)
›Subscription status changes (subscription_updated, subscription_cancelled, etc.)

Webhook Handler (app/api/webhooks/lemonsqueezy/route.ts)

// Signature verification (HMAC-SHA256)
// ↓
// Idempotency guard (webhook_events.body_hash)
// ↓
// dispatch(event_name, payload)
// ├─→ upsertSubscription() [subscription events]
// ├─→ grantProCreditsForInvoice() [payment_success]
// ├─→ revokeProCreditsForInvoice() [payment_refunded]
// ├─→ handleFoundingOrder() [order_created]
// └─→ handleFoundingRefund() [order_refunded]
// ↓
// markWebhookProcessed()
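
The first two stages of the pipeline above (signature check, then idempotency claim) can be sketched with Node's built-in crypto module. This is an illustration under assumptions: the Set stands in for the unique webhook_events.body_hash column, handleWebhook and its 401 status are hypothetical, and the signature is taken to be the hex HMAC-SHA256 of the raw body, as Lemon Squeezy sends in its X-Signature header.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

// Compare the received hex signature to our own HMAC of the raw body.
function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so check length first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// In-memory stand-in for the unique webhook_events.body_hash column.
const seenBodyHashes = new Set<string>();

function claimWebhookEvent(rawBody: string): boolean {
  const hash = createHash("sha256").update(rawBody).digest("hex");
  if (seenBodyHashes.has(hash)) return false; // already processed
  seenBodyHashes.add(hash);
  return true;
}

type WebhookOutcome = { status: 401 } | { status: 200; duplicate: boolean };

function handleWebhook(
  rawBody: string,
  signatureHex: string,
  secret: string,
  dispatch: (eventName: string, payload: unknown) => void,
): WebhookOutcome {
  if (!verifySignature(rawBody, signatureHex, secret)) return { status: 401 };
  if (!claimWebhookEvent(rawBody)) return { status: 200, duplicate: true };
  const payload = JSON.parse(rawBody) as { meta?: { event_name?: string } };
  dispatch(payload.meta?.event_name ?? "unknown", payload);
  return { status: 200, duplicate: false };
}
```

Verifying against the raw body, before any JSON parsing, matters because re-serialising the payload can change byte order or whitespace and invalidate the signature. Answering a duplicate with 200 tells the sender to stop retrying without running the handlers twice.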

Error Handling

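›If the handler throws, the webhook is marked failed and the non-2xx response causes LS to retry delivery; retries are limited (a few attempts with backoff), so persistent failures must be replayed manually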
›needs_manual_review flag is set if refund cannot be applied cleanly (e.g., user already spent the credits)
›Ops dashboard monitors subscriptions.needs_manual_review = TRUE

end

DeepSeek (AI Model)

Role: Primary LLM for the agentic loop.

Setup

›Get API key from https://platform.deepseek.com
›Set env var: DEEPSEEK_API_KEY (read by @ai-sdk/deepseek)

Model ID: deepseek-chat

Design Decision: V1 ships with DeepSeek only. Claude, GPT-4, and others will be added post-launch.

end

Tavily (Web Search)

Role: Real-time web search tool for the agent.

Setup

›Create account at https://tavily.com
›Set env var: TAVILY_API_KEY

Tool: webSearchTool in features/tools/web-search.ts

Usage: Called by agent when user prompt needs current information not in the knowledge base.

end

Embedding Providers

Pluggable embedding model selection via environment variable.

OpenAI (Default)

$  snippetread-only
EMBEDDING_PROVIDER=openai
# Uses text-embedding-3-small (1536 dimensions)
# Requires OPENAI_API_KEY; no other provider-specific env vars are needed

Cohere (Free tier)

$  snippetread-only
EMBEDDING_PROVIDER=cohere
COHERE_API_KEY=<api_key>
# Uses embed-english-v3.0 (1024 dimensions)

NVIDIA NIMs (Enterprise)

$  snippetread-only
EMBEDDING_PROVIDER=nvidia
NVIDIA_NIMS_BASE_URL=https://integrate.api.nvidia.com/v1
NVIDIA_NIMS_API_KEY=<api_key>
# Uses nv-embed-qa-mistral-7b-v3 (1024 dimensions)

⚠️ Dimension Mismatch Warning: Changing providers on an existing database requires re-embedding + migration.
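A minimal sketch of a guard against that mismatch, assuming the dimensions quoted in this section. `EXPECTED_DIMENSIONS` and `assertDimension` are hypothetical names for illustration and are not taken from the codebase.

```typescript
// Hypothetical helper: the dimensions are the ones quoted in this section.
type EmbeddingProvider = "openai" | "cohere" | "nvidia";

const EXPECTED_DIMENSIONS: Record<EmbeddingProvider, number> = {
  openai: 1536, // text-embedding-3-small
  cohere: 1024, // embed-english-v3.0
  nvidia: 1024, // NVIDIA NIMs model named above
};

// Throws before a vector of the wrong length reaches the vector column.
function assertDimension(
  provider: EmbeddingProvider,
  columnDimension: number,
  embedding: number[],
): void {
  const expected = EXPECTED_DIMENSIONS[provider];
  if (expected !== columnDimension) {
    throw new Error(
      `Provider ${provider} emits ${expected}-d vectors but the column is ${columnDimension}-d; re-embed and migrate first.`,
    );
  }
  if (embedding.length !== columnDimension) {
    throw new Error(`Expected ${columnDimension} values, got ${embedding.length}.`);
  }
}
```

Calling a check like this at startup, or just before storing chunks, would surface a provider switch as an explicit error rather than a failed insert.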

end

Security & Access Control

Authentication Strategy

Auth.js v5 with Credentials Provider

›Email + password login only (v1)
›Passwords are Argon2-hashed before storage
›JWT sessions, no session database (stateless)
›userId + orgId baked into JWT on sign-in
›Subsequent requests trust the JWT (zero DB lookups on auth)

Key Pattern: await auth() returns session instantly (JWT decode, no I/O).

end

Authorization: Org-Based Multi-Tenancy

Rule: Every user belongs to exactly one org. Every agent/document/conversation belongs to exactly one org.

Enforcement

›Auth Boundary: await auth() provides session.user.orgId
›Query Filtering: Every Supabase query includes .eq("organisation_id", orgId)
›RLS Secondary Layer: Policies lock down tables for any non-service-role access; adminClient bypasses RLS, so the app-level checks are the primary control

Example: Agent Access

$  snippetread-only
const session = await auth();
if (!session?.user?.orgId) return error("Unauthorised");  // no session, no access

const { data: agent } = await adminClient
  .from("agents")
  .select("*")
  .eq("organisation_id", session.user.orgId)  // ← App-level filter
  .eq("id", agentId)
  .single();

if (!agent) return error("Unauthorised");

Even if an attacker forges agentId, they cannot access agents from other orgs, because the organisation_id filter comes from the signed session rather than from client input.

end

API Key Management

Server-Side Only

All sensitive API keys live in environment variables and are never sent to the browser:

›DEEPSEEK_API_KEY (LLM)
›TAVILY_API_KEY (web search)
›COHERE_API_KEY (embeddings)
›SUPABASE_SECRET_KEY (database)
›LEMONSQUEEZY_API_KEY (billing)

The proxy layer (proxy.ts) intercepts server-to-external-API calls and injects keys. The browser never sees them.

end

Webhook Signature Verification

Lemon Squeezy Webhooks

Every webhook is signed with HMAC-SHA256:

$  snippetread-only
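import { createHmac, timingSafeEqual } from "node:crypto";

function verifySignature(rawBody: string, header: string | null): boolean {
  if (!header) return false;  // missing signature header
  const expected = createHmac("sha256", lsEnv.LEMONSQUEEZY_WEBHOOK_SECRET)
    .update(rawBody)
    .digest();
  const received = Buffer.from(header, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}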

Process

›Verify signature first (before any JSON parsing)
›Parse JSON
›Check idempotency (webhook_events.body_hash)
›Process event
›Mark as processed

Design: body_hash is SHA-256 of raw request body. Retries send identical bytes → same hash → rejected as duplicate.
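The deduplication rule can be sketched as follows. This is an illustration under stated assumptions: an in-memory `Set` stands in for the `webhook_events` table, and the `claimWebhookEvent` body here is a guess at the behaviour, not the project's implementation (a real version would more likely rely on a unique constraint on `body_hash`).

```typescript
import { createHash } from "node:crypto";

// In-memory stand-in for the webhook_events table (assumption for illustration).
const seenBodyHashes = new Set<string>();

// SHA-256 of the raw request body, hex-encoded.
function bodyHash(rawBody: string): string {
  return createHash("sha256").update(rawBody).digest("hex");
}

// Returns true the first time a body is seen and false for a redelivery.
function claimWebhookEvent(rawBody: string): boolean {
  const hash = bodyHash(rawBody);
  if (seenBodyHashes.has(hash)) return false;
  seenBodyHashes.add(hash);
  return true;
}
```

Because the hash is taken over the raw bytes, it has to be computed before the body is parsed and re-serialized; any whitespace change would produce a different key.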

end

Input Validation

Zod 4 No-Coercion Strategy

Every Server Action validates inputs with Zod .safeParse():

$  snippetread-only
const parsed = runAgentSchema.safeParse({ prompt, modelId, ... });
if (!parsed.success) {
  return errorStream(`Invalid input: ${parsed.error.message}`);
}

No Silent Coercion: Invalid types are rejected (safeParse returns success: false) rather than being converted, and the user gets explicit feedback.

Strict Mode: Tool input schemas use .strict() to reject unknown fields.

$  snippetread-only
const inputSchema = z.object({
  query: z.string().min(1),
}).strict();  // ← rejects { query: "...", extra: "field" }

end

Deployment & Environment

Required Environment Variables

Supabase

$  snippetread-only
NEXT_PUBLIC_SUPABASE_URL=https://...supabase.co
NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY=sb_publishable_...  # Not used (Server Actions only)
SUPABASE_SECRET_KEY=sb_secret_...                        # Service-role client

Auth.js

$  snippetread-only
AUTH_SECRET=<generated_via_npx_auth_secret>

App Configuration

$  snippetread-only
NEXT_PUBLIC_APP_URL=https://boileragent.dev  # For Lemon Squeezy redirects
NEXT_PUBLIC_SITE_NAME=AgentZero

Model & Embedding

$  snippetread-only
DEEPSEEK_API_KEY=<api_key>
EMBEDDING_PROVIDER=openai  # or cohere, nvidia
COHERE_API_KEY=<api_key>    # If using Cohere
NVIDIA_NIMS_BASE_URL=...   # If using NVIDIA NIMs
NVIDIA_NIMS_API_KEY=...

Tools

$  snippetread-only
TAVILY_API_KEY=<api_key>
RESEND_API_KEY=<api_key>    # For email (future)

Billing

$  snippetread-only
LEMONSQUEEZY_API_KEY=<api_key>
LEMONSQUEEZY_WEBHOOK_SECRET=<webhook_secret>
LEMONSQUEEZY_STORE_ID=<numeric_id>
LEMONSQUEEZY_PRO_VARIANT_ID=<numeric_id>
LEMONSQUEEZY_FOUNDING_VARIANT_ID=<numeric_id>

Optional

$  snippetread-only
RAG_MATCH_THRESHOLD=0.1     # Similarity threshold for semantic search (default 0.1)
NVIDIA_API_KEY=...          # NVIDIA API for title generation (future)
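
A startup check for the variables listed above might look like the sketch below. `missingEnv`, `ALWAYS_REQUIRED`, and `PROVIDER_KEYS` are hypothetical names, and the required list is an illustrative subset rather than the project's actual validation.

```typescript
// Hypothetical startup check; variable names are the ones listed above.
type Env = Record<string, string | undefined>;

// Illustrative subset of the variables that every deployment needs.
const ALWAYS_REQUIRED = [
  "NEXT_PUBLIC_SUPABASE_URL",
  "SUPABASE_SECRET_KEY",
  "AUTH_SECRET",
  "DEEPSEEK_API_KEY",
  "TAVILY_API_KEY",
  "LEMONSQUEEZY_API_KEY",
  "LEMONSQUEEZY_WEBHOOK_SECRET",
];

// Extra keys that depend on EMBEDDING_PROVIDER.
const PROVIDER_KEYS: Record<string, string[]> = {
  openai: ["OPENAI_API_KEY"],
  cohere: ["COHERE_API_KEY"],
  nvidia: ["NVIDIA_NIMS_BASE_URL", "NVIDIA_NIMS_API_KEY"],
};

// Returns the names of required variables that are unset or empty.
function missingEnv(env: Env): string[] {
  const provider = env.EMBEDDING_PROVIDER ?? "openai"; // OpenAI is the documented default
  const required = [...ALWAYS_REQUIRED, ...(PROVIDER_KEYS[provider] ?? [])];
  return required.filter((name) => !env[name]);
}
```

Passing `process.env` to `missingEnv` at boot and failing fast on a non-empty result keeps misconfiguration out of request handling.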

end

Local Development

$  snippetread-only
# Clone repo
git clone https://github.com/...

# Install deps
npm install

# Set up .env.local (copy .env.example, fill in values)
cp .env.example .env.local

# Start Supabase locally (optional, for quick iteration)
npx supabase start

# Run Next.js dev server
npm run dev

# Open http://localhost:3000

end

Production Deployment

Vercel (Recommended)

›Connect GitHub repo to Vercel
›Set environment variables in Vercel dashboard
›Deploy: git push origin main

Self-Hosted (Docker)

$  snippetread-only
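FROM node:20-alpine
WORKDIR /app
# Copy the manifests first so the dependency layer is cached between builds
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]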

$  snippetread-only
docker build -t agentzero .
docker run -e NEXT_PUBLIC_SUPABASE_URL=... -e SUPABASE_SECRET_KEY=... -p 3000:3000 agentzero

end

Database Migrations

Running Migrations

$  snippetread-only
# Push migrations to a Supabase project
npx supabase db push

# Or manually apply SQL files to your Postgres instance
psql -U postgres -d your_db -f supabase/migrations/20260325000000_multi_tenant_schema.sql

end

Appendix: Design Patterns & Trade-offs

1. Server Actions Over API Routes

Choice: Use Server Actions exclusively; no /api/chat or /api/agents routes.

Rationale

›DX: Seamless React integration via useActionState, no fetch boilerplate
›Type Safety: TypeScript types flow directly from Server Action return type to client
›React 19 Native: AsyncIterable<T> serialization is built-in, no ai/rsc wrapper needed
›Simpler Auth: No need to manually extract JWT from headers (middleware handles it)
›One Language: Reduces cognitive load (not toggling between REST conventions and React code)

Trade-off: Cannot use standard REST tooling (curl, Postman) for debugging. Compensate with server-side logging.

end

2. JWT Sessions Without Session Database

Choice: Stateless JWT tokens with userId and orgId baked in.

Rationale

›Zero DB Lookups on Auth: await auth() decodes JWT in ~1ms, no Supabase roundtrip
›Stateless Scaling: Vercel serverless can scale without shared session store
›Simple Token Refresh: Not implemented; expired tokens require re-login (acceptable for v1)

Trade-off: Cannot revoke tokens before expiry (future: add a blacklist table if needed).

end

3. Service-Role Client + Application-Layer Filtering

Choice: Use adminClient (service-role key) everywhere; rely on application code to filter by orgId.

Rationale

›Simpler Code: No dual-path logic (one client path vs. user-scoped RLS path)
›RLS as Secondary Defense: RLS policies exist for defense-in-depth
›Easier Testing: Can test auth logic in isolation without mocking RLS

Trade-off: Must be extremely disciplined about filtering by orgId. A single missed filter is a data leak.
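
One way to reduce that risk is a thin wrapper that applies the org filter before callers can add their own. The sketch below is hypothetical: `orgScoped`, `Filterable`, and `TableSource` are illustrative names, and the interfaces only mimic the chained `.eq()` style used with `adminClient` rather than the real Supabase types.

```typescript
// Minimal shape of a filterable query builder (assumption: mirrors the
// chained .eq() style used with adminClient; not the real Supabase types).
interface Filterable<Q> {
  eq(column: string, value: string): Q;
}

interface TableSource<Q extends Filterable<Q>> {
  from(table: string): { select(columns: string): Q };
}

// Hypothetical wrapper: every query created through it is already
// filtered by organisation_id, so callers cannot forget the filter.
function orgScoped<Q extends Filterable<Q>>(client: TableSource<Q>, orgId: string) {
  return {
    select(table: string, columns = "*"): Q {
      return client.from(table).select(columns).eq("organisation_id", orgId);
    },
  };
}
```

A call such as `orgScoped(client, session.user.orgId).select("agents").eq("id", agentId)` would then carry both filters, with the org filter applied in one audited place.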

end

4. Optimistic Credit Deduction + Rollback

Choice: Deduct credits before streaming; rollback on error.

Rationale

›Prevents Credit Loss on Failed Runs: Rolling back a failed run returns the deducted credits to the user
›Clear Audit Trail: credits_used tracks consumption; credits_remaining tracks balance
›Refund Edge Case: If refund amount > current balance, flag needs_manual_review (ops team reconciles)

Trade-off: If the rollback RPC fails (unlikely but possible), we must log loudly and ops must reconcile manually.
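
The deduct-then-rollback pattern can be sketched as below. This is a simplified illustration: an in-memory object stands in for the `user_credits` row, and `withCredits` and `InsufficientCreditsError` are hypothetical names. In the real flow the deduction and rollback are RPCs, and the rollback has to live inside the streaming generator because the stream is returned before the run finishes.

```typescript
// In-memory stand-in for a user_credits row (assumption for illustration).
interface CreditRow {
  creditsRemaining: number;
  creditsUsed: number;
}

class InsufficientCreditsError extends Error {}

// Deducts `cost` before running `run`; restores it if `run` throws.
async function withCredits<T>(
  row: CreditRow,
  cost: number,
  run: () => Promise<T>,
): Promise<T> {
  if (row.creditsRemaining < cost) {
    throw new InsufficientCreditsError("Not enough credits");
  }
  // Optimistic deduction: charge up front.
  row.creditsRemaining -= cost;
  row.creditsUsed += cost;
  try {
    return await run();
  } catch (err) {
    // Rollback mirrors the deduction exactly, then the error propagates.
    row.creditsRemaining += cost;
    row.creditsUsed -= cost;
    throw err;
  }
}
```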

end

5. RAG Context Pre-Injection + Mid-Loop Knowledge Search

Choice: Inject top semantic chunks before streaming starts; agent can call knowledgeSearchTool mid-reasoning to fetch more.

Rationale

›Cost Optimization: Pre-injection avoids redundant embedding calls mid-loop
›Dual Retrieval: Gives agent flexibility to pull context as reasoning evolves
›Cache Hit: 'use cache' on fetchRagContext memoizes embeddings for 60 seconds

Trade-off: If document is uploaded and immediately queried, embedding may not be ready. Graceful degradation: omit context, agent can retry manually.

end

6. Agent-Scoped vs. Org-Scoped Documents

Choice: Documents can be linked to an agent (agent_id) or org (agent_id = NULL).

Rationale

›Flexibility: Team-wide knowledge base (org-level) + agent-specific context (agent-level)
›Query Routing: semanticSearchForAgent() checks agent_id first; falls back to org-level docs

Trade-off: Adds schema complexity (nullable FK, two RPC functions). Mitigated by clear naming.
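
The routing rule can be illustrated with a small in-memory sketch. It assumes that "falls back" means org-level chunks are used only when the agent has no chunks of its own; the `Chunk` shape and `routeChunks` are hypothetical and do not reproduce the RPCs.

```typescript
// Simplified chunk shape (assumption); agentId is null for org-level documents.
interface Chunk {
  content: string;
  agentId: string | null;
  similarity: number;
}

// Prefers chunks linked to the agent; falls back to org-level chunks when
// the agent has none. Results are ordered by similarity, best first.
function routeChunks(chunks: Chunk[], agentId: string, limit = 5): Chunk[] {
  const agentScoped = chunks.filter((c) => c.agentId === agentId);
  const pool = agentScoped.length > 0 ? agentScoped : chunks.filter((c) => c.agentId === null);
  return [...pool].sort((a, b) => b.similarity - a.similarity).slice(0, limit);
}
```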

end

7. Next.js 16 Partial Prerendering (PPR) + React Compiler

Choice: Enable cacheComponents: true and reactCompiler: true in next.config.ts.

Rationale

›Performance: PPR precomputes static parts, streams dynamic parts
›Compiler Optimization: React 19 compiler memoizes components, reducing re-renders

Trade-off: Both are relatively new (PPR is stable in Next.js 16, compiler is still early). Monitor for edge cases.

end

8. Zod v4 No-Coercion Validation

Choice: Validate with schema.safeParse() and strict object schemas; never silently coerce.

Rationale

›Explicit Errors: User gets clear feedback if they submit malformed data
›Security: Prevents type confusion attacks (e.g., passing a string where an array is expected)
›Debugging: Stack traces point to the exact validation failure, not downstream bugs

Trade-off: Slightly more verbose error messages if input is malformed.

end

9. Embedding Provider Pluggability

Choice: Switch providers via EMBEDDING_PROVIDER env var; support OpenAI, Cohere, NVIDIA NIMs.

Rationale

›Cost Flexibility: Cohere has a free tier; NVIDIA NIMs can be self-hosted or called through NVIDIA's hosted endpoint; OpenAI is the default
›Lock-in Avoidance: Not forced to stick with one provider long-term

Trade-off: Dimension mismatch on switching requires re-embedding + migration. Document the procedure clearly.

end

10. Lemon Squeezy Webhook Idempotency

Choice: Use webhook_events.body_hash as the idempotency key (SHA-256 of raw request body).

Rationale

›Natural Dedup: LS retries send identical bytes; same hash → reject as duplicate
›Two-Layer Idempotency: body_hash (webhook level) + founding_grants.ls_order_id (business level)

Trade-off: If a webhook is processed successfully but the process crashes before markWebhookProcessed() runs, the next delivery is rejected as a duplicate. Mitigated by making markWebhookProcessed() the last operation in the handler.

end

Glossary

| Term | Definition |
|------|-----------|
| Org | Organization. Root entity in multi-tenant hierarchy. Users, agents, documents belong to exactly one org. |
| RLS | Row-Level Security (PostgreSQL). Database-layer access control. In AgentZero, secondary defense after application filtering. |
| RAG | Retrieval-Augmented Generation. Technique: embed documents → store vectors → retrieve similar chunks for LLM context. |
| ReAct loop | Reasoning + Acting pattern: model generates text, checks for tool calls, executes tool, feeds result back, repeats until done. Implemented by streamText() from AI SDK 6 (with stopWhen: stepCountIs(10)). The SDK also ships a higher-level ToolLoopAgent class, but AgentZero uses streamText directly for finer control over the event stream. |
| Server Action | Next.js abstraction. Async function marked "use server" that runs on the server when called from the client. The framework handles the network request, so the developer writes no fetch code. |
| JWT | JSON Web Token. Stateless session token containing claims (userId, orgId). Verified with secret key. |
| Idempotency | Property of a function: calling it multiple times with the same input produces the same result, with no additional side effects on retries. |
| Webhook | HTTP callback. External service (Lemon Squeezy) sends data to our endpoint when events occur (payment, refund, etc.). |

end

Core Layers

1. Presentation Layer

Location: app/, components/

The presentation layer is purely React 19 Client Components + Server Components. No traditional API routes for UI concerns.

Page Structure (Next.js App Router)

$  snippetread-only
app/
  page.tsx                    # Landing page
  login/page.tsx              # Login form (Server Component)
  signup/page.tsx             # Registration form (Server Component)
  dashboard/
    page.tsx                  # Main dashboard HUD
    agents/
      page.tsx                # Agent list & kanban board
      [agentId]/page.tsx      # Single agent detail
      new/page.tsx            # Agent creation form
    knowledge/page.tsx        # Document upload & management
    billing/page.tsx          # Subscription & credits view
    run/page.tsx              # Quick-launch agent executor

Component Patterns

Server Components (the App Router default; "use server" marks Server Actions, not components):

›Handle auth checks via auth() from @/auth
›Fetch data directly from Supabase via adminClient
›Filter every query by the session's orgId (application-layer check, with RLS as defense-in-depth)
›Return JSX or discriminated union types for error handling

Client Components (interactive features):

›Use hooks like useActionState (React 19) to bind to Server Actions
›Stream events from streamAgentAction via for await (event of stream)
›Render tool execution logs, markdown responses, real-time status
›Zero AI keys in browser code (all injected server-side via proxy layer)

end

2. Server Actions Layer

Location: lib/actions/

$  snippetread-only
"use server"

export async function streamAgentAction(prompt: string): Promise<AsyncIterable<StreamEvent>> {
  const session = await auth();               // Auth guard (no DB hit if session exists)
  const parsed = runAgentSchema.safeParse(...); // Zod validation
  // ... fetch data, run AI logic
  return asyncGenerator;                      // React 19 serializes AsyncIterable natively
}

Core Action Files

Key Patterns

›Zod Input Validation: Every action validates inputs with z.object() and .safeParse(). No silent coercion. Errors are user-friendly messages.
›Discriminated Unions for Errors: Actions return { ok: true; ... } | { ok: false; error: string } to make error handling explicit on the client.
›React Cache for Deduplication: Functions like getUserByEmail() are wrapped in cache() to deduplicate identical queries within a single request. Auth handshake + Server Action in the same request → exactly one DB hit.
›Async Per-Request Auth: await auth() uses Next.js 16's async boundary. The JWT contains userId and orgId baked in, so authenticated requests need zero DB lookups (trust the JWT).
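
The discriminated-union pattern can be shown with a small, self-contained sketch. `renameAgent` and `toMessage` are hypothetical functions used only to demonstrate the narrowing; they are not part of the codebase.

```typescript
// Result shape described above: `ok` discriminates success from failure.
type RenameResult =
  | { ok: true; name: string }
  | { ok: false; error: string };

// Hypothetical action used only to show the pattern.
function renameAgent(name: string): RenameResult {
  const trimmed = name.trim();
  if (trimmed.length === 0) return { ok: false, error: "Name must not be empty" };
  return { ok: true, name: trimmed };
}

// The caller narrows on `ok`; TypeScript exposes `error` only on the
// failure branch and `name` only on the success branch.
function toMessage(result: RenameResult): string {
  return result.ok ? `Saved as ${result.name}` : `Failed: ${result.error}`;
}
```

Because the compiler forces a check on `ok` before either field can be read, the client cannot silently ignore the failure case.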

end

3. Data Layer

Location: lib/supabase/

All database access goes through Supabase client instances:

$  snippetread-only
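// lib/supabase/admin.ts
import { createClient } from "@supabase/supabase-js";

// Database: the generated Supabase schema types (import omitted here)
export const adminClient = createClient<Database>(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SECRET_KEY!  // Service role: bypasses RLS
);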

Query Patterns

Server Actions use the adminClient (service-role key) to bypass RLS because:

›Auth is checked at the Server Action boundary (await auth())
›We filter by orgId / userId in every query (application-layer security)
›RLS policies are an additional defense layer, not the primary one

Example:

$  snippetread-only
const { data: agent } = await adminClient
  .from("agents")
  .select("id, instructions")
  .eq("id", agentId)
  .eq("organisation_id", session.user.orgId)  // Application layer check
  .single();

Caching Strategy

›fetchRagContext() in agent-actions.ts uses 'use cache' (Next.js 16 directive)
›Cache key is derived from (query, orgId, agentId) — per-agent/per-org granularity
›60-second revalidation for RAG embeddings (cost optimization for vector lookups)
›Cached entries are keyed by the function's arguments and reused across requests until they expire; cacheLife() controls the revalidation policy

end

4. AI Layer

Location: lib/ai/, features/tools/

The AI layer orchestrates the agentic loop: model invocation → tool detection → tool execution → response streaming.

4.1 Model Registry & Provider Factory

lib/ai/model-registry.ts: Metadata for available models

$  snippetread-only
export const MODEL_REGISTRY: ModelMetadata[] = [
  {
    id: "deepseek-chat",
    label: "DeepSeek Chat",
    provider: "deepseek",
  },
];

lib/ai/provider-factory.ts: Returns a LanguageModel instance for a given model ID

$  snippetread-only
import { deepseek } from "@ai-sdk/deepseek";
import type { LanguageModel } from "ai";

export function getModel(modelId?: string): LanguageModel {
  return deepseek(modelId ?? "deepseek-chat");
}

Design Decision: V1 ships DeepSeek only. Multi-provider support (Claude, GPT-4, etc.) will be added post-launch based on user demand. The factory pattern makes it trivial to add providers later.

4.2 Tool System

Tool Registry (lib/ai/tool-registry.ts)

A metadata registry for UI display:

$  snippetread-only
export const TOOL_REGISTRY: ToolMetadata[] = [
  {
    id: "web_search",
    label: "Web Search",
    description: "Search the web using Tavily and return relevant results.",
    icon: Globe,
  },
];

Tool Definitions (features/tools/)

Each tool is created with the Vercel AI SDK's tool() helper:

$  snippetread-only
export const webSearchTool = tool({
  description: "...",
  inputSchema: z.object({ query: z.string() }),
  outputSchema: z.object({ results: z.array(...) }),
  execute: async (input) => {
    // Fetch from Tavily, validate, return
  },
});

4.3 Streaming Tool Loop

Streaming Invocation (agent-actions.ts)

$  snippetread-only
const { textStream } = await streamText({
  model: getModel(modelId),
  system: BASE_INSTRUCTIONS + agentInstructions,
  messages: [{ role: "user", content: prompt }],
  tools: { webSearchTool, knowledgeSearchTool, ... },
  stopWhen: stepCountIs(10),  // Max 10 reasoning steps per run
});

// textStream is an AsyncIterable<string>
// Wrapped into a generator that yields structured events
const eventStream = (async function* () {
  for await (const chunk of textStream) {
    yield { type: "text-delta", delta: chunk };
  }
  yield { type: "done" };
})();

Event Stream Format

The wire format for streaming is a discriminated union of events:

$  snippetread-only
export type StreamEvent =
  | { type: "text-delta";  delta: string }                        // LLM output chunks
  | { type: "tool-call";   toolName: string; toolCallId: string; input: unknown }
  | { type: "tool-result"; toolName: string; toolCallId: string; output: unknown }
  | { type: "done" }
  | { type: "error";       message: string };

Client-side streaming hook iterates this generator in real-time, rendering tool logs and response text as it arrives.
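
A minimal sketch of how a client might fold these events into UI state. The `StreamEvent` type is copied from the definition above so the block is self-contained; `RunState`, `initialState`, and `applyEvent` are hypothetical, and the real hook may track more.

```typescript
// StreamEvent copied from the definition above so the sketch is self-contained.
type StreamEvent =
  | { type: "text-delta"; delta: string }
  | { type: "tool-call"; toolName: string; toolCallId: string; input: unknown }
  | { type: "tool-result"; toolName: string; toolCallId: string; output: unknown }
  | { type: "done" }
  | { type: "error"; message: string };

// Hypothetical UI state for one agent run.
interface RunState {
  text: string;
  pendingTools: string[]; // toolCallIds that have not returned yet
  status: "streaming" | "done" | "error";
  error?: string;
}

const initialState: RunState = { text: "", pendingTools: [], status: "streaming" };

// Pure reducer: folds one event into the state without mutating it.
function applyEvent(state: RunState, event: StreamEvent): RunState {
  switch (event.type) {
    case "text-delta":
      return { ...state, text: state.text + event.delta };
    case "tool-call":
      return { ...state, pendingTools: [...state.pendingTools, event.toolCallId] };
    case "tool-result":
      return {
        ...state,
        pendingTools: state.pendingTools.filter((id) => id !== event.toolCallId),
      };
    case "done":
      return { ...state, status: "done" };
    case "error":
      return { ...state, status: "error", error: event.message };
  }
}
```

Inside the hook, each iteration of `for await (const event of stream)` would pass the event through `applyEvent`; keeping the reducer pure makes it testable without a live stream.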

4.4 Embeddings & RAG Pipeline

lib/ai/embeddings.ts: Provider-agnostic embedding generation and semantic search

Supported Providers

⚠️ Critical: Switching providers on an existing database requires:

›Re-embedding all document chunks with the new provider
›Running the Supabase migration to alter the vector column dimension
›No mixing of dimensions in the same table

Functions

›generateEmbedding(text) — text → vector
›storeChunks(document_id, chunks, embeddings) — vectors → document_chunks table
›semanticSearch(query_embedding, orgId, client) — ranked chunks via match_chunks() RPC
›semanticSearchForAgent(query_embedding, agentId, client) — agent-scoped variant via match_agent_chunks() RPC

RAG Context Injection (fetchRagContext in agent-actions.ts)

Before the streaming loop starts:

›Embed the user's prompt
›Query document_chunks for semantic matches (similarity > threshold)
›Prepend the top chunks as context to the system prompt
›Results are cached for 60 seconds (cost optimization)
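
The filtering and prepending steps above can be sketched as follows. The `Hit` shape, the `topK` parameter, and the `Context:` layout are assumptions for illustration; the default threshold of 0.1 matches the documented RAG_MATCH_THRESHOLD default.

```typescript
// Simplified search hit (assumption).
interface Hit {
  content: string;
  similarity: number;
}

// Keeps hits above the threshold, best first, and prepends them to the prompt.
function withRagContext(
  systemPrompt: string,
  hits: Hit[],
  threshold = 0.1,
  topK = 3,
): string {
  const selected = hits
    .filter((h) => h.similarity > threshold)
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, topK);
  if (selected.length === 0) return systemPrompt; // no usable context: prompt unchanged
  const context = selected.map((h, i) => `[${i + 1}] ${h.content}`).join("\n");
  return `Context:\n${context}\n\n${systemPrompt}`;
}
```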

end

Key Flows

1. User Authentication Flow

$  snippetread-only
┌─────────────────────────────────────────────────┐
│ User submits email + password on /login         │
└──────────────┬──────────────────────────────────┘
               │
               ▼
        ┌──────────────────┐
        │ loginAction()    │         (Server Action)
        │ (auth-actions)   │
        └─────────┬────────┘
                  │
                  ├─→ Zod validation (email, password)
                  ├─→ headers() captures request context
                  ├─→ getUserByEmail() [cached]
                  │   └─→ Supabase adminClient.from("users").select(...)
                  │
                  ├─→ signIn("credentials") [Auth.js]
                  │   └─→ authorize() callback
                  │       ├─→ getUserByEmail() [already cached]
                  │       ├─→ argon2.verify(user.password_hash, plaintext)
                  │       └─→ Return { id, email, orgId } on success
                  │
                  ├─→ jwt callback (Auth.js)
                  │   ├─→ token.id = user.id
                  │   └─→ token.orgId = user.orgId
                  │
                  └─→ Redirect to /dashboard on success

          React cache deduplication
          ════════════════════════════════════════
          loginAction() + authorize() both call
          getUserByEmail(email). React's cache()
          ensures exactly ONE Supabase query.
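
The deduplication effect can be illustrated with a hand-rolled stand-in. This is not React's implementation: `memoizeByArg` is a hypothetical helper that memoizes for the lifetime of one instance, whereas React's cache() scopes the memo to a single server request.

```typescript
// Stand-in for React's cache(): identical arguments share one promise,
// so the wrapped function runs only once per distinct argument.
function memoizeByArg<R>(fn: (arg: string) => Promise<R>): (arg: string) => Promise<R> {
  const inFlight = new Map<string, Promise<R>>();
  return (arg) => {
    let promise = inFlight.get(arg);
    if (!promise) {
      promise = fn(arg);
      inFlight.set(arg, promise);
    }
    return promise;
  };
}
```

With a wrapper like this around the lookup, a call from loginAction() and a second call from authorize() with the same email would share one underlying query.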

Session Persistence

›JWT Strategy: Token contains id + orgId, no session DB
›Every Request: await auth() decodes JWT in ~1ms (no DB)
›Session Updates: updating the session with { orgId: newOrgId } (for example via Auth.js's unstable_update on the server) triggers the jwt callback with trigger: "update"
›No Silent Token Refresh: Expired JWTs return null; client must re-login

end

2. Agent Creation & Configuration

$  snippetread-only
┌─────────────────────────────────────────────────┐
│ User clicks "New Agent" → /dashboard/agents/new │
└──────────────┬──────────────────────────────────┘
               │
               ▼
        ┌──────────────────────────┐
        │ createAgent()            │    (Server Action)
        │ (agent-crud-actions.ts)  │
        └─────────┬────────────────┘
                  │
                  ├─→ await auth() → session.user.orgId
                  ├─→ Generate auto-name if not provided: "Agent N"
                  ├─→ INSERT into agents table
                  │   {
                  │     name, instructions, organisation_id
                  │   }
                  │
                  └─→ Return { success: true; id } | { success: false; error }
                      (discriminated union)

Agent Scoping Rules
═══════════════════════════════════════════════════════
Every agent query filters by organisation_id:

  SELECT * FROM agents
  WHERE id = agentId AND organisation_id = session.user.orgId

This prevents cross-org data leakage even if an attacker
forges an agentId in the client. RLS + app-layer defense.

Agent Instructions

When creating an agent, users can provide custom system prompt instructions. These are stored in the agents.instructions column and merged into the system prompt at runtime:

$  snippetread-only
Base instructions + Agent instructions → LLM system prompt

end

3. Conversation Lifecycle

$  snippetread-only
┌──────────────────────────────────────────────────────┐
│ User clicks "Start Conversation" on agent detail     │
└────────────┬─────────────────────────────────────────┘
             │
             ▼
      ┌──────────────────────────┐
      │ createConversation()     │      (Server Action)
      │ (conversation-actions)   │
      └────────┬─────────────────┘
               │
               ├─→ Verify agent belongs to org
               ├─→ INSERT into conversations
               │   {
               │     agent_id, organisation_id, title
               │   }
               │
               └─→ Return { ok: true; conversationId }

                              │
                              ▼

          ┌──────────────────────────────┐
          │ User types prompt + submits   │
          │ Calls streamAgentAction()     │      (Server Action)
          │ (agent-actions.ts)            │
          └────────┬─────────────────────┘
                   │
                   ├─→ await auth() + orgId check
                   ├─→ Fetch agent.instructions
                   ├─→ Check credits balance (deductCredits)
                   │   └─→ If insufficient: return error stream
                   │
                   ├─→ Fetch RAG context (fetchRagContext)
                   │   ├─→ Generate embedding for prompt
                   │   ├─→ semanticSearch (org-scoped or agent-scoped)
                   │   └─→ Prepend chunks to system prompt
                   │
                   ├─→ streamText({
                   │     model, system, messages, tools,
                   │     stopWhen: stepCountIs(10)
                   │   })
                   │   │
                   │   └─→ streamText opens the ReAct loop:
                   │       ┌──────────────────────────┐
                   │       │ REACT LOOP               │
                   │       ├──────────────────────────┤
                   │       │ 1. Model generates text  │
                   │       │ 2. Check for tool calls  │
                   │       │ 3. Execute tool          │
                   │       │ 4. Feed result back      │
                   │       │ Repeat until done        │
                   │       └──────────────────────────┘
                   │
                   ├─→ Wrap response in AsyncGenerator<StreamEvent>
                   │   React 19 serializes natively
                   │
                   └─→ Client-side useAgentStream hook
                       ├─→ for await (event of stream)
                       ├─→ Render text-delta → <Markdown />
                       ├─→ Render tool-call → show spinner
                       ├─→ Render tool-result → show output
                       └─→ On "done", saveMessage() twice
                           (once user msg, once assistant msg)

          Credit Lifecycle
          ═════════════════════════════════════════════
          1. PRE-STREAM:     deductCredits() (optimistic)
          2. STREAM:         streamText runs the ReAct loop
          3. ERROR/TIMEOUT:  rollbackCredits() (within catch)
          4. SUCCESS:        No rollback, credits stay deducted

end

4. Document Upload & Knowledge Base Indexing

$  snippetread-only
┌──────────────────────────────────────────┐
│ User uploads PDF → /dashboard/knowledge  │
└────────┬─────────────────────────────────┘
         │
         ▼
  ┌────────────────────────┐
  │ UploadKnowledge        │    (Client Component)
  │ (drag-drop, multipart) │
  └────────┬───────────────┘
           │
           ├─→ formData.append("file", file)
           ├─→ formData.append("agentId", agentId)
           │
           ▼
  ┌───────────────────────────────┐
  │ uploadDocument()              │    (Server Action)
  │ (document-actions.ts)         │
  └────────┬──────────────────────┘
           │
           ├─→ await auth() + orgId
           ├─→ Validate file (PDF/TXT, max 10 MB)
           ├─→ INSERT into documents
           │   {
           │     organisation_id, agent_id, filename, content, status
           │   }
           │
           └─→ Trigger embedding pipeline

              (Supabase Functions trigger or direct call)
              ════════════════════════════════════════
              supabase/functions/embed-memo/index.ts

              1. Fetch document content
              2. Split into chunks (500 token window, 50 overlap)
              3. For each chunk:
                 ├─→ generateEmbedding(chunk)
                 └─→ INSERT into document_chunks
                     {
                       document_id, content, embedding, created_at
                     }
              4. Update documents.status = 'ready' (once all chunks are stored)
              5. Generate memo_summaries row (agent-visible metadata)
                 └─→ For knowledgeSearchTool queries
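
The chunking step (500-token window, 50-token overlap) can be sketched as below. The sketch takes a pre-split token array, for example whitespace-separated words, which is an assumption; the real pipeline may use a model tokenizer. `chunkTokens` is a hypothetical name.

```typescript
// Splits a token list into overlapping windows. Each window starts
// (windowSize - overlap) tokens after the previous one.
function chunkTokens(tokens: string[], windowSize = 500, overlap = 50): string[][] {
  if (windowSize <= 0 || overlap < 0 || overlap >= windowSize) {
    throw new Error("Require 0 <= overlap < windowSize");
  }
  const step = windowSize - overlap;
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + windowSize));
    if (start + windowSize >= tokens.length) break; // this window reached the end
  }
  return chunks;
}
```

The overlap means the last 50 tokens of one chunk reappear at the start of the next, so a sentence that straddles a boundary is still retrievable from at least one chunk.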

end

5. Billing & Credits Flow

$  snippetread-only
Subscription Purchase
═════════════════════════════════════════════════════
User clicks "Upgrade to Pro" → createCheckoutForTier("PRO")
  ├─→ Validate auth + orgId
  ├─→ createLemonSqueezyCheckout()
  │   └─→ API call to LS: POST /checkouts
  │       {
  │         data: {
  │           relationships: { variant: { data: { id: PRO_VARIANT_ID } } },
  │           attributes: {
  │             checkout_data: {
  │               custom: { user_id: session.user.id }  // echoed back as meta.custom_data in webhooks
  │             }
  │           }
  │         }
  │       }
  │
  └─→ Return { ok: true; url } → redirect browser to LS checkout

User completes payment → LS sends webhook to /api/webhooks/lemonsqueezy
  │
  ├─→ Signature verification (HMAC-SHA256)
  ├─→ Extract meta.event_name + meta.custom_data.user_id
  ├─→ claimWebhookEvent() → idempotency guard (webhook_events.body_hash)
  │
  ├─→ dispatch(event_name, payload)
  │
  ├─→ For subscription_payment_success:
  │   ├─→ upsertSubscription() → INSERT/UPDATE subscriptions table
  │   ├─→ grantProCreditsForInvoice()
  │   │   └─→ adminClient.rpc("grant_user_credits", { p_user_id, p_amount })
  │   │       (stored procedure in Postgres)
  │   │
  │   └─→ user_credits.credits_remaining += amount
  │
  └─→ Return 200 OK

Credit Metering
═════════════════════════════════════════════════════
Before streaming agent:
  1. checkUserCredits(userId)
     └─→ SELECT credits_remaining FROM user_credits WHERE user_id = userId
  2. If credits_remaining <= 0 → return 402 Payment Required
  3. deductCredits(userId, 1) 
     └─→ RPC: UPDATE user_credits SET credits_remaining -= 1, credits_used += 1

If agent run fails:
  4. rollbackCredits(userId, 1)
     └─→ RPC: UPDATE user_credits SET credits_remaining = credits_remaining + 1, credits_used = credits_used - 1 WHERE user_id = userId

If agent run succeeds:
  4. Credits stay deducted (no rollback)

Founding Tier (One-Time Purchase)
═════════════════════════════════════════════════════
Similar to Pro, but:
- order_created webhook triggers handleFoundingOrder()
- grant_founding_credits() atomically inserts founding_grants row + grants credits
- order_refunded webhook triggers revoke_founding_credits()
  └─→ If credits_remaining < refund_amount: flag needs_manual_review
      (ops team reconciles manually)
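
The metering and refund rules above can be sketched with an in-memory ledger. This is a minimal sketch under stated assumptions, not the project's implementation: `CreditLedger`, `tryDeduct`, and `revoke` are hypothetical names, and the `Map` stands in for the user_credits table. It folds the balance check and the decrement into one step, which is one way to keep two concurrent requests from both passing the check in step 1.

```typescript
// Hypothetical in-memory stand-in for the user_credits table.
interface CreditRow {
  creditsRemaining: number;
  creditsUsed: number;
  needsManualReview: boolean;
}

class CreditLedger {
  private rows = new Map<string, CreditRow>();

  grant(userId: string, amount: number): void {
    this.rowFor(userId).creditsRemaining += amount;
  }

  // Check and deduct in one step, mirroring a conditional UPDATE
  // (... WHERE credits_remaining >= amount). Returns false when the
  // balance is too low, which the caller would map to a 402 response.
  tryDeduct(userId: string, amount: number): boolean {
    const row = this.rowFor(userId);
    if (row.creditsRemaining < amount) return false;
    row.creditsRemaining -= amount;
    row.creditsUsed += amount;
    return true;
  }

  // Undo a deduction after a failed agent run.
  rollback(userId: string, amount: number): void {
    const row = this.rowFor(userId);
    row.creditsRemaining += amount;
    row.creditsUsed -= amount;
  }

  // Refund path: revoke the credits, or flag the row for manual review when
  // the user has already spent some of them. Leaving the balance untouched
  // in the flagged case is an assumption of this sketch.
  revoke(userId: string, amount: number): void {
    const row = this.rowFor(userId);
    if (row.creditsRemaining < amount) {
      row.needsManualReview = true;
      return;
    }
    row.creditsRemaining -= amount;
  }

  get(userId: string): CreditRow {
    return { ...this.rowFor(userId) };
  }

  private rowFor(userId: string): CreditRow {
    let row = this.rows.get(userId);
    if (!row) {
      row = { creditsRemaining: 0, creditsUsed: 0, needsManualReview: false };
      this.rows.set(userId, row);
    }
    return row;
  }
}
```

In Postgres the same effect would come from a single conditional UPDATE inside the stored procedure; the class above only illustrates the state transitions.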

end

Data Model

Schema Overview

The database schema implements a multi-tenant, permission-based hierarchy:

organisations
  ├── users (1:many)
  ├── agents (1:many)
  │   ├── conversations (1:many)
  │   │   └── conversation_messages (1:many)
  │   ├── documents (1:many)          [agent_id nullable for org-level docs]
  │   │   └── document_chunks (1:many) [pgvector embeddings]
  │   └── memo_summaries (1:many)
  │
  ├── subscriptions (1:many) [user_id → users]
  ├── founding_grants (1:many) [user_id → users]
  ├── user_credits (1:1) [user_id → users]
  │
  └── webhook_events (audit log for idempotency)

Key Tables

organisations

Root of the multi-tenant tree. Every other entity belongs to exactly one org.

users

Authentication identity + org membership.

agents

AI agent definitions, scoped to an org (not a user, enabling team sharing).

conversations & conversation_messages

Multi-turn conversation threads, scoped to agents.

conversations

conversation_messages

documents & document_chunks

RAG knowledge base: documents are split into chunks with embeddings.

documents

document_chunks

Index: idx_document_chunks_embedding (pgvector HNSW or IVFFlat for fast similarity search)

memo_summaries (Agent-Visible Knowledge Metadata)

Lightweight table for agent reasoning: stores summaries + metadata for knowledge search results.

user_credits

Credit metering: tracks remaining balance and cumulative usage per user.

subscriptions

Lemon Squeezy subscription tracking. May have multiple rows per user (lifecycle changes).

founding_grants

One-time founding purchase audit trail. Separate from subscriptions.

webhook_events

Idempotency log: prevents processing the same LS webhook twice.

Row Level Security (RLS) Policies

All tables have RLS enabled. The adminClient (service-role key) bypasses RLS automatically.

Application-Layer Defense

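Because every query runs through the service-role client, which bypasses RLS, the policies only guard against access that does not go through adminClient (for example, a leaked publishable key). The primary protection is application-level: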

// Every query filters by orgId
const { data: agent } = await adminClient
  .from("agents")
  .select("*")
  .eq("organisation_id", session.user.orgId)  // Application check
  .eq("id", agentId);

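This prevents cross-org leakage even if an attacker forges an agentId, because orgId is read from the server-side session rather than from client input. It does not help if the Server Action itself is compromised, since that code holds the service-role key.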

end

External Integrations

Supabase

Role: PostgreSQL database, auth (optional—we use Auth.js), vector search, Realtime (future).

Opaque Key Pattern (2026)

Modern Supabase projects use opaque keys (sb_publishable_... and sb_secret_...) in place of the legacy ANON_KEY and SERVICE_ROLE_KEY.

Setup

›Create a Supabase project at https://supabase.com/dashboard
›Run migrations (stored in supabase/migrations/):
```
npx supabase db push
```
›Set env vars: NEXT_PUBLIC_SUPABASE_URL, SUPABASE_SECRET_KEY

Migrations

end

Lemon Squeezy (Billing)

Role: Payment processing, subscription management, webhook notifications.

Setup

›Create account at https://lemonsqueezy.com
›Create a Store + Product + Variants (Pro tier, Founding tier)

›Set env vars:

LEMONSQUEEZY_API_KEY=<api_key>
LEMONSQUEEZY_WEBHOOK_SECRET=<webhook_signing_secret>
LEMONSQUEEZY_STORE_ID=<numeric_id>
LEMONSQUEEZY_PRO_VARIANT_ID=<numeric_id>
LEMONSQUEEZY_FOUNDING_VARIANT_ID=<numeric_id>

Webhook Integration

Lemon Squeezy sends webhooks to /api/webhooks/lemonsqueezy when:

›User purchases a subscription (subscription_payment_success)
›User purchases founding tier (order_created)
›Refund is processed (subscription_payment_refunded, order_refunded)
›Subscription status changes (subscription_updated, subscription_cancelled, etc.)

Webhook Handler (app/api/webhooks/lemonsqueezy/route.ts)

// Signature verification (HMAC-SHA256)
// ↓
// Idempotency guard (webhook_events.body_hash)
// ↓
// dispatch(event_name, payload)
// ├─→ upsertSubscription() [subscription events]
// ├─→ grantProCreditsForInvoice() [payment_success]
// ├─→ revokeProCreditsForInvoice() [payment_refunded]
// ├─→ handleFoundingOrder() [order_created]
// └─→ handleFoundingRefund() [order_refunded]
// ↓
// markWebhookProcessed()
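
One way to express the dispatch step is a lookup table from event name to handler chain. The sketch below is an assumption about shape only: the handlers are stubs that record which function ran, the `WebhookPayload` type is hypothetical, and routing subscription_updated and subscription_cancelled to upsertSubscription() is inferred from the "[subscription events]" note above.

```typescript
// Hypothetical payload shape; the real Lemon Squeezy payload has more fields.
interface WebhookPayload {
  meta: { event_name: string; custom_data?: { user_id?: string } };
}

type Handler = (payload: WebhookPayload) => void;

// Stub handlers that only record their name, so the routing can be observed.
const calls: string[] = [];
const stub = (name: string): Handler => () => {
  calls.push(name);
};

const upsertSubscription = stub("upsertSubscription");
const grantProCreditsForInvoice = stub("grantProCreditsForInvoice");
const revokeProCreditsForInvoice = stub("revokeProCreditsForInvoice");
const handleFoundingOrder = stub("handleFoundingOrder");
const handleFoundingRefund = stub("handleFoundingRefund");

// A Map avoids accidental hits on inherited object keys such as "constructor".
const handlers = new Map<string, Handler[]>([
  ["subscription_payment_success", [upsertSubscription, grantProCreditsForInvoice]],
  ["subscription_payment_refunded", [revokeProCreditsForInvoice]],
  ["subscription_updated", [upsertSubscription]],
  ["subscription_cancelled", [upsertSubscription]],
  ["order_created", [handleFoundingOrder]],
  ["order_refunded", [handleFoundingRefund]],
]);

// Returns false for events with no registered handler, so the route can
// acknowledge them without doing any work.
function dispatch(eventName: string, payload: WebhookPayload): boolean {
  const chain = handlers.get(eventName);
  if (!chain) return false;
  for (const handler of chain) handler(payload);
  return true;
}
```

Keeping the table in one place makes it easy to see which events are handled and which are only acknowledged.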

Error Handling

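›If the handler throws, the webhook is marked failed and LS retries the delivery a limited number of times with backoff (not indefinitely), so an event that keeps failing needs a manual replay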
›needs_manual_review flag is set if refund cannot be applied cleanly (e.g., user already spent the credits)
›Ops dashboard monitors subscriptions.needs_manual_review = TRUE

end

DeepSeek (AI Model)

Role: Primary LLM for the agentic loop.

Setup

›Get API key from https://platform.deepseek.com
›Set env var: DEEPSEEK_API_KEY (read by @ai-sdk/deepseek)

Model ID: deepseek-chat

Design Decision: V1 ships with DeepSeek only. Claude, GPT-4, and others will be added post-launch.

end

Tavily (Web Search)

Role: Real-time web search tool for the agent.

Setup

›Create account at https://tavily.com
›Set env var: TAVILY_API_KEY

Tool: webSearchTool in features/tools/web-search.ts

Usage: Called by agent when user prompt needs current information not in the knowledge base.

end

Embedding Providers

Pluggable embedding model selection via environment variable.

OpenAI (Default)

EMBEDDING_PROVIDER=openai
# Uses text-embedding-3-small (1536 dimensions)
# Requires OPENAI_API_KEY; no other env vars needed

Cohere (Free tier)

EMBEDDING_PROVIDER=cohere
COHERE_API_KEY=<api_key>
# Uses embed-english-v3.0 (1024 dimensions)

NVIDIA NIMs (Enterprise)

EMBEDDING_PROVIDER=nvidia
NVIDIA_NIMS_BASE_URL=https://integrate.api.nvidia.com/v1
NVIDIA_NIMS_API_KEY=<api_key>
# Uses nv-embed-qa-mistral-7b-v3 (1024 dimensions)

⚠️ Dimension Mismatch Warning: Changing providers on an existing database requires re-embedding + migration.
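
A startup check can catch the mismatch before any vectors are written. The following is a minimal sketch, assuming the dimensions listed in the provider sections above; `parseProvider` and `assertDimensionMatch` are hypothetical helpers, and `columnDimension` is whatever size the pgvector column was created with.

```typescript
type EmbeddingProvider = "openai" | "cohere" | "nvidia";

// Dimensions as listed in the provider sections above.
const EMBEDDING_DIMENSIONS: Record<EmbeddingProvider, number> = {
  openai: 1536,
  cohere: 1024,
  nvidia: 1024,
};

function parseProvider(raw: string | undefined): EmbeddingProvider {
  const value = raw ?? "openai"; // OpenAI is the documented default
  if (value === "openai" || value === "cohere" || value === "nvidia") {
    return value;
  }
  throw new Error(`Unknown EMBEDDING_PROVIDER: ${value}`);
}

// columnDimension is the size of the pgvector column, e.g. 1536 for vector(1536).
function assertDimensionMatch(
  provider: EmbeddingProvider,
  columnDimension: number,
): void {
  const expected = EMBEDDING_DIMENSIONS[provider];
  if (expected !== columnDimension) {
    throw new Error(
      `Provider "${provider}" produces ${expected}-dimensional vectors, ` +
        `but the column stores ${columnDimension}; re-embed and migrate first.`,
    );
  }
}
```

A matching dimension does not make vectors from different models comparable, so switching between two 1024-dimensional providers still requires re-embedding.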

end

Security & Access Control

Authentication Strategy

Auth.js v5 with Credentials Provider

›Email + password login only (v1)
›Passwords are Argon2-hashed before storage
›JWT sessions, no session database (stateless)
›userId + orgId baked into JWT on sign-in
›Subsequent requests trust the JWT (zero DB lookups on auth)

Key Pattern: await auth() returns session instantly (JWT decode, no I/O).
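
The "baked into the JWT" step can be sketched as two plain functions that mirror the roles of the Auth.js `jwt` and `session` callbacks. This is an illustration under assumptions, not the project's auth config: the `AppUser`, `AppToken`, and `AppSession` types and the function names are hypothetical, and the real callbacks receive richer objects.

```typescript
// Local stand-ins for the Auth.js types; field names are assumptions.
interface AppUser {
  id: string;
  orgId: string;
}
interface AppToken {
  userId?: string;
  orgId?: string;
}
interface AppSession {
  user: { id: string; orgId: string };
}

// Mirrors the role of the `jwt` callback: `user` is only present on sign-in,
// so the ids are copied into the token once and reused afterwards.
function jwtCallback(token: AppToken, user?: AppUser): AppToken {
  if (user) {
    return { ...token, userId: user.id, orgId: user.orgId };
  }
  return token;
}

// Mirrors the role of the `session` callback: builds the session from the
// token alone, with no database lookup.
function sessionCallback(token: AppToken): AppSession {
  if (!token.userId || !token.orgId) {
    throw new Error("Token is missing userId or orgId");
  }
  return { user: { id: token.userId, orgId: token.orgId } };
}
```

Because orgId is fixed at sign-in, a change of organisation membership would only take effect after the user signs in again.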

end

Authorization: Org-Based Multi-Tenancy

Rule: Every user belongs to exactly one org. Every agent/document/conversation belongs to exactly one org.

Enforcement

›Auth Boundary: await auth() provides session.user.orgId
›Query Filtering: Every Supabase query includes .eq("organisation_id", orgId)
›RLS Secondary Layer: Policies lock down tables, but app-level checks are primary

Example: Agent Access

const session = await auth();
const { data: agent } = await adminClient
  .from("agents")
  .select("*")
  .eq("organisation_id", session.user.orgId)  // ← App-level filter
  .eq("id", agentId)
  .single();

if (!agent) return error("Unauthorised");

Even if an attacker forges agentId, they cannot access agents from other orgs.

end

API Key Management

Server-Side Only

All sensitive API keys live in environment variables and are never sent to the browser:

›DEEPSEEK_API_KEY (LLM)
›TAVILY_API_KEY (web search)
›COHERE_API_KEY (embeddings)
›SUPABASE_SECRET_KEY (database)
›LEMONSQUEEZY_API_KEY (billing)

The proxy layer (proxy.ts) intercepts server-to-external-API calls and injects keys. The browser never sees them.
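
A small guard can make the "server-side only" rule harder to break by accident. The helper below is a hypothetical sketch (`requireEnv` is not a name from the codebase): it fails fast on a missing key and refuses any name with the NEXT_PUBLIC_ prefix, since Next.js exposes those variables to the browser bundle.

```typescript
// Hypothetical helper: reads a server-side secret and fails fast when it is
// missing, so misconfiguration surfaces early rather than mid-request.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>,
): string {
  if (name.startsWith("NEXT_PUBLIC_")) {
    // NEXT_PUBLIC_ variables are exposed to the browser bundle by Next.js,
    // so a secret must never use that prefix.
    throw new Error(`${name} is browser-visible and cannot hold a secret`);
  }
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required env var: ${name}`);
  }
  return value;
}
```

In server code this would be called as `requireEnv("DEEPSEEK_API_KEY", process.env)`; passing the environment in as a parameter keeps the helper easy to test.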

end

Webhook Signature Verification

Lemon Squeezy Webhooks

Every webhook is signed with HMAC-SHA256:

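import { createHmac, timingSafeEqual } from "node:crypto";

function verifySignature(rawBody: string, header: string | null): boolean {
  if (!header) return false; // missing signature header
  const expected = createHmac("sha256", lsEnv.LEMONSQUEEZY_WEBHOOK_SECRET)
    .update(rawBody)
    .digest();
  const received = Buffer.from(header, "hex");
  // timingSafeEqual throws on unequal lengths, so compare lengths first
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}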

Process

›Verify signature first (before any JSON parsing)
›Parse JSON
›Check idempotency (webhook_events.body_hash)
›Process event
›Mark as processed

Design: body_hash is SHA-256 of raw request body. Retries send identical bytes → same hash → rejected as duplicate.
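
The ordering in the process list can be sketched end to end with Node's built-in crypto module. This is a minimal sketch under assumptions: the `Set` stands in for the webhook_events table, and `sign`, `handleWebhook`, and the `Outcome` type are hypothetical names used only for illustration.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

// In-memory stand-in for webhook_events.body_hash (hypothetical).
const seenHashes = new Set<string>();

// Produces the hex HMAC-SHA256 signature a sender would attach to rawBody.
function sign(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody).digest("hex");
}

function isValidSignature(
  rawBody: string,
  header: string | null,
  secret: string,
): boolean {
  if (!header) return false;
  const expected = Buffer.from(sign(rawBody, secret), "hex");
  const received = Buffer.from(header, "hex");
  return (
    received.length === expected.length && timingSafeEqual(received, expected)
  );
}

interface Outcome {
  status: "rejected" | "duplicate" | "processed";
  eventName?: string;
}

function handleWebhook(
  rawBody: string,
  header: string | null,
  secret: string,
): Outcome {
  // 1. Verify the signature against the raw body, before any JSON parsing.
  if (!isValidSignature(rawBody, header, secret)) {
    return { status: "rejected" };
  }
  // 2. Parse only after the body is known to be authentic.
  const payload = JSON.parse(rawBody) as { meta?: { event_name?: string } };
  const eventName = payload.meta?.event_name;
  // 3. Idempotency: SHA-256 of the raw body is the dedup key.
  const bodyHash = createHash("sha256").update(rawBody).digest("hex");
  if (seenHashes.has(bodyHash)) {
    return { status: "duplicate", eventName };
  }
  seenHashes.add(bodyHash);
  // 4 and 5. Processing the event and marking it processed would go here.
  return { status: "processed", eventName };
}
```

Hashing the raw body rather than the parsed JSON matters: re-serialising the payload could change key order or whitespace and produce a different hash for the same delivery.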

end

Input Validation

Zod 4 No-Coercion Strategy

Every Server Action validates inputs with Zod .safeParse():

const parsed = runAgentSchema.safeParse({ prompt, modelId, ... });
if (!parsed.success) {
  return errorStream(`Invalid input: ${parsed.error.message}`);
}

No Silent Coercion: Invalid types produce a validation error rather than being converted, so the user gets explicit feedback.

Strict Mode: Tool input schemas use .strict() to reject unknown fields.

const inputSchema = z.object({
  query: z.string().min(1),
}).strict();  // ← rejects { query: "...", extra: "field" }

end

Deployment & Environment

Required Environment Variables

Supabase

NEXT_PUBLIC_SUPABASE_URL=https://...supabase.co
NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY=sb_publishable_...  # Not used (Server Actions only)
SUPABASE_SECRET_KEY=sb_secret_...                        # Service-role client

Auth.js

AUTH_SECRET=<generated_via_npx_auth_secret>

App Configuration

NEXT_PUBLIC_APP_URL=https://boileragent.dev  # For Lemon Squeezy redirects
NEXT_PUBLIC_SITE_NAME=AgentZero

Model & Embedding

DEEPSEEK_API_KEY=<api_key>
EMBEDDING_PROVIDER=openai  # or cohere, nvidia
COHERE_API_KEY=<api_key>    # If using Cohere
NVIDIA_NIMS_BASE_URL=...   # If using NVIDIA NIMs
NVIDIA_NIMS_API_KEY=...

Tools

TAVILY_API_KEY=<api_key>
RESEND_API_KEY=<api_key>    # For email (future)

Billing

LEMONSQUEEZY_API_KEY=<api_key>
LEMONSQUEEZY_WEBHOOK_SECRET=<webhook_secret>
LEMONSQUEEZY_STORE_ID=<numeric_id>
LEMONSQUEEZY_PRO_VARIANT_ID=<numeric_id>
LEMONSQUEEZY_FOUNDING_VARIANT_ID=<numeric_id>

Optional

RAG_MATCH_THRESHOLD=0.1     # Similarity threshold for semantic search (default 0.1)
NVIDIA_API_KEY=...          # NVIDIA API for title generation (future)
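
How RAG_MATCH_THRESHOLD might be read and applied can be sketched as follows. The helper names and the `ChunkMatch` shape are hypothetical; treating values outside [0, 1] as invalid and keeping matches at or above the threshold are assumptions of this sketch, not confirmed behaviour.

```typescript
interface ChunkMatch {
  content: string;
  similarity: number;
}

// Reads RAG_MATCH_THRESHOLD, falling back to the documented default of 0.1
// when the variable is unset, empty, or not a number in [0, 1].
function parseMatchThreshold(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return 0.1;
  const value = Number(raw);
  if (Number.isNaN(value) || value < 0 || value > 1) return 0.1;
  return value;
}

// Keeps only chunks whose similarity reaches the threshold.
function filterMatches(matches: ChunkMatch[], threshold: number): ChunkMatch[] {
  return matches.filter((match) => match.similarity >= threshold);
}
```

A low default such as 0.1 favours recall: most chunks pass, and ranking decides what reaches the prompt.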

end

Local Development

# Clone repo
git clone https://github.com/...

# Install deps
npm install

# Set up .env.local (copy .env.example, fill in values)
cp .env.example .env.local

# Start Supabase locally (optional, for quick iteration)
npx supabase start

# Run Next.js dev server
npm run dev

# Open http://localhost:3000

end

Production Deployment

Vercel (Recommended)

›Connect GitHub repo to Vercel
›Set environment variables in Vercel dashboard
›Deploy: git push origin main

Self-Hosted (Docker)

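FROM node:20-alpine
WORKDIR /app
# Copy manifests first so the dependency layer is cached across source changes
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]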

docker build -t agentzero .
docker run -e NEXT_PUBLIC_SUPABASE_URL=... -e SUPABASE_SECRET_KEY=... -p 3000:3000 agentzero

end

Database Migrations

Running Migrations

# Push migrations to a Supabase project
npx supabase db push

# Or manually apply SQL files to your Postgres instance
psql -U postgres -d your_db -f supabase/migrations/20260325000000_multi_tenant_schema.sql

end

Appendix: Design Patterns & Trade-offs

1. Server Actions Over API Routes

Choice: Use Server Actions exclusively; no /api/chat or /api/agents routes.

Rationale

›DX: Seamless React integration via useActionState, no fetch boilerplate
›Type Safety: TypeScript types flow directly from Server Action return type to client
›React 19 Native: AsyncIterable<T> serialization is built-in, no ai/rsc wrapper needed
›Simpler Auth: No need to manually extract JWT from headers (middleware handles it)
›One Language: Reduces cognitive load (not toggling between REST conventions and React code)

Trade-off: Cannot use standard REST tooling (curl, Postman) for debugging. Compensate with server-side logging.

end

2. JWT Sessions Without Session Database

Choice: Stateless JWT tokens with userId and orgId baked in.

Rationale

›Zero DB Lookups on Auth: await auth() decodes JWT in ~1ms, no Supabase roundtrip
›Stateless Scaling: Vercel serverless can scale without shared session store
›No Refresh Logic: Token refresh is not implemented, so expired tokens require re-login (acceptable for v1)

Trade-off: Cannot revoke tokens before expiry (future: add a blacklist table if needed).

end

3. Service-Role Client + Application-Layer Filtering

Choice: Use adminClient (service-role key) everywhere; rely on application code to filter by orgId.

Rationale

›Simpler Code: A single client path, with no branching between a service-role client and a user-scoped RLS client
›RLS as Secondary Defense: RLS policies exist for defense-in-depth
›Easier Testing: Can test auth logic in isolation without mocking RLS

Trade-off: Must be extremely disciplined about filtering by orgId. A single missed filter is a data leak.
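
One way to reduce the risk of a missed filter is to route reads through a wrapper that applies the org filter itself. The sketch below uses an in-memory array in place of Supabase; `OrgScopedTable` and the `OrgOwned` row shape are hypothetical and only illustrate the idea.

```typescript
// Hypothetical row shape; any org-owned table has these two columns.
interface OrgOwned {
  id: string;
  organisation_id: string;
}

// In-memory stand-in for a table accessed through adminClient.
class OrgScopedTable<T extends OrgOwned> {
  private readonly rows: readonly T[];
  private readonly orgId: string;

  constructor(rows: readonly T[], orgId: string) {
    this.rows = rows;
    this.orgId = orgId;
  }

  // Every read goes through this filter, so callers cannot forget it.
  private visible(): T[] {
    return this.rows.filter((row) => row.organisation_id === this.orgId);
  }

  findById(id: string): T | undefined {
    return this.visible().find((row) => row.id === id);
  }

  list(): T[] {
    return this.visible();
  }
}
```

With a wrapper like this, the orgId is supplied once from the session when the wrapper is built, and a forged id from another org simply returns nothing.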

end

4. Optimistic Credit Deduction + Rollback

Choice: Deduct credits before streaming; rollback on error.

Rationale

›Prevents Credit Drain on Failures: Rolling back a failed run means the user is not charged for it
›Clear Audit Trail: credits_used tracks consumption; credits_remaining tracks balance
›Refund Edge Case: If refund amount > current balance, flag needs_manual_review (ops team reconciles)

Trade-off: If the rollback RPC fails (unlikely but possible), we must log loudly and ops must reconcile manually.
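
The deduct, run, and rollback sequence, including the rollback-failure case, can be sketched as a wrapper. This is illustrative only: `CreditStore` and `runWithCredit` are hypothetical names, and the sketch treats the agent run as a single promise, whereas a real streamed run can also fail partway through the stream.

```typescript
// Hypothetical interface over the credit RPCs.
interface CreditStore {
  deduct(userId: string, amount: number): Promise<void>;
  rollback(userId: string, amount: number): Promise<void>;
}

async function runWithCredit<T>(
  store: CreditStore,
  userId: string,
  run: () => Promise<T>,
  logError: (message: string, cause: unknown) => void = console.error,
): Promise<T> {
  // Deduct first, so a run never starts without a credit behind it.
  await store.deduct(userId, 1);
  try {
    return await run();
  } catch (runError) {
    try {
      await store.rollback(userId, 1);
    } catch (rollbackError) {
      // The user has been charged for a failed run; surface it for ops.
      logError(`CREDIT_ROLLBACK_FAILED user=${userId}`, rollbackError);
    }
    // The original failure is always what the caller sees.
    throw runError;
  }
}
```

Rethrowing the run error rather than the rollback error keeps the user-facing failure accurate while the log line carries the reconciliation signal.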

end

5. RAG Context Pre-Injection + Mid-Loop Knowledge Search

Choice: Inject top semantic chunks before streaming starts; agent can call knowledgeSearchTool mid-reasoning to fetch more.

Rationale

›Cost Optimization: Pre-injection avoids redundant embedding calls mid-loop
›Dual Retrieval: Gives agent flexibility to pull context as reasoning evolves
›Cache Hit: 'use cache' on fetchRagContext memoizes embeddings for 60 seconds

Trade-off: If document is uploaded and immediately queried, embedding may not be ready. Graceful degradation: omit context, agent can retry manually.

end

6. Agent-Scoped vs. Org-Scoped Documents

Choice: Documents can be linked to an agent (agent_id) or org (agent_id = NULL).

Rationale

›Flexibility: Team-wide knowledge base (org-level) + agent-specific context (agent-level)
›Query Routing: semanticSearchForAgent() checks agent_id first; falls back to org-level docs

Trade-off: Adds schema complexity (nullable FK, two RPC functions). Mitigated by clear naming.
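
The routing rule can be illustrated with a plain filter over document rows. This sketch reads "falls back" as "use org-level documents only when the agent has none of its own", which is one interpretation; `documentsForAgent` and the `DocumentRow` shape are hypothetical, and the real semanticSearchForAgent() works on embeddings rather than whole rows.

```typescript
interface DocumentRow {
  id: string;
  organisation_id: string;
  agent_id: string | null; // null marks an org-level document
}

function documentsForAgent(
  docs: DocumentRow[],
  orgId: string,
  agentId: string,
): DocumentRow[] {
  // Never look outside the caller's organisation.
  const inOrg = docs.filter((doc) => doc.organisation_id === orgId);
  // Prefer documents attached to this agent.
  const agentDocs = inOrg.filter((doc) => doc.agent_id === agentId);
  if (agentDocs.length > 0) return agentDocs;
  // Otherwise fall back to the org-level knowledge base.
  return inOrg.filter((doc) => doc.agent_id === null);
}
```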

end

7. Next.js 16 Partial Prerendering (PPR) + React Compiler

Choice: Enable cacheComponents: true and reactCompiler: true in next.config.ts.

Rationale

›Performance: PPR precomputes static parts, streams dynamic parts
›Compiler Optimization: React 19 compiler memoizes components, reducing re-renders

Trade-off: Both are relatively new (PPR is stable in Next.js 16, and the React Compiler only reached 1.0 in late 2025). Monitor for edge cases.

end

8. Zod v4 No-Coercion Validation

Choice: Validate with schema.safeParse() and strict object schemas; never silently coerce.

Rationale

›Explicit Errors: User gets clear feedback if they submit malformed data
›Security: Prevents type confusion attacks (e.g., passing a string where an array is expected)
›Debugging: Each validation issue names the failing field path, so errors point at the bad input rather than at downstream bugs

Trade-off: Slightly more verbose error messages if input is malformed.

end

9. Embedding Provider Pluggability

Choice: Switch providers via EMBEDDING_PROVIDER env var; support OpenAI, Cohere, NVIDIA NIMs.

Rationale

›Cost Flexibility: Cohere has a free tier; NVIDIA NIMs can be self-hosted; OpenAI is production-grade
›Lock-in Avoidance: Not forced to stick with one provider long-term

Trade-off: Dimension mismatch on switching requires re-embedding + migration. Document the procedure clearly.

end

10. Lemon Squeezy Webhook Idempotency

Choice: Use webhook_events.body_hash as the idempotency key (SHA-256 of raw request body).

Rationale

›Natural Dedup: LS retries send identical bytes; same hash → reject as duplicate
›Two-Layer Idempotency: body_hash (webhook level) + founding_grants.ls_order_id (business level)

end

Glossary

end
