Platform Semantics (Phase 1)

HelioDB is currently a single-node system: SQLite storage, in-memory pub/sub, WebSocket JSON-RPC, and TypeScript/React clients. This page captures the behavioral contract that all backends must uphold and the known gaps on the path to multi-backend support.

Current Scope

  • Storage: SQLite (immediate durability). durable: true is a no-op for now.
  • Transport: WebSocket JSON-RPC (/ws); clients authenticate with an initial auth message.
  • Clients: TypeScript (@heliodb/client, @heliodb/react, @heliodb/contract).
  • Deployment: Single process, single machine; no sharding or replicas.

Event Model

  • ID: event_<ULID> (time-sortable). IDs are generated per node; monotonicity holds only within a single process.
  • Fields: namespace, resource, subject, event_type, data, metadata?, created_at (server timestamp); see the type sketch after this list.
  • Namespaces: Isolation boundary. Reads/subscribes do not cross namespaces.
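
For reference, the event shape above can be written down as a TypeScript type. This is a sketch: the field names come from this page, but the type name and the concrete TypeScript types (e.g., created_at as a string) are assumptions, not the published @heliodb/contract definitions.

    // Sketch of the event shape described above; the name HelioEvent and the
    // concrete field types are assumptions, not the published contract.
    type HelioEvent = {
      event_id: string;                   // "event_<ULID>", time-sortable
      namespace: string;                  // isolation boundary for reads/subscribes
      resource: string;                   // slash-delimited path, e.g. "workspace/123/threads/456"
      subject: string;
      event_type: string;
      data: unknown;                      // validated client-side (see Validation & Schemas)
      metadata?: Record<string, unknown>;
      created_at: string;                 // server timestamp
    };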

Resources & Filters

  • resource is a slash-delimited path (e.g., workspace/123/threads/456).
  • * is allowed in any segment (e.g., workspace/*/threads).
  • exact: true matches only resources at exactly the pattern's depth (wildcards still match a single segment).
  • exact: false (the default) is prefix matching: the pattern matches itself and any deeper descendants.
  • Wildcards work with both exact and prefix modes.
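
The matching rules above are small enough to state in code. A minimal sketch (not the server's implementation): * matches exactly one segment, exact compares depth, and prefix mode also accepts deeper descendants.

    // Sketch of the resource-matching rules above (not the server's implementation).
    function matchesResource(pattern: string, resource: string, exact: boolean): boolean {
      const p = pattern.split("/");
      const r = resource.split("/");
      if (exact ? r.length !== p.length : r.length < p.length) return false;
      return p.every((seg, i) => seg === "*" || seg === r[i]);
    }

    matchesResource("workspace/*/threads", "workspace/123/threads", true);      // true
    matchesResource("workspace/*/threads", "workspace/123/threads/456", true);  // false: deeper than pattern
    matchesResource("workspace/*/threads", "workspace/123/threads/456", false); // true: prefix mode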

Ordering & Cursors

  • Append visibility: log() is synchronous; events are visible to reads/subscribes immediately after insert (read-your-writes).
  • Ordering: Within a resource, events are ordered by event_id (ULID). Cross-resource ordering is not defined.
  • Cursors are exclusive: a cursor means “start after this ID.” Forward reads/subscribes return events with id > cursor (see the paging sketch after this list).
  • Subscribe = history + live: Server fetches historical events forward from cursor (or beginning), then streams live events over in-memory broadcast.
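
Exclusive cursors make resumable paging straightforward. A minimal sketch, assuming a hypothetical client handle whose read() options follow this page's terminology:

    // Hypothetical client handle; read() option names follow this page's terminology.
    declare const client: {
      read(opts: { resource: string; cursor?: string; limit?: number }): Promise<{ event_id: string }[]>;
    };

    async function readAll(resource: string): Promise<void> {
      let cursor: string | undefined; // undefined = start from the beginning
      for (;;) {
        const page = await client.read({ resource, cursor, limit: 100 });
        for (const event of page) {
          // ... process event ...
        }
        if (page.length < 100) break;             // short page = end of history
        cursor = page[page.length - 1].event_id;  // exclusive: next read starts after this ID
      }
    }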

Subscriptions

  • Unified stream: subscribe() streams both historical and live events in order. Use read() first if you need to distinguish between replay and live.
  • Pattern: read() + subscribe({ start: lastId }) is the recommended pattern for agents that need to know their history state before processing live events (see the sketch after this list).
  • Fanout/matching: In-memory broadcast keyed by (namespace, pattern, exact), with wildcard matching on * segments and prefix vs exact semantics as above. No persistence; subscriptions disappear when connections drop.
  • Backpressure/lag: If a subscriber lags and drops messages from the broadcast buffer, the server replays from the last delivered ID via storage. If that catch-up fails, the stream ends.
  • Filtering: Optional event_types and subject filters applied server-side during subscription and catch-up.
  • Delivery: At-least-once for live streams under lag (catch-up may re-send; consumers should handle idempotently).
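
A sketch of the read() + subscribe({ start: lastId }) pattern, with idempotent handling for the at-least-once case. The client handle and the async-iterable subscription shape are assumptions; only the option name start and the method names come from this page.

    // Hypothetical client handle; subscribe() as an async iterable is an assumption.
    declare const client: {
      read(opts: { resource: string }): Promise<{ event_id: string }[]>;
      subscribe(opts: { resource: string; start?: string }): AsyncIterable<{ event_id: string }>;
    };

    const resource = "workspace/123/threads";
    const history = await client.read({ resource });
    // ... rebuild state from history here, before any live event is processed ...
    let lastId = history.at(-1)?.event_id;

    for await (const event of client.subscribe({ resource, start: lastId })) {
      // At-least-once: lag catch-up may re-send events already delivered. ULID-based
      // IDs sort lexicographically, so skipping IDs <= lastId drops duplicates.
      if (lastId !== undefined && event.event_id <= lastId) continue;
      lastId = event.event_id;
      // ... process live event ...
    }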

Reads

  • Direction: Forward by default; reverse is available for finite historical reads only.
  • Limits: Optional limit applies after cursor filtering.
  • Filters: resource (exact or prefix via include_children), subject (exact), event_types (any of set).
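
Putting the read options together, a hedged sketch (option names from the list above; the call shape and the example values are hypothetical):

    // Hypothetical client handle; option names follow the list above.
    declare const client: {
      read(opts: {
        resource: string;
        include_children?: boolean;
        subject?: string;
        event_types?: string[];
        reverse?: boolean;
        limit?: number;
        cursor?: string;
      }): Promise<unknown[]>;
    };

    const recent = await client.read({
      resource: "workspace/123",
      include_children: true,                  // prefix: include deeper descendants
      subject: "user_42",                      // exact subject match (value hypothetical)
      event_types: ["thread.created", "thread.updated"], // any-of set (names hypothetical)
      reverse: true,                           // newest first; finite historical reads only
      limit: 50,                               // applied after cursor filtering
    });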

Auth

  • WebSocket clients must send an auth message before any other request.
  • Modes:
    • JWT: Namespace/subject derived from claims; server validates signature with HELIODB_SECRET.
    • Shared secret (dev/demo): Token equals server secret; namespace/subject must be provided by client.
  • All subsequent methods execute within the authenticated namespace/subject.
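
Over the raw transport, the handshake looks roughly like this. Only the rule that auth comes first is from this page; the method name, parameter fields, and endpoint details are assumptions.

    // Sketch of the first message on /ws. Method and param names are assumptions;
    // the contract from this page is only that auth precedes everything else.
    const ws = new WebSocket("ws://localhost:8080/ws"); // host/port hypothetical
    ws.onopen = () => {
      ws.send(JSON.stringify({
        jsonrpc: "2.0",
        id: 1,
        method: "auth",
        params: {
          token: "<JWT or shared secret>",
          // Shared-secret mode only: namespace/subject must be supplied by the client.
          // namespace: "demo", subject: "user_42",
        },
      }));
    };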

Validation & Schemas

  • Client enforces schemas via Zod contracts. Server does not enforce schemas yet; it trusts the client’s data shape.
  • Identifiers (namespace, resource, event_type, subject) are validated server-side for format.
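
On the client, this typically means declaring a Zod schema per event type and validating data before logging. A minimal sketch using plain Zod; the @heliodb/contract helpers are not shown because their API isn't documented on this page, and the schema shape is hypothetical.

    import { z } from "zod";

    // Example event-data schema; the fields are hypothetical.
    const ThreadCreated = z.object({
      title: z.string().min(1),
      author: z.string(),
    });

    type ThreadCreatedData = z.infer<typeof ThreadCreated>;

    // The client validates before sending; the server currently trusts the shape.
    const data: ThreadCreatedData = ThreadCreated.parse({ title: "Hello", author: "user_42" });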

Known Gaps (Phase 1)

  • No persistent subscriptions (all live fanout is in-memory).
  • No server-side schema enforcement; no ACLs beyond namespace/subject on the token.
  • Only WebSocket; no REST/SSE/webhooks.
  • Only SQLite backend; no multi-node/sharded deployment.
  • Multi-modal payloads are stored inline as JSON today; blobs/object-store tier is not implemented yet.

Near-Term Priorities

  • Finalize semantics contract tests (shared suite every backend must pass).
  • Add additional storage backends (e.g., Postgres for portability/testing, ClickHouse for analytics) behind the same log/read/subscribe contract.
  • Introduce persistent subscriptions/queues for durable fanout.
  • Benchmarks for append latency, catch-up scans, and subscription lag handling across backends.