compassmock/docs/architecture/data-layer.md
Nicholai a7494397f2
docs(all): comprehensive documentation overhaul (#57)
Restructure docs/ into architecture/, modules/, and
development/ directories. Add thorough documentation
for Compass Core platform and HPS Compass modules.
Rewrite CLAUDE.md as a lean quick-reference that
points to the full docs. Rename files to lowercase,
consolidate old docs, add gotchas section.

Co-authored-by: Nicholai <nicholaivogelfilms@gmail.com>
2026-02-07 19:17:37 -07:00


Data Layer
===
Compass stores everything in Cloudflare D1, which is SQLite at the edge. The ORM is Drizzle, configured in `drizzle.config.ts` to read from 8 schema files. This document covers the schema design, how the database is accessed, and why D1 was chosen over Postgres.
why D1
---
The short answer: co-location. Compass runs as a Cloudflare Worker, and D1 is SQLite running in the same data center as the worker. Queries take single-digit milliseconds because there's no network round-trip to a database server. For a project management tool where every page load fires 3-5 queries, this matters.
The longer answer involves tradeoffs. D1 is SQLite, which means:
- No native JSON column type (we store JSON as text and parse in application code)
- No `RETURNING` clause on inserts (we generate IDs before inserting)
- No concurrent writes from multiple connections (fine for Workers, where each request gets its own isolate)
- No `ENUM` types (we use text columns with TypeScript union types)
What we get in return: zero configuration, no connection pooling, no cold start penalty, automatic replication, and a database that survives the worker being evicted. For an application that reads more than it writes and runs at the edge, the tradeoffs are favorable.
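As a concrete example of the `ENUM` workaround, here is a sketch of the text-column-plus-union pattern (the `Role` union mirrors the user roles listed below; the helper name is illustrative, not from the codebase):

```typescript
// D1/SQLite has no ENUM type, so role columns are plain text columns
// narrowed by a TypeScript union at the application boundary.
type Role = "admin" | "office" | "field" | "client"

const ROLES: readonly Role[] = ["admin", "office", "field", "client"]

// Narrow a raw text value read from the database before trusting it.
function parseRole(raw: string): Role {
  if ((ROLES as readonly string[]).includes(raw)) return raw as Role
  throw new Error(`unknown role: ${raw}`)
}
```

The database enforces nothing here; the invariant lives entirely in application code, which is the tradeoff being made.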
schema organization
---
The schema is split across 8 files, registered in `drizzle.config.ts`:
```typescript
// drizzle.config.ts
import { defineConfig } from "drizzle-kit"

export default defineConfig({
  schema: [
    "./src/db/schema.ts",
    "./src/db/schema-netsuite.ts",
    "./src/db/schema-plugins.ts",
    "./src/db/schema-agent.ts",
    "./src/db/schema-ai-config.ts",
    "./src/db/schema-theme.ts",
    "./src/db/schema-google.ts",
    "./src/db/schema-dashboards.ts",
  ],
  out: "./drizzle",
  dialect: "sqlite",
})
```
The split isn't arbitrary. Each file maps to a feature boundary:
**`schema.ts`** -- core tables. This is the foundation that everything else references.
- `users` -- WorkOS user IDs as primary keys, roles (admin/office/field/client), optional `google_email` for Drive impersonation
- `organizations`, `organizationMembers` -- multi-tenant org structure
- `teams`, `teamMembers` -- teams within orgs
- `groups`, `groupMembers` -- cross-org permission groups
- `projects`, `projectMembers` -- the central domain entity
- `scheduleTasks`, `taskDependencies` -- Gantt chart data
- `workdayExceptions`, `scheduleBaselines` -- calendar and snapshot tracking
- `customers`, `vendors` -- CRM basics, with optional `netsuite_id` columns
- `feedback`, `feedbackInterviews` -- user feedback and UX research
- `agentConversations`, `agentMemories` -- chat persistence
- `slabMemories` -- persistent memory (preferences, workflows, facts, decisions)
- `pushTokens` -- native app push notification tokens
**`schema-netsuite.ts`** -- NetSuite integration tables.
- `netsuiteAuth` -- encrypted OAuth tokens
- `netsuiteSyncMetadata` -- per-record sync tracking (status, conflicts, retries)
- `netsuiteSyncLog` -- sync run history
- `invoices`, `vendorBills`, `payments`, `creditMemos` -- financial records that sync bidirectionally
**`schema-plugins.ts`** -- plugin/skills system.
- `plugins` -- installed plugin registry (name, source, capabilities, status)
- `pluginConfig` -- per-plugin key-value settings (supports encrypted values)
- `pluginEvents` -- audit log of installs, enables, disables
**`schema-agent.ts`** -- agent items (todos, notes, checklists created by the AI).
- `agentItems` -- type-polymorphic items with parent-child relationships
**`schema-ai-config.ts`** -- AI model configuration and usage tracking.
- `agentConfig` -- singleton config row (global model selection, cost ceiling)
- `userModelPreference` -- per-user model override
- `agentUsage` -- token counts and cost estimates per `streamText` invocation
**`schema-theme.ts`** -- visual themes.
- `customThemes` -- user-created themes (full oklch color maps stored as JSON)
- `userThemePreference` -- which theme each user has active
**`schema-google.ts`** -- Google Drive integration.
- `googleAuth` -- encrypted service account keys per organization
- `googleStarredFiles` -- per-user starred Drive files
**`schema-dashboards.ts`** -- agent-built custom dashboards.
- `customDashboards` -- saved dashboard specs (JSON), queries, and render prompts
design conventions
---
Every table follows the same patterns:
**Text IDs.** Primary keys are text columns containing UUIDs, generated via `crypto.randomUUID()` before insert. Not auto-increment integers. This avoids sequential ID exposure and works across distributed systems.
```typescript
import { sqliteTable, text } from "drizzle-orm/sqlite-core"

export const customers = sqliteTable("customers", {
  id: text("id").primaryKey(),
  name: text("name").notNull(),
  // ...
  createdAt: text("created_at").notNull(),
  updatedAt: text("updated_at"),
})
```
**Text dates.** All timestamps are ISO 8601 strings stored in text columns, not integer unix timestamps. This makes them human-readable in raw queries and avoids timezone confusion. The application always generates them with `new Date().toISOString()`.
**Explicit foreign keys.** References use Drizzle's `.references()` with explicit `onDelete` behavior. Most use `cascade` for cleanup; some use `no action` for soft-reference relationships.
```typescript
import { sqliteTable, text } from "drizzle-orm/sqlite-core"

export const organizationMembers = sqliteTable("organization_members", {
  id: text("id").primaryKey(),
  organizationId: text("organization_id")
    .notNull()
    .references(() => organizations.id, { onDelete: "cascade" }),
  userId: text("user_id")
    .notNull()
    .references(() => users.id, { onDelete: "cascade" }),
  role: text("role").notNull(),
  joinedAt: text("joined_at").notNull(),
})
```
**Type exports.** Every table exports both `Select` and `Insert` types via Drizzle's `$inferSelect` and `$inferInsert`:
```typescript
export type Customer = typeof customers.$inferSelect
export type NewCustomer = typeof customers.$inferInsert
```
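To illustrate how the inferred types flow through application code, here is a sketch using a hand-written stand-in for the inferred shape (the fields shown are illustrative):

```typescript
// Stand-in for `typeof customers.$inferSelect` in the real codebase.
type Customer = { id: string; name: string; netsuiteId: string | null }

// Read paths accept the Select type, so a renamed or dropped column
// surfaces as a compile error at every call site.
function displayName(c: Customer): string {
  return c.netsuiteId ? `${c.name} (NetSuite ${c.netsuiteId})` : c.name
}
```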
**JSON-as-text.** Complex data (theme definitions, dashboard specs, sync conflict data, line items) is stored as JSON-stringified text. This is a D1 constraint -- there's no native JSON column type. Parsing happens in application code.
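A minimal sketch of that parse/stringify boundary (helper names are illustrative, not from the codebase):

```typescript
// D1 stores JSON as text, so (de)serialization lives in application code.
function toJsonText<T>(value: T): string {
  return JSON.stringify(value)
}

function fromJsonText<T>(text: string | null): T | null {
  return text === null ? null : (JSON.parse(text) as T)
}
```

The type parameter documents intent but is not validated at runtime: a malformed JSON column throws at parse time, and a well-formed one is trusted as `T`.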
accessing the database
---
The database is accessed through `getDb()` in `src/db/index.ts`, which wraps Drizzle's `drizzle()` function with all schema files merged:
```typescript
import { drizzle } from "drizzle-orm/d1"
import * as schema from "./schema"
import * as netsuiteSchema from "./schema-netsuite"
import * as pluginSchema from "./schema-plugins"
import * as agentSchema from "./schema-agent"
import * as aiConfigSchema from "./schema-ai-config"
import * as themeSchema from "./schema-theme"
import * as googleSchema from "./schema-google"
import * as dashboardSchema from "./schema-dashboards"

const allSchemas = {
  ...schema,
  ...netsuiteSchema,
  ...pluginSchema,
  ...agentSchema,
  ...aiConfigSchema,
  ...themeSchema,
  ...googleSchema,
  ...dashboardSchema,
}

export function getDb(d1: D1Database) {
  return drizzle(d1, { schema: allSchemas })
}
```
The `D1Database` binding comes from the Cloudflare runtime. In server actions, you get it via:
```typescript
const { env } = await getCloudflareContext()
const db = getDb(env.DB)
```
The `getDb()` function is called per-request -- it creates a new Drizzle instance each time. This is intentional. Cloudflare Workers are stateless isolates; there's no persistent connection to reuse. The cost of creating the Drizzle wrapper is negligible compared to the query itself.
migration workflow
---
Migrations are generated by Drizzle Kit and stored in the `drizzle/` directory. The workflow:
1. Edit schema files in `src/db/`
2. Run `bun run db:generate` -- Drizzle Kit diffs the schema against the last migration and generates a new SQL migration file
3. Run `bun run db:migrate:local` -- applies migrations to the local D1 database (via Wrangler)
4. Run `bun run db:migrate:prod` -- applies migrations to the production D1 database
Migrations are append-only. Never modify an existing migration file. If a migration is wrong, generate a new one that corrects it.
The generated SQL files are plain SQLite DDL. No Drizzle-specific runtime is needed to apply them -- Wrangler's `d1 migrations apply` handles execution directly. This means the migration system is ORM-portable: if you swapped Drizzle for something else, the existing migrations would still work.
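The bodies of the `db:*` scripts are not spelled out in this document, but given the tools above they plausibly wrap Drizzle Kit and Wrangler along these lines (a hedged sketch; the script bodies and the `DB` database name are assumptions, not taken from the repo):

```json
{
  "scripts": {
    "db:generate": "drizzle-kit generate",
    "db:migrate:local": "wrangler d1 migrations apply DB --local",
    "db:migrate:prod": "wrangler d1 migrations apply DB --remote"
  }
}
```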