clawdbot-workspace/proposals/2026-02-17-ai-finance-app.md
2026-02-17 23:03:48 -05:00


# AI Agentic Finance App — Proposal Draft
**Job:** Full-Stack Developer for AI Agentic Finance App
**URL:** https://www.upwork.com/jobs/~022023834745093778483
**Budget:** $10,000 Fixed
**Client:** Joshua, Orlando FL, 5.0★, $10K spent, 8 jobs, 75% hire rate, member since Nov 2020
**Connects:** 32 (we have 60)
**Proposals:** 20-50, 1 interviewing
**Avg Bid:** $10,017

---
## Cover Letter
Hi Joshua,
Your PRD is one of the best-written job postings I've seen on Upwork. The fact that you've already separated deterministic data from the agentic AI layer tells me you understand the core problem most fintech apps get wrong: they let the AI touch the numbers. That's exactly how you get hallucinated balances and broken trust.
I build production AI agent systems — specifically the kind where LLMs need to operate safely on structured, auditable data. My recent work includes:
- **30+ MCP (Model Context Protocol) server integrations** — giving AI agents secure, typed access to external APIs (CRMs, databases, financial tools) with strict read/write separation and function calling
- **AI-powered real estate disposition platform** — Supabase + Next.js + Stripe, multi-tenant with RLS, handling live property data where accuracy is non-negotiable
- **Autonomous AI agent infrastructure** — agents that propose actions, require human confirmation, and maintain full audit trails before any write operation
Your architecture maps directly to what I build daily: deterministic source-of-truth data → AI reads it → proposes actions via function calling → user confirms → auditable write. I've implemented this exact pattern multiple times.
**What I bring:**
- Deep Supabase experience (Postgres, RLS, Edge Functions, real-time subscriptions)
- Production OpenAI function calling with guardrails (citation enforcement, hallucination prevention via schema validation)
- Plaid integration experience (webhook idempotency, transaction reconciliation)
- React Native frontend wiring (connecting pre-built components to live backends)
- RevenueCat subscription management and webhook handling
**Proposed Milestone Breakdown:**
1. **Foundation & Auth** (Week 1) — $1,500
   - Supabase project setup, multi-tenant schema, RLS policies
   - User auth flow, tenant isolation verification
   - Database schema for all core entities with audit trail tables
2. **Plaid Integration** (Week 2) — $2,000
   - Plaid Link flow in React Native
   - Transaction sync with idempotent webhook handling
   - Pending → posted reconciliation, duplicate suppression
   - Connection health monitoring
3. **Agentic AI Layer** (Week 3) — $2,500
   - OpenAI Edge Functions with function calling
   - Citation-enforced responses (every number → transaction ID)
   - Action proposal system (categorize, budget, bills)
   - Confirmation flow integration
   - Full AI interaction logging
4. **Subscriptions, Attribution & Frontend Wiring** (Week 4) — $2,000
   - RevenueCat integration + webhooks
   - Meta Pixel SDK events
   - Connect all React Native components to Supabase
   - Real-time updates, loading states, error handling
   - Confirmation modals for AI actions
5. **Testing, Docs & Deployment** (Week 5) — $2,000
   - Unit/integration tests for critical paths
   - Full documentation (schema, functions, deployment, RLS)
   - Production environment setup
   - Security audit and handoff
**Total: $10,000 | Timeline: 5 weeks**
I'm available to start immediately and can collaborate daily via Slack. I'm looking for long-term partnerships exactly like this — where the first project proves the relationship and opens the door to ongoing work.
Happy to jump on a call to discuss the architecture in detail. I have questions about your Plaid environment (sandbox vs development vs production) and your current Supabase project setup that would help me hit the ground running.
Portfolio: https://portfolio.mcpengage.com

---
## Screening Question Answers
### 1. Deterministic Data Design (Immutable Transaction Correction)
A transaction from 2 weeks ago needs correction without deleting or overwriting the original. Here's how I handle this:
**Schema:**
```sql
-- Immutable transactions table (append-only)
CREATE TABLE transactions (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id UUID NOT NULL REFERENCES tenants(id),
  plaid_transaction_id TEXT,
  amount DECIMAL(12,2) NOT NULL,
  merchant_name TEXT,
  category_id UUID REFERENCES categories(id),
  date DATE NOT NULL,
  status TEXT NOT NULL DEFAULT 'posted', -- pending | posted | corrected
  is_current BOOLEAN NOT NULL DEFAULT true,
  version INT NOT NULL DEFAULT 1,
  superseded_by UUID REFERENCES transactions(id),
  row_hash TEXT NOT NULL, -- SHA-256 of amount + date + merchant + category
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Every change gets an audit record
CREATE TABLE transaction_audit_log (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id UUID NOT NULL,
  transaction_id UUID NOT NULL REFERENCES transactions(id),
  action TEXT NOT NULL, -- 'correction' | 'categorization' | 'split' | 'merge'
  previous_version INT NOT NULL,
  new_version INT NOT NULL,
  changed_fields JSONB NOT NULL, -- {"amount": {"old": 45.00, "new": 42.50}}
  reason TEXT, -- "Bank correction via Plaid webhook"
  initiated_by TEXT NOT NULL, -- 'system' | 'user' | 'ai_confirmed'
  ai_proposal_id UUID REFERENCES ai_action_proposals(id), -- if AI-initiated
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```
**Correction flow:**
1. Original transaction (id: `abc`, version: 1) is marked `is_current = false`, `status = 'corrected'`
2. New transaction (id: `def`, version: 2) is inserted with corrected values, `superseded_by` on the original points to `def`
3. Audit log records the change with both old and new values, who initiated it, and why
4. All queries filter on `is_current = true` by default, but the full history is always available
5. A `row_hash` column (SHA-256 of amount + date + merchant + category) enables detecting retroactive changes
The original record is never deleted; only its `is_current` and `superseded_by` fields are updated. This gives you a complete, verifiable history for every transaction.
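To make the versioning step concrete, here is a minimal in-memory sketch in TypeScript. The types and names are illustrative, not the production code; in practice this logic runs inside an Edge Function with both mutations in a single Postgres transaction.

```typescript
// Hypothetical sketch of the append-only correction pattern.
// Field names mirror the schema above; IDs are illustrative.
interface TxVersion {
  id: string;
  amount: number;
  version: number;
  isCurrent: boolean;
  supersededBy: string | null;
}

interface AuditEntry {
  transactionId: string;
  previousVersion: number;
  newVersion: number;
  changedFields: Record<string, { old: number; new: number }>;
  reason: string;
}

// Returns the new current version plus the audit entry. The original
// row is only touched to flip is_current and set superseded_by.
function applyCorrection(
  original: TxVersion,
  newId: string,
  correctedAmount: number,
  reason: string
): { superseded: TxVersion; corrected: TxVersion; audit: AuditEntry } {
  const corrected: TxVersion = {
    id: newId,
    amount: correctedAmount,
    version: original.version + 1,
    isCurrent: true,
    supersededBy: null,
  };
  const superseded: TxVersion = {
    ...original,
    isCurrent: false,
    supersededBy: newId,
  };
  const audit: AuditEntry = {
    transactionId: original.id,
    previousVersion: original.version,
    newVersion: corrected.version,
    changedFields: { amount: { old: original.amount, new: correctedAmount } },
    reason,
  };
  return { superseded, corrected, audit };
}
```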

---
### 2. Multi-Tenancy & RLS
When a new tenant is created, here's the exact flow:
**Step 1: Database design** — Every table has a `tenant_id` column as the first foreign key:
```sql
CREATE TABLE tenants (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name TEXT NOT NULL,
  created_at TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE user_profiles (
  id UUID PRIMARY KEY REFERENCES auth.users(id),
  tenant_id UUID NOT NULL REFERENCES tenants(id),
  role TEXT NOT NULL DEFAULT 'member'
);
```
**Step 2: Auth hook** — On user signup, a trigger creates or assigns tenant:
```sql
CREATE OR REPLACE FUNCTION handle_new_user()
RETURNS TRIGGER AS $$
BEGIN
  INSERT INTO tenants (id, name) VALUES (NEW.id, NEW.raw_user_meta_data->>'name');
  INSERT INTO user_profiles (id, tenant_id) VALUES (NEW.id, NEW.id);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

CREATE TRIGGER on_auth_user_created
  AFTER INSERT ON auth.users
  FOR EACH ROW EXECUTE FUNCTION handle_new_user();
```
**Step 3: Helper function** to get current user's tenant:
```sql
CREATE OR REPLACE FUNCTION get_tenant_id()
RETURNS UUID AS $$
  SELECT tenant_id FROM user_profiles WHERE id = auth.uid()
$$ LANGUAGE sql STABLE SECURITY DEFINER;
```
**Step 4: RLS policies** on every data table:
```sql
ALTER TABLE transactions ENABLE ROW LEVEL SECURITY;

CREATE POLICY "tenant_isolation" ON transactions
  FOR ALL
  USING (tenant_id = get_tenant_id())
  WITH CHECK (tenant_id = get_tenant_id());
```
**Step 5: Verification** — I test isolation by creating two test tenants and confirming Tenant A cannot see Tenant B's data through any query path (direct, join, function, real-time subscription). I also add an index on `tenant_id` for every table to ensure RLS doesn't kill performance.
Even though you're launching single-tenant, this foundation means adding tenants later is literally just creating a new row in the tenants table.

---
### 3. AI Hallucination Prevention + Determinism
My approach has three layers:
**Layer 1: Prompt architecture**
The system prompt explicitly constrains the AI:
```
You are a financial assistant. You ONLY answer based on the user's actual transaction data provided below.
Rules:
- Every dollar amount must reference a specific transaction_id
- If you cannot find the answer in the provided data, say "I don't have enough data to answer that"
- Never estimate, approximate, or infer amounts not present in the data
- When summing transactions, list each transaction_id included in the sum
```
The user's actual transaction data is injected into the context as structured JSON — not free text. Each transaction includes its `id`, `amount`, `merchant`, `date`, and `category`.
**Layer 2: Function calling with data gates**
Instead of putting all transactions in context (which hits token limits), I use function calling to let the AI request specific data:
```json
{
  "name": "get_transactions",
  "parameters": {
    "date_from": "2026-01-01",
    "date_to": "2026-01-31",
    "category": "coffee",
    "limit": 50
  }
}
```
The Edge Function executes this against the deterministic database and returns actual records. The AI can only see what the database returns.
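As a sketch of that gate, here is an in-memory stand-in for the SQL query the Edge Function would run. The argument names mirror the function definition above; everything else is illustrative.

```typescript
// Hypothetical data gate: the Edge Function parses the AI's function
// call and runs a deterministic query. The model only ever sees what
// this returns — never raw access to the database.
interface Tx {
  id: string;
  date: string; // ISO date, so string comparison matches date order
  category: string;
  amount: number;
}

interface GetTransactionsArgs {
  date_from: string;
  date_to: string;
  category?: string;
  limit?: number;
}

function getTransactions(rows: Tx[], args: GetTransactionsArgs): Tx[] {
  return rows
    .filter((t) => t.date >= args.date_from && t.date <= args.date_to)
    .filter((t) => !args.category || t.category === args.category)
    .slice(0, args.limit ?? 50); // default limit mirrors the example call
}
```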
**Layer 3: Backend validation before response**
Before sending the AI's response to the user, I run a validation step:
1. Extract all transaction IDs and amounts mentioned in the response
2. Query the database to verify each transaction ID exists and the amount matches
3. Verify any sums by recomputing them from the cited transactions
4. If any citation fails validation, the response is rejected and regenerated with a stricter prompt
5. The validated response includes clickable citations that link to the actual transaction
This three-layer approach means even if the AI hallucinates (which function calling makes rare), the backend catches it before the user ever sees it.
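A minimal sketch of the Layer-3 check, with assumed types; the 0.001 cent-rounding tolerance and all names are illustrative:

```typescript
// Hypothetical validator: verify every (transaction_id, amount) citation
// in an AI response against the deterministic records, and recompute any
// claimed total from the cited rows.
interface TxRecord { id: string; amount: number }
interface Citation { transactionId: string; amount: number }

function validateCitations(
  citations: Citation[],
  claimedTotal: number | null,
  db: Map<string, TxRecord>
): { valid: boolean; errors: string[] } {
  const errors: string[] = [];
  for (const c of citations) {
    const row = db.get(c.transactionId);
    if (!row) errors.push(`unknown transaction ${c.transactionId}`);
    else if (Math.abs(row.amount - c.amount) > 0.001)
      errors.push(`amount mismatch on ${c.transactionId}`);
  }
  if (claimedTotal !== null) {
    // Recompute the sum from the database, not from the AI's numbers
    const actual = citations.reduce(
      (sum, c) => sum + (db.get(c.transactionId)?.amount ?? 0),
      0
    );
    if (Math.abs(actual - claimedTotal) > 0.001)
      errors.push(`sum mismatch: claimed ${claimedTotal}, actual ${actual.toFixed(2)}`);
  }
  return { valid: errors.length === 0, errors };
}
```

A failed validation triggers the regeneration path described in step 4.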

---
### 4. Function Calling with Deterministic Safeguards (Starbucks → Coffee)
User says: "Move my Starbucks transactions from yesterday to the 'Coffee' category."
**Function definition:**
```json
{
  "name": "recategorize_transactions",
  "description": "Propose recategorizing transactions matching criteria to a new category",
  "parameters": {
    "type": "object",
    "properties": {
      "merchant_filter": {"type": "string", "description": "Merchant name to match"},
      "date_from": {"type": "string", "format": "date"},
      "date_to": {"type": "string", "format": "date"},
      "new_category": {"type": "string", "description": "Target category name"}
    },
    "required": ["merchant_filter", "date_from", "date_to", "new_category"]
  }
}
```
**Flow:**
1. **AI calls the function** with `{"merchant_filter": "Starbucks", "date_from": "2026-02-16", "date_to": "2026-02-16", "new_category": "Coffee"}`
2. **Edge Function validates against deterministic data:**
   - Query: `SELECT * FROM transactions WHERE tenant_id = $1 AND merchant_name ILIKE '%Starbucks%' AND date = '2026-02-16' AND is_current = true`
   - Verify 'Coffee' category exists (or offer to create it)
   - Returns the exact transactions found: `[{id: "tx_123", amount: 5.75, merchant: "Starbucks Reserve"}, {id: "tx_456", amount: 12.30, merchant: "Starbucks"}]`
3. **AI formats the proposal** and returns it as a structured action, NOT a database write:
   ```json
   {
     "action_type": "recategorize",
     "transactions": ["tx_123", "tx_456"],
     "from_category": "Food & Drink",
     "to_category": "Coffee",
     "summary": "Move 2 Starbucks transactions ($18.05 total) from yesterday to Coffee"
   }
   ```
4. **Frontend shows confirmation modal:**
   "Move 2 transactions to Coffee?
   - Starbucks Reserve — $5.75
   - Starbucks — $12.30
   [Confirm] [Cancel]"
5. **User taps Confirm → frontend calls Edge Function:**
   - For each transaction: insert new version with updated category, mark old as `is_current = false`
   - Create audit log entry linking to the `ai_action_proposal` record
   - The AI never touched the database — the confirmed action went through a separate, validated write path
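The confirm-then-execute gate can be sketched as a small state machine. Types and names here are illustrative; in production the `status` lives on the `ai_action_proposals` row and the check runs inside the write Edge Function.

```typescript
// Hypothetical sketch of the proposal lifecycle guard.
type ProposalStatus = "pending" | "confirmed" | "rejected" | "expired";

interface Proposal {
  id: string;
  status: ProposalStatus;
  expiresAt: number; // epoch ms
}

// User confirmation: only a pending, unexpired proposal can be confirmed.
function confirmProposal(p: Proposal, now: number): Proposal {
  if (now > p.expiresAt) return { ...p, status: "expired" };
  if (p.status !== "pending") throw new Error(`cannot confirm a ${p.status} proposal`);
  return { ...p, status: "confirmed" };
}

// Write path: refuses to execute anything that is not a confirmed proposal.
function assertExecutable(p: Proposal): void {
  if (p.status !== "confirmed")
    throw new Error(`proposal ${p.id} is ${p.status}, not confirmed`);
}
```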

---
### 5. Plaid Webhook Reliability (Idempotency)
**Strategy: Idempotency keys + sync state machine + deduplication**
```sql
CREATE TABLE plaid_webhook_log (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  webhook_type TEXT NOT NULL,
  webhook_code TEXT NOT NULL,
  item_id TEXT NOT NULL,
  payload JSONB NOT NULL,
  idempotency_key TEXT UNIQUE NOT NULL, -- webhook_type + item_id + payload hash
  processed_at TIMESTAMPTZ,
  status TEXT DEFAULT 'received' -- received | processing | completed | failed
);

CREATE TABLE plaid_sync_cursors (
  item_id TEXT PRIMARY KEY,
  cursor TEXT NOT NULL, -- Plaid's sync cursor
  last_synced_at TIMESTAMPTZ NOT NULL,
  tenant_id UUID NOT NULL
);
```
**Webhook handler (Edge Function):**
1. **Receive webhook** → compute the idempotency key as a hash of `webhook_type + item_id + payload`
2. **INSERT INTO plaid_webhook_log** with `ON CONFLICT (idempotency_key) DO NOTHING` — if it returns 0 rows, this is a duplicate → return 200 immediately
3. **Use Plaid's `/transactions/sync` endpoint** (not the legacy `/transactions/get`):
   - Pass the stored cursor from `plaid_sync_cursors`
   - Plaid returns `added`, `modified`, `removed` arrays plus a new cursor
   - This is inherently idempotent — calling sync with the same cursor always returns the same delta
4. **Process each transaction:**
   - `added`: Insert with `ON CONFLICT (plaid_transaction_id) DO NOTHING` (dedup)
   - `modified`: Create new version (immutable pattern from Q1), mark old as superseded
   - `removed`: Mark as `status = 'removed'`, don't delete
5. **Update cursor** only after all transactions are processed (atomic — if the function crashes mid-way, the cursor hasn't moved, so the next sync picks up where it left off)
6. **Mark webhook as `completed`**
**Out-of-order handling:** Because we use Plaid's cursor-based sync rather than processing webhook payloads directly, order doesn't matter. Each sync call gets the complete delta from the last successful cursor position.
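The dedup step can be sketched in isolation like this; the `Set` stands in for the `ON CONFLICT (idempotency_key) DO NOTHING` insert, and the key composition mirrors the schema comment. Names are illustrative.

```typescript
import { createHash } from "node:crypto";

// Hypothetical key derivation: webhook_type + item_id + a hash of the
// payload, so identical redeliveries collide and true new events don't.
function idempotencyKey(webhookType: string, itemId: string, payload: unknown): string {
  const payloadHash = createHash("sha256")
    .update(JSON.stringify(payload))
    .digest("hex");
  return `${webhookType}:${itemId}:${payloadHash}`;
}

// Stands in for the ON CONFLICT insert: returns false when the key was
// already recorded (duplicate delivery → return 200 without processing).
function recordWebhook(seen: Set<string>, key: string): boolean {
  if (seen.has(key)) return false;
  seen.add(key);
  return true;
}
```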

---
### 6. RevenueCat + Supabase Sync
**Webhook handling flow:**
```sql
CREATE TABLE subscriptions (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id UUID NOT NULL REFERENCES tenants(id),
  revcat_app_user_id TEXT NOT NULL,
  entitlement TEXT NOT NULL, -- 'pro', 'premium'
  status TEXT NOT NULL, -- 'active' | 'expired' | 'billing_retry' | 'refunded'
  expires_at TIMESTAMPTZ,
  original_purchase_date TIMESTAMPTZ,
  updated_at TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE revcat_webhook_log (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  event_id TEXT UNIQUE NOT NULL, -- RevenueCat's event ID (idempotency)
  event_type TEXT NOT NULL,
  app_user_id TEXT NOT NULL,
  payload JSONB NOT NULL,
  processed_at TIMESTAMPTZ DEFAULT now()
);
```
**Flow:**
1. RevenueCat sends webhook → Edge Function receives it
2. Check `event_id` against `revcat_webhook_log` — duplicate? Return 200.
3. Map event type to subscription status:
   - `INITIAL_PURCHASE` / `RENEWAL` → `active`, update `expires_at`
   - `EXPIRATION` → `expired`
   - `BILLING_ISSUE` → `billing_retry` (grace period)
   - `CANCELLATION` → keep `active` until `expires_at`, then `expired`
   - `REFUND` → `refunded`, immediately revoke entitlement
4. Upsert `subscriptions` table
5. Log to `revcat_webhook_log`
**Edge cases:**
- **Lapsed subscription:** Status goes to `billing_retry` during grace period. A cron job checks daily for subscriptions past grace period → marks as `expired`. If renewal comes later, webhook updates back to `active`.
- **Refund:** Immediately set status to `refunded`. The app checks `subscriptions.status` on every API call via RLS policy — refunded users lose access instantly.
- **Race condition:** RevenueCat can send events out of order. I use `updated_at` comparison — only apply the webhook if its timestamp is newer than the current `updated_at`. Otherwise log it but don't change state.
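The race-condition guard reduces to a timestamp comparison. A sketch, with illustrative types and epoch-ms timestamps:

```typescript
// Hypothetical out-of-order guard: apply a RevenueCat event only if it
// is newer than the last applied update; otherwise return the state
// unchanged (the event is still logged upstream, just not applied).
interface SubState { status: string; updatedAt: number }
interface SubEvent { status: string; eventTimestamp: number }

function applyEvent(current: SubState, ev: SubEvent): SubState {
  if (ev.eventTimestamp <= current.updatedAt) return current; // stale event
  return { status: ev.status, updatedAt: ev.eventTimestamp };
}
```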

---
### 7. React Native Integration (Transaction List)
Given a pre-built `TransactionList` component, here's how I wire it up:
```typescript
import { useEffect, useState } from 'react';
import { supabase } from '../lib/supabase';
import type { Transaction } from '../types'; // row type generated from the Supabase schema

function useTransactions(filters?: { dateFrom?: string; category?: string }) {
  const [transactions, setTransactions] = useState<Transaction[]>([]);
  const [loading, setLoading] = useState(true);
  const [error, setError] = useState<string | null>(null);

  useEffect(() => {
    // Initial fetch
    const fetchTransactions = async () => {
      setLoading(true);
      let query = supabase
        .from('transactions')
        .select('*')
        .eq('is_current', true)
        .order('date', { ascending: false })
        .limit(50);
      if (filters?.dateFrom) query = query.gte('date', filters.dateFrom);
      if (filters?.category) query = query.eq('category_id', filters.category);
      const { data, error } = await query;
      if (error) setError(error.message);
      else setTransactions(data ?? []);
      setLoading(false);
    };
    fetchTransactions();

    // Real-time subscription for new transactions (via Plaid webhooks)
    const channel = supabase
      .channel('transactions')
      .on(
        'postgres_changes',
        { event: 'INSERT', schema: 'public', table: 'transactions', filter: 'is_current=eq.true' },
        (payload) => {
          setTransactions((prev) => [payload.new as Transaction, ...prev]);
        }
      )
      .on(
        'postgres_changes',
        { event: 'UPDATE', schema: 'public', table: 'transactions' },
        (payload) => {
          // Handle version changes (corrections, recategorizations)
          const updated = payload.new as Transaction;
          if (!updated.is_current) {
            setTransactions((prev) => prev.filter((t) => t.id !== updated.id));
          }
        }
      )
      .subscribe();

    return () => { supabase.removeChannel(channel); };
  }, [filters?.dateFrom, filters?.category]);

  return { transactions, loading, error };
}

// In the screen component:
function TransactionsScreen() {
  const { transactions, loading, error } = useTransactions();

  if (loading) return <TransactionListSkeleton />; // Skeleton matching component layout
  if (error) return <ErrorState message={error} onRetry={() => {}} />;
  if (!transactions.length) return <EmptyState message="No transactions yet" />;

  return <TransactionList data={transactions} />;
}
```
RLS handles tenant isolation automatically — `supabase.from('transactions')` only returns the authenticated user's tenant data. Real-time subscriptions also respect RLS, so when a Plaid webhook inserts new transactions, only the owning user's app updates live.

---
### 8. Audit Trail for Agentic Actions
```sql
CREATE TABLE ai_action_proposals (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id UUID NOT NULL,
  conversation_id UUID NOT NULL REFERENCES ai_conversations(id),

  -- What the AI proposed
  action_type TEXT NOT NULL, -- 'recategorize' | 'set_budget' | 'add_bill' | 'split_transaction'
  proposed_params JSONB NOT NULL, -- Full params the AI generated
  affected_transaction_ids UUID[] NOT NULL, -- Which deterministic records are involved

  -- Human decision
  status TEXT NOT NULL DEFAULT 'pending', -- pending | confirmed | rejected | expired
  decided_at TIMESTAMPTZ,
  decided_by UUID REFERENCES auth.users(id),

  -- Execution
  executed_at TIMESTAMPTZ,
  execution_result JSONB, -- Success/failure details
  audit_log_ids UUID[], -- Links to transaction_audit_log entries created

  -- Metadata
  created_at TIMESTAMPTZ DEFAULT now(),
  expires_at TIMESTAMPTZ DEFAULT (now() + interval '24 hours')
);

CREATE TABLE ai_conversations (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id UUID NOT NULL,
  user_message TEXT NOT NULL,
  ai_response TEXT NOT NULL,
  function_calls JSONB, -- All function calls made during this interaction
  tokens_used INT,
  model TEXT, -- 'gpt-4o', etc.
  created_at TIMESTAMPTZ DEFAULT now()
);
```
**Complete audit chain:**
1. User asks question → `ai_conversations` logs the full exchange
2. AI proposes action → `ai_action_proposals` records what, why, and which transactions
3. User confirms → `status` updated, `decided_at` + `decided_by` recorded
4. Execution → new transaction versions created, `transaction_audit_log` entries link back to `ai_action_proposals.id`
5. Result stored in `execution_result`
You can trace any data change from: AI conversation → proposal → user confirmation → audit log → actual data change. Nothing happens without this chain.

---
### 9. Handling Retroactive Changes
When a bank sends a corrected transaction from 2 months ago:
1. **Detection:** Plaid's `/transactions/sync` returns it in the `modified` array. My webhook handler identifies the existing transaction by `plaid_transaction_id`.
2. **Immutable correction:** Follow the same pattern from Q1 — create new version, mark old as `is_current = false`, full audit log entry with `reason: "Bank correction via Plaid sync"`.
3. **Cascade check:** Run a function that identifies downstream impacts:
   - Recalculate affected budget period(s) — if the corrected amount changes a monthly budget total
   - Flag any AI responses that cited the old transaction version
   - Check if the correction changes any bill detection patterns
4. **User notification:** Push notification + in-app alert:
   "Your bank corrected a transaction from Dec 15: Merchant X changed from $45.00 to $42.50. Your December spending summary has been updated."
5. **AI staleness prevention:**
   - AI conversations older than the correction are marked with a `stale_data` flag
   - If the user asks about that time period, the AI's context always pulls from `is_current = true` records — it automatically uses the corrected data
   - The AI never caches or remembers previous answers — every query hits the deterministic source of truth fresh
   - If the user references a previous AI answer that's now stale, the system can detect this and proactively say "Note: a transaction from this period was corrected by your bank since my last answer"
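The `stale_data` flagging in step 5 reduces to a filter over logged conversations. A sketch, with illustrative shapes, assuming each `ai_conversations` row logs the transaction IDs it cited:

```typescript
// Hypothetical staleness check: find prior AI conversations that cited
// a transaction which was later superseded by a bank correction.
interface Conversation {
  id: string;
  citedTransactionIds: string[];
  createdAt: number; // epoch ms
}

function findStaleConversations(
  conversations: Conversation[],
  correctedTxId: string,
  correctedAt: number
): string[] {
  return conversations
    .filter((c) => c.createdAt < correctedAt && c.citedTransactionIds.includes(correctedTxId))
    .map((c) => c.id);
}
```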

---
### 10. (Bonus) Security & Performance Notes
- `SECURITY DEFINER` is used sparingly; most Edge Function and database operations run as the authenticated user, with RLS enforced
- Environment variables for all API keys (Plaid, OpenAI, RevenueCat) — never in code
- Rate limiting on AI endpoints (prevent abuse of OpenAI token costs)
- Connection pooling via Supabase's built-in PgBouncer
- Indexes on `tenant_id`, `plaid_transaction_id`, `date`, `is_current` for all hot query paths