# Floom Build Roadmap
_Drafted 2026-04-14. Target: public v1.0 in ~4-5 months solo._
_Author: Claude (Opus 4.6), working against `/root/floom-monorepo/` v0.1.0 and `/tmp/floom-wireframes/v8.html`._

---

## 0 · Executive summary

Floom is the production layer for vibe-coded AI apps: the promise is "the power of vibecoding with the safety of production systems," delivered as an open-core platform with two ICPs (creators who deploy apps, users who install them). The strategy has five phases: a marketing-first landing (Phase 1); hardening the existing `floom-monorepo` v0.1.0 Docker image into a complete self-hostable production layer (Phase 2); wrapping that core with a multi-tenant cloud at `floom.dev` (Phase 3); replacing custom OAuth wiring with Composio (Phase 4); and a public v1.0 launch with real creators and users (Phase 5). Realistic solo-founder schedule: ~20 weeks end to end (the sum of the phase durations in Section 2) if Federico protects 25 focused dev hours per week and does not let f.inc programming, the April 2026 move to SF, or investor meetings eat more than 35% of the calendar.

---

## 1 · Current state (what exists today)

Repository: `/root/floom-monorepo/`, package manager pnpm 9, turborepo, version `0.1.0`.

Workspace layout:

| Package | State | Notes |
|---|---|---|
| `apps/server` | Shipped | Hono + `better-sqlite3` + `dockerode`, boots on port 3051, 8 route files, 8 services |
| `apps/web` | Shipped | Vite + Tailwind SPA (the `/p/:slug` chat UI), served statically from `apps/server` in prod |
| `packages/runtime` | Shipped | `@floom/runtime`, e2b-backed execution layer |
| `packages/cli` | Shipped stub | `@floom/cli` command-line tool, not yet usable for `floom deploy owner/repo` |
| `packages/detect` | Shipped | `@floom/detect` auto-detect runtimes and build systems |
| `packages/manifest` | Shipped | `@floom/manifest` schema + parser |
| `docker/Dockerfile` + `docker-compose.yml` | Shipped | Image published to `ghcr.io/floomhq/floom-monorepo:latest` |
| `spec/protocol.md` | Shipped | Floom Protocol spec |
| `examples/*` | Shipped | 15+ example manifests (blast-radius, bouncer, openpaper, flyfast, opensky-app, etc.) |

Backend routes that already work (`apps/server/src/routes/`):

| Route | Purpose |
|---|---|
| `/api/health` | Liveness |
| `/api/hub` | App listing + metadata |
| `/api/parse` | OpenAPI / URL parsing |
| `/api/pick` | Embeddings-based picker for "which app do I want" |
| `/api/thread` | Chat thread persistence |
| `/api/run` + `/api/:slug/run` | Execute a tool call against an app |
| `/mcp` | MCP server handshake + tool dispatch |
| `/api/deploy-waitlist` | Creator-deploy waitlist form |

Services: `docker.ts`, `embeddings.ts`, `manifest.ts`, `openapi-ingest.ts`, `parser.ts`, `proxied-runner.ts`, `runner.ts`, `seed.ts`.

DB (SQLite) already has: `apps`, `runs`, `secrets`, `chat_threads`, `chat_turns`, `embeddings`. Plus migrations for proxied-app columns (`app_type`, `base_url`, `auth_type`, `openapi_spec_url`, `openapi_spec_cached`). WAL mode, foreign keys on.

Boot sequence: seed from `db/seed.json` (15 apps), ingest `FLOOM_APPS_CONFIG` OpenAPI specs, backfill embeddings, serve.

Deployment today: `docker run -p 3051:3051 -v apps.yaml:/app/config/apps.yaml:ro ghcr.io/floomhq/floom-monorepo:latest` works. `preview.floom.dev` is live pointing at that image.

**What's missing vs the v8 wireframe** (17 screens in `/tmp/floom-wireframes/v8.html`): auth, access control (RBAC), per-app identity providers, activity logs UI, usage charts, version timeline + rollback, schedules, webhooks, app memory, secrets UI, custom domains, feedback inbox, reviews, pricing page, install flow, post-publish celebration, settings, empty/loading/error states, mobile hamburger nav. The backend has `secrets` and `runs` tables; it does not have tables for users, sessions, roles, API keys, schedules, webhooks, audit, app memory, tickets, reviews, or domain records. The `/p/:slug` chat UI exists, but the creator dashboard at `/p/:slug/dashboard` is not built.

---

## 2 · The 5 phases at a glance

| # | Phase | Duration | Outcome | Unlocks |
|---|---|---|---|---|
| 1 | Landing + waitlist | 2.5 weeks | `floom.dev` is a Next.js 15 site with 17 static screens + waitlist capture | Demand capture, launch-day narrative |
| 2 | Docker OSS core | 8 weeks | `docker run floom-monorepo:v0.2.0` boots the full production layer | Creators can self-host; open-core promise is real |
| 3 | Cloud multi-tenant wrap | 6 weeks | `floom.dev` as managed platform with workspaces, SSO, audit, managed domains | Creators onboard without DevOps; cloud billing enabled |
| 4 | Composio OAuth | 2 weeks | Gmail / Notion / Slack / Stripe / HubSpot / Sheets connectable in-app | Users can trust apps that touch their data |
| 5 | Public v1.0 launch | 1.5 weeks | Show HN, LinkedIn, Twitter, f.inc demo day, press | Real creators, paying accounts, open-source stars |

Total elapsed calendar time: approximately 20 weeks (5 months), with Phase 2 being the dominant block. Phases are NOT parallelizable without a second engineer.

---

## 3 · Phase 1 · Landing page + waitlist

### Objective

Ship `floom.dev` as a Next.js 15 App Router site on Vercel with Supabase-backed waitlist capture. All 17 wireframe screens are reachable and look beautiful. Nothing executes. Every "Run", "Deploy", "Install" button opens a waitlist modal or links to `/docs` / GitHub. This is a marketing milestone, not a product.

### What exists before this phase

Zero public presence beyond `preview.floom.dev` (the running Docker image). `floom.dev` today belongs to Vlad's version, per `project_floom_docker_vision.md`. Federico needs to reclaim `floom.dev` DNS and point it at the new Next.js site before this phase ships.

### Deliverables

1. Next.js 15 App Router project in `~/floom-landing/` (new repo `floomhq/floom-landing`, private until audit passes)
2. Tailwind 4 + shadcn/ui primitives + the full v8 CSS token system (colors, radii, shadows, fonts) ported 1:1 from `/tmp/floom-wireframes/v8.html`
3. Lucide icon sprite imported via `lucide-react` matching the 40-odd icons used in v8
4. All 17 routes rendered:

| Route | Source wireframe section | Interactive? |
|---|---|---|
| `/` | `#home` | No, scrolls, waitlist CTA in hero |
| `/store` | `#store` | Filter bar + search UI only, no results API |
| `/p/[slug]` | `#app` | Static for 3 hero apps (OpenPaper, FlyFast, OpenSlides) |
| `/p/[slug]/dashboard` | `#dashboard` | Read-only mock with "join waitlist to unlock" overlay |
| `/build` | `#build` | Static, "Deploy" opens waitlist modal |
| `/run` | `#run` | Static, "Run" opens waitlist modal |
| `/me` | `#me` | Static user dashboard, "Sign in" opens waitlist modal |
| `/pricing` | `#pricing` | Static, "Start free" opens waitlist modal |
| `/docs` | `#docs` | MDX-rendered, real content |
| `/auth/signup` | `#auth` | Form shell, submits to waitlist |
| `/install/[slug]` | `#install` | Static preview |
| `/publish` | `#publish` | Static preview, "Publish" opens waitlist modal |
| `/post-publish` | `#post-publish` | Static celebration mock |
| `/review/[id]` | `#review-modal` | Static modal preview |
| `/report/[id]` | `#report-modal` | Static modal preview |
| `/feedback-inbox` | `#feedback-inbox` | Static read-only |
| `/settings` | `#settings` | Static read-only |

5. Supabase project `floom-waitlist` (free tier) with one table `waitlist_entries`:
   ```sql
   create table waitlist_entries (
     id uuid primary key default gen_random_uuid(),
     email text unique not null,
     icp text check (icp in ('creator', 'user', 'developer', 'other')),
     company text,
     source text,
     utm_campaign text,
     user_agent text,
     created_at timestamptz default now()
   );
   ```
6. One Next.js server action `submitWaitlist(email, icp, company)` that upserts into Supabase, then fires a webhook to Plain (or Slack) so Federico sees it in real time
7. Vercel analytics + PostHog (both free tier)
8. OG image generation via `@vercel/og` for every page (dynamic Twitter/X / LinkedIn cards)
9. `robots.txt` + `sitemap.xml` + full JSON-LD Organization + Product schema
10. DNS cutover: `floom.dev` → Vercel, `preview.floom.dev` kept pointing at the Docker image
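
Deliverables 5 and 6 can be sketched as a validation step that mirrors the `waitlist_entries` constraints before anything is written. This is a minimal sketch, not the planned implementation: `validateWaitlistInput`, the input shape, and the loose email pattern are all assumptions made for illustration. Lowercasing the email is also an assumption, chosen so that the `unique` constraint treats `A@x.com` and `a@x.com` as the same signup.

```typescript
// Hypothetical types for the submitWaitlist server action (names are assumptions).
type Icp = "creator" | "user" | "developer" | "other";

interface WaitlistInput {
  email: string;
  icp: string;
  company?: string;
}

interface WaitlistRow {
  email: string;
  icp: Icp;
  company: string | null;
}

type ValidationResult =
  | { ok: true; row: WaitlistRow }
  | { ok: false; error: string };

// Same values as the SQL check constraint on the icp column.
const ICP_VALUES: readonly string[] = ["creator", "user", "developer", "other"];

// Deliberately loose: one "@", no whitespace, at least one dot in the domain.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

function isIcp(value: string): value is Icp {
  return ICP_VALUES.indexOf(value) !== -1;
}

export function validateWaitlistInput(input: WaitlistInput): ValidationResult {
  // Normalize so duplicate submissions collapse onto one row.
  const email = input.email.trim().toLowerCase();
  if (!EMAIL_PATTERN.test(email)) {
    return { ok: false, error: "Please enter a valid email address." };
  }
  if (!isIcp(input.icp)) {
    return { ok: false, error: "Unknown ICP value." };
  }
  const company = input.company?.trim() || null;
  return { ok: true, row: { email, icp: input.icp, company } };
}
```

The validated `row` would then be handed to the Supabase client, for example as an upsert with `onConflict: "email"`, which is what makes a duplicate submit idempotent.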

### Tech stack decisions

| Choice | Alternative weighed | Why |
|---|---|---|
| **Next.js 15 App Router** | Astro, Remix, pure HTML | Federico is fastest in Next.js, shadcn works out of the box, Vercel analytics is one click, SSG for the 17 static routes keeps the site under 100ms TTFB globally |
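| **Tailwind 4** | Tailwind 3 | v4 is GA, simpler config, faster builds, Lightning CSS. The v8 wireframe uses the Tailwind CDN, so most utilities should carry over to v4, though some will need manual porting (see the wireframe-fidelity risk for this phase) |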
| **shadcn/ui** | Radix bare, HeadlessUI, DaisyUI | shadcn gives ownership of the components (copy-paste not dependency), perfect for a single polished marketing site where we'll customize aggressively |
| **Supabase (waitlist only)** | Airtable, Google Sheets, Plain directly | Federico already uses Supabase, free tier is generous, SQL is better than a spreadsheet for segmenting waitlist by ICP later, and we can reuse the project for Phase 3 if we pivot |
| **Vercel hosting** | Cloudflare Pages, Netlify, self-host on Hetzner | Preview deploys per PR are free, edge routing is fast, Federico has the CLI workflow memorized, and the analytics integration is zero config |
| **Plain.com** for waitlist notifications | Slack, Discord webhook | Plain doubles as the support inbox in Phase 5, so setting it up here saves re-wiring later |

### Week-by-week build steps

Realistic estimate for one solo founder at 25 focused hours per week: **12 working days, spread across 2.5 calendar weeks** to allow for context-switching and founder overhead.

**Day 1 — Scaffold and token port**
```bash
cd ~
# --use-pnpm keeps the scaffold on the same package manager as the pnpm commands below
npx create-next-app@latest floom-landing --ts --tailwind --app --eslint --src-dir --use-pnpm
cd floom-landing
pnpm add lucide-react clsx tailwind-merge @supabase/supabase-js @vercel/og posthog-js
pnpm dlx shadcn@latest init
pnpm dlx shadcn@latest add button card dialog input textarea label badge
gh repo create floomhq/floom-landing --private --source=. --remote=origin --push
```
- Port `:root` CSS variables from v8.html into `src/app/globals.css`
- Port Lucide SVG sprite class rules (`svg:not([class])`, `.icon`, `.icon-tile`)
- Import Inter, DM Serif Display, JetBrains Mono via `next/font`

**Day 2 — Home page (`/`)**
- Port `#home` section (ASCII hero, layers grid, featured apps)
- Replace three featured-app placeholders with real OpenPaper / FlyFast / OpenSlides art (use existing product screenshots from `~/Downloads/openpaper-upstream/public/`, `~/opensky-app/public/`)
- Hero CTAs: "Deploy an app" (creator) + "Browse the store" (user) side-by-side, equal weight
- Waitlist modal component (shadcn Dialog)

**Day 3 — Store (`/store`) + App detail (`/p/[slug]`)**
- Port `#store` grid with 15 seeded apps (same as `db/seed.json` in floom-monorepo)
- Category filters, search input (client-side only, no backend)
- `/p/[slug]` as the user-facing product page: hero, screenshots, reviews, Install CTA. The creator dashboard lives at `/p/[slug]/dashboard` per positioning memo
- Generate dynamic OG images per app

**Day 4 — Creator dashboard (`/p/[slug]/dashboard`) + user dashboard (`/me`)**
- Port `#dashboard` with all 13 sections (Auth, Access, Logs, Usage, Versions, Schedule, Webhooks, App memory, Vault, Domains, Feedback, Reviews, Settings) as read-only static mocks
- Full-width "Join waitlist to unlock" overlay with email capture
- Port `#me` (user dashboard: installed apps, recent runs, schedules, folders, shared, connected tools)

**Day 5 — Pricing, docs, auth stubs**
- Port `#pricing` (OSS free, Cloud Free, Cloud Pro $29/mo, Cloud Team $99/mo, Enterprise)
- `/docs` using Contentlayer or `next-mdx-remote` (decision: `next-mdx-remote`, it's simpler). Start with: Quickstart, Self-host, Protocol spec, CLI reference, MCP integration
- `/auth/signup` and `/auth/login` form shells, submit to waitlist

**Day 6 — Build, run, install, publish, post-publish, feedback-inbox, settings**
- Port the remaining 9 screens as static mocks with waitlist CTAs where needed: the 7 named above plus the `/review/[id]` and `/report/[id]` modal previews from the route table

**Day 7 — Waitlist backend + Supabase wiring**
- Create Supabase project `floom-waitlist`, set env vars (confirmed with Federico per env-var rule)
- Write `src/app/actions/waitlist.ts` server action (the project was scaffolded with `--src-dir`)
- Wire every "Run", "Deploy", "Install", "Sign up" button to open the modal
- Plain webhook integration so new signups ping Federico

**Day 8 — SEO, analytics, OG**
- `sitemap.xml`, `robots.txt`, `manifest.json`, favicon set, Apple touch icons
- `@vercel/og` dynamic OG image per route
- JSON-LD Organization + SoftwareApplication schema per `/p/[slug]`
- PostHog + Vercel Analytics init
- Privacy policy + terms (use `terms.floom.dev` template)

**Day 9 — Mobile + empty/loading/error states**
- Mobile hamburger nav (port from `mobile-hamburger-open.png`)
- Verify every screen at 375px, 414px, 768px, 1440px, and 1920px (the same viewports as the testing protocol)
- 404 page, error boundary, loading skeletons

**Day 10 — Deploy + DNS cutover**
```bash
vercel link
vercel env add NEXT_PUBLIC_SUPABASE_URL production
vercel env add NEXT_PUBLIC_SUPABASE_ANON_KEY production
vercel env add SUPABASE_SERVICE_ROLE_KEY production
vercel env add PLAIN_WEBHOOK_URL production
vercel --prod
```
- IONOS DNS: point `floom.dev` + `www.floom.dev` at Vercel (A or ALIAS record for the apex, CNAME for `www`)
- Preserve `preview.floom.dev` pointing at AX41 Docker instance
- Verify HTTPS, www redirect, security headers (`next.config.js` headers)

**Day 11-12 — Polish + launch prep**
- Lighthouse pass on all 17 routes, fix any score under 90
- Shoot 3-minute demo video (Screen Studio or Loom, narrated walkthrough)
- Draft Show HN post, LinkedIn post, Twitter thread
- Shortlist 20 press contacts (TechCrunch AI beat, Stratechery, Latent Space, Lenny's, etc.)
- Share with 5 friendly reviewers (Gourav, Jannik, Yash, 2 others from f.inc network) for feedback before public launch

### Testing protocol

**Technical**

| Check | Target | Tool |
|---|---|---|
| Lighthouse Performance | >= 90 | Lighthouse CLI (`npx lighthouse <url>`) against the Vercel deployment |
| Lighthouse Accessibility | >= 95 | same |
| Lighthouse Best Practices | >= 95 | same |
| Lighthouse SEO | 100 | same |
| Vercel Analytics page views | all 17 routes registering | Vercel dashboard |
| Mobile viewports | 375, 414, 768, 1440, 1920 | Chrome DevTools device mode + Playwright screenshot |
| No console errors | 0 errors on every route | Chrome DevTools console |
| No 4xx / 5xx | `vercel logs --since 1h` clean | Vercel CLI |
| Heading hierarchy | single `h1` per page, no skipped levels | `axe-core` |
| Link integrity | 0 broken links | `lychee` CLI |

**Product QA — Maria walk** (biz user, the user-side ICP)

1. Lands on `/`, reads the hero: does she understand what Floom is in under 10 seconds?
2. Clicks "Browse the store", scans 15 apps. Does any app pull her in?
3. Clicks OpenPaper, lands on `/p/openpaper`: screenshots, reviews, Install CTA. Does she trust it?
4. Clicks "Install", gets waitlist modal. Does she leave or fill it in?
5. Tries to go back, clicks `/docs`. Can she find the self-host quickstart?
6. Clicks through to GitHub. Does she see social proof (stars, commits, contributors)?

**Product QA — Jannik walk** (vibecoder creator)

1. Lands on `/`, scans the hero. Does the creator CTA feel equal to the user CTA?
2. Clicks "Deploy an app", lands on `/build`. Does he immediately see the three deploy paths (CLI / npx / Docker)?
3. Copies the `npx @floom/cli install openpaper` line, tries it in his terminal. (For Phase 1, this ties into the waitlist since the CLI is not ready.)
4. Clicks "What I get", scrolls the 13 production-layer features. Is the promise clear?
5. Scrolls to pricing. Does Cloud Free → Cloud Pro → Cloud Team make intuitive sense?
6. Hits the waitlist modal, fills creator flow, confirms email.

**Product QA — Yash walk** (developer via MCP)

1. Lands on `/`, scrolls to "Developer" or "MCP" section (must exist).
2. Clicks `/docs/mcp`, sees MCP integration snippet for Claude Desktop.
3. Clones `https://github.com/floomhq/floom-monorepo`, runs `docker run`. Can he get a working MCP endpoint in 60 seconds?
4. Stars the repo.

**Unhappy paths**

| Scenario | Expected behavior |
|---|---|
| Empty waitlist email submit | Inline error, no submission, accessibility-announced |
| Duplicate email submit | "You're already on the list, thanks!" (upsert + idempotent) |
| Slow 3G load | Skeleton renders in < 2s, content in < 6s |
| iOS Safari mobile keyboard covering form | Form auto-scrolls into view |
| Ad blocker (uBlock) enabled | Site works, PostHog silently fails |
| JavaScript disabled | Core content still renders (Next.js SSG handles this), forms gracefully degrade to `<form action>` posting to a fallback endpoint |
| Bot spam on waitlist | Rate limit via Vercel middleware (10 submissions/hour/IP) + Cloudflare Turnstile |

### ICP walkthrough

**Maria (biz user, ops at a 30-person SaaS)**: After this phase, she can read the pitch, browse the store, click into apps, and join the waitlist. What's broken: she can't actually try any app. What's missing: a "here's a 2-minute demo video" embed on the hero. What delights her: the clean Notion-like aesthetic and the "production-safe · powered by Floom" badge on every app card, which makes random community apps feel trustworthy.

**Jannik (vibecoder, Cursor + Claude Code, built something last week)**: After this phase, he understands the pitch, sees the 13 production-layer guarantees (auth, access, logs, etc.), and joins the waitlist. What's broken: `@floom/cli` is a stub, he can't actually deploy yet. What's missing: a working playground where he can paste an OpenAPI URL and see it wrapped live. What delights him: the `/build` page showing three deploy paths side by side (CLI, npx, Docker), matching how he already thinks.

**Yash (developer via MCP, wiring Floom into Cursor)**: After this phase, he can follow the `/docs/mcp` quickstart, clone the repo, and run the existing Docker image for real (it already works today via `preview.floom.dev`). What's broken: nothing for him, because the MCP endpoint ships at v0.1.0. What's missing: richer examples showing how to integrate Floom into Claude Desktop / Cline / Cursor. What delights him: the open-core promise is real and he can self-host immediately.

### Success criteria

- `floom.dev` loads in under 1s globally (Vercel edge)
- Waitlist collects > 100 emails in the first 7 days of public visibility
- All 17 screens reachable, no broken links, Lighthouse green on every route
- Gourav, Jannik, and one f.inc cohort member each confirm "yes I'd sign up" after reviewing
- Show HN draft ready, video ready, 3 press contacts replied

### Time estimate

**2.5 calendar weeks, 60-70 focused hours**. This assumes Federico does not touch Phase 2 code yet. If he gets distracted by the Docker work, this slips to 4 weeks.

### Key risks for this phase

1. **DNS / `floom.dev` ownership**: Vlad's version sits on this domain. Federico must reclaim it or accept launching on `app.floom.dev` or a new domain. Decision needed before Day 10. **Impact: High, likelihood: Medium**.
2. **Wireframe fidelity**: v8.html uses Tailwind CDN, not a real build. Some utilities won't compile as-is in a Tailwind 4 project. Mitigation: port tokens manually into `globals.css` rather than copying the CDN's generated output.
3. **Mobile polish**: v8 wireframes have a mobile version but edge cases (keyboard handling, safe-area-insets) always bite. Budget extra time on Day 9.
4. **Gourav / Jannik feedback scope creep**: they'll want real functionality, not mocks. Say no and keep this phase marketing-only.

---

## 4 · Phase 2 · Docker OSS core

### Objective

Extend the existing `floom-monorepo` v0.1.0 into v0.2.0 where the full production layer (the 13 dashboard sections in v8.html) is real and self-hostable. `docker run ghcr.io/floomhq/floom-monorepo:v0.2.0` boots everything. A non-Federico person can deploy it on their own Hetzner box and ship an app end-to-end.

This is the phase that makes the open-core promise real.

### What exists before this phase

Everything in Section 1 (runtime, ingest pipeline, proxied runner, MCP handshake, 4 surfaces, 15 seeded apps, SQLite with `apps` / `runs` / `secrets` / `chat_threads` / `chat_turns` / `embeddings`, dockerode-based runner, Dockerfile shipping to ghcr).

### Deliverables (the production layer)

| # | Feature | Backend work | Frontend work | DB schema |
|---|---|---|---|---|
| 1 | **Auth** (local + OIDC) | `/api/auth/*` routes via Better Auth | `/signin`, `/signup`, `/auth/callback/:provider` | `users`, `sessions`, `accounts`, `verification_tokens` (mapped from Better Auth's core `user`, `session`, `account`, `verification` models) |
| 2 | **Per-app access control** (visibility, allowed callers, per-tool permissions) | `/api/apps/:slug/access` + middleware on `/api/:slug/run` and `/mcp` | `#dashboard` "Who can use it" section | `app_access_policies`, `app_allowed_callers`, `app_tool_permissions`, `api_keys` |
| 3 | **App-level identity providers** (creator configures sign-in for their app) | `/api/apps/:slug/auth-config` | Per-app auth settings section | `app_auth_configs` |
| 4 | **Activity logs** | Runs table already exists, add caller/user columns + filtering | `#dashboard` Activity view with filters | Alter `runs` table: add `caller_id`, `caller_type`, `api_key_id`, `ip`, `user_agent` |
| 5 | **Usage metrics** | `/api/apps/:slug/usage?from=&to=` aggregation | Charts in dashboard (recharts or tremor) | View on top of `runs` table |
| 6 | **Versions** (git-backed) | `/api/apps/:slug/versions` + `git` checkout | Timeline + diff viewer | `app_versions` (git_sha, created_at, deployed_by, diff_summary) |
| 7 | **Schedules** (cron) | `/api/apps/:slug/schedules` + `node-cron` worker | Cron UI with next-run preview | `app_schedules` (cron_expr, inputs_json, enabled, last_run_at, next_run_at) |
| 8 | **Webhooks** | `/api/apps/:slug/webhooks` + outbound delivery worker with retry | Webhook config + delivery log | `app_webhooks`, `webhook_deliveries` |
| 9 | **App memory** (per-user state per app) | `/api/apps/:slug/memory/:user_id` | Dashboard view of memory tables | `app_memory` (app_id, user_id, key, value_json) |
| 10 | **Secrets vault** (scoped per-app) | Existing `secrets` table, add UI + audit | `#dashboard` Vault section | Alter `secrets`: add `description`, `last_accessed_at`, `created_by` |
| 11 | **Custom domain** (BYO DNS) | `/api/apps/:slug/domain` + verify via DNS TXT record | Settings domain card | `app_domains` (domain, verification_token, verified_at, ssl_status) |
| 12 | **Feedback inbox** | `/api/apps/:slug/feedback` + POST ticket | `#feedback-inbox` + `#report-modal` | `feedback_tickets`, `feedback_replies` |
| 13 | **Reviews** | `/api/apps/:slug/reviews` | `#review-modal` + store rating aggregation | `app_reviews` (rating, text, user_id, moderation_state) |
| 14 | **Empty / loading / error states** | — | Every screen, every list, every form | — |

### Tech stack decisions

| Choice | Alternative weighed | Why |
|---|---|---|
| **Better Auth** | Auth.js v5, Lucia, Clerk, WorkOS | Auth.js merged into Better Auth as of 2026, Lucia deprecated to "educational resource only," Clerk is hosted-only (kills self-host promise), WorkOS is enterprise-priced. Better Auth supports SQLite + Postgres + Drizzle, 40+ social providers, magic link, passkeys, 2FA, SSO/SAML as plugins, Hono adapter, Next.js adapter. It's the only option that fits OSS self-host + cloud without rewriting. |
| **Drizzle ORM** | Prisma, Kysely, raw better-sqlite3 | Drizzle has the best Hono + Better Auth + SQLite + Postgres story, schema is TypeScript not a DSL, migrations are readable SQL, and Federico already uses it in Rocketlist. Prisma is heavier and requires a separate client generation step that bloats the Docker image. |
| **node-cron** (for Schedules) | BullMQ + Redis, Temporal, Inngest | Floom OSS ships as a single Docker image — adding Redis or a workflow engine breaks the "docker run and it works" promise. node-cron runs in-process, stores cron exprs in SQLite, fires runs via the existing runner. For cloud at Phase 3, Inngest becomes an upgrade path. |
| **Nodemailer + Resend** (emails) | SendGrid, Postmark, self-host SMTP | Resend has free tier (100/day), great DX, and Nodemailer lets self-hosters BYO SMTP for airgapped installs. |
| **Zod** (validation) | Yup, Valibot, io-ts | Already in the repo, Hono supports it through `@hono/zod-validator`, Better Auth uses it, and `drizzle-zod` bridges schema to validators. |
| **Vitest** (tests) | Jest, Node test runner | Fastest, pnpm-native, Federico already uses it, plays nicely with TypeScript ESM in the monorepo |
| **Playwright** (E2E) | Cypress | Already standard in Federico's stack, supports Hono SSR + Vite SPA + Docker runner tests out of the box |
| **Git for versions** (backend) | Custom blob store, S3 snapshots | Apps are already deployed from Git repos. Treating Git as the version store means "rollback" is literally `git checkout <sha> && rebuild docker image`. Zero new storage primitives. |
| **Drizzle-kit migrate** | Custom migration files in db.ts | The current `db.ts` uses inline `db.exec()` with "if column doesn't exist" migration pattern. This does not scale past 13 tables. Move to drizzle-kit, generate migrations from schema, version in git. |

### Week-by-week build steps

**Realistic estimate: 8 calendar weeks, 160-200 focused hours**. This is the dominant chunk of the whole roadmap. If it slips, everything slips.

**Week 1 — Auth foundation (Better Auth)**

Day 1: Install Better Auth, port `db.ts` to Drizzle, generate initial schema, add `users`/`sessions`/`accounts`/`verification_tokens` tables.

```bash
cd /root/floom-monorepo/apps/server
pnpm add better-auth drizzle-orm drizzle-zod
pnpm add -D drizzle-kit
# generate initial drizzle schema from the existing SQLite file
# (current drizzle-kit names this command `pull`; dialect, DB path, and output dir are read from drizzle.config.ts)
pnpm drizzle-kit pull --config drizzle.config.ts
```

Day 2: Mount the Better Auth handler on `/api/auth/*` in Hono, add session-resolving middleware, and implement email+password signup, magic link, Google OAuth, GitHub OAuth.

Day 3: Frontend auth pages in `apps/web`: `/signin`, `/signup`, `/auth/callback`. Use the Floom design tokens (already in wireframe).

Day 4: Require auth on every write (non-GET) route under `/api/apps/*`. `/api/hub` (browse) stays public.

Day 5: Integration tests for auth flow, Docker image rebuilds and boots with the new schema.

**Week 2 — Access control + API keys**

Day 6: Schema: `api_keys`, `app_access_policies`, `app_allowed_callers`, `app_tool_permissions`. Drizzle migration.

Day 7: `/api/apps/:slug/access` CRUD. Middleware that resolves a caller from session OR api_key header on `/api/:slug/run` and `/mcp/app/:slug`.

Day 8: Rate limiter middleware (token bucket, per-app, per-caller). Store buckets in-memory for OSS, Redis later in Cloud.
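
The Day 8 limiter could look roughly like the following in-memory token bucket, keyed per app and per caller. This is a sketch under assumptions: the class name `TokenBucketLimiter`, the key format, and the injectable clock are illustrative and not taken from the repo.

```typescript
interface Bucket {
  tokens: number;
  lastRefillMs: number;
}

// In-memory token bucket, one bucket per (app, caller) pair.
export class TokenBucketLimiter {
  private readonly buckets = new Map<string, Bucket>();
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: () => number;

  // `now` is injectable so tests can control time; it defaults to Date.now.
  constructor(capacity: number, refillPerSecond: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
  }

  // Returns true and spends one token if the call is allowed, false otherwise.
  tryConsume(appSlug: string, callerId: string): boolean {
    const key = `${appSlug}:${callerId}`;
    const nowMs = this.now();
    const bucket = this.buckets.get(key) ?? { tokens: this.capacity, lastRefillMs: nowMs };

    // Refill in proportion to elapsed time, capped at capacity.
    const elapsedSeconds = (nowMs - bucket.lastRefillMs) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.lastRefillMs = nowMs;

    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(key, bucket);
    return allowed;
  }
}
```

A limit such as "100/hour" would map to `new TokenBucketLimiter(100, 100 / 3600)`. Because the buckets live in process memory, they reset on restart, which is acceptable for the single-container OSS image and is the part a Redis-backed store would replace in Cloud.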

Day 9: Frontend: "Who can use it" section of dashboard. Visibility picker, allowed callers table, per-tool permission matrix, API key generator with one-time reveal.

Day 10: Tests: public app is callable by anyone, invite-only blocks non-allowed callers, per-tool permission matrix enforced on `/mcp/app/:slug/tools/:tool`.
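
The three Day 10 test cases suggest that the access decision can be a small pure function, which keeps it easy to cover at 100%. A minimal sketch, assuming a hypothetical `AccessPolicy` shape assembled from the `app_access_policies`, `app_allowed_callers`, and `app_tool_permissions` rows (the field names here are illustrative):

```typescript
type Visibility = "public" | "invite_only";

// Hypothetical in-memory view of one app's access rows.
interface AccessPolicy {
  visibility: Visibility;
  allowedCallerIds: string[];
  // Tool name -> caller ids allowed to call it. A missing entry means the tool is unrestricted.
  toolPermissions: Record<string, string[] | undefined>;
}

// callerId is null for anonymous callers (no session and no API key).
export function canCallTool(policy: AccessPolicy, callerId: string | null, tool: string): boolean {
  // App-level gate: invite-only apps reject anonymous and non-listed callers.
  if (policy.visibility === "invite_only") {
    if (callerId === null || !policy.allowedCallerIds.includes(callerId)) return false;
  }
  // Tool-level gate: only applies when the tool has an explicit allow list.
  const toolAllowList = policy.toolPermissions[tool];
  if (toolAllowList === undefined) return true;
  return callerId !== null && toolAllowList.includes(callerId);
}
```

The middleware on `/api/:slug/run` and the MCP routes would resolve the caller first and then call a function like this, so the same rule applies on every surface.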

**Week 3 — Activity + Usage**

Day 11: Alter `runs` table (caller_id, caller_type, api_key_id, ip, user_agent). Backfill existing rows with `caller_type='anonymous'`.

Day 12: `/api/apps/:slug/activity?from=&to=&caller=&status=` with pagination.

Day 13: `/api/apps/:slug/usage` aggregating runs per day, avg duration, success rate, top callers. Response is JSON ready for charts.
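
To make the Day 13 response shape concrete, here is a sketch of the aggregation as a pure function over run records. The field names (`startedAt`, `durationMs`, `callerId`) and the `UsageSummary` shape are assumptions; the real endpoint would more likely push this into a SQL aggregate over `runs`, and this version assumes timestamps are stored as UTC ISO 8601 strings.

```typescript
// Hypothetical projection of a row in the runs table.
interface RunRecord {
  startedAt: string; // ISO 8601 in UTC, e.g. "2026-05-04T09:00:00Z"
  durationMs: number;
  status: "succeeded" | "failed";
  callerId: string;
}

interface DailyUsage {
  day: string; // YYYY-MM-DD (UTC)
  runs: number;
  avgDurationMs: number;
  successRate: number; // 0..1
}

interface UsageSummary {
  daily: DailyUsage[];
  topCallers: { callerId: string; runs: number }[];
}

export function summarizeUsage(runs: RunRecord[], topN = 5): UsageSummary {
  const byDay = new Map<string, { runs: number; totalMs: number; ok: number }>();
  const byCaller = new Map<string, number>();

  for (const run of runs) {
    const day = run.startedAt.slice(0, 10); // date part of the ISO timestamp
    const acc = byDay.get(day) ?? { runs: 0, totalMs: 0, ok: 0 };
    acc.runs += 1;
    acc.totalMs += run.durationMs;
    if (run.status === "succeeded") acc.ok += 1;
    byDay.set(day, acc);
    byCaller.set(run.callerId, (byCaller.get(run.callerId) ?? 0) + 1);
  }

  // Days in ascending order, ready for a line or bar chart.
  const daily = Array.from(byDay.entries())
    .sort((a, b) => a[0].localeCompare(b[0]))
    .map(([day, acc]) => ({
      day,
      runs: acc.runs,
      avgDurationMs: acc.totalMs / acc.runs,
      successRate: acc.ok / acc.runs,
    }));

  // Busiest callers first; ties broken by caller id for stable output.
  const topCallers = Array.from(byCaller.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, topN)
    .map(([callerId, count]) => ({ callerId, runs: count }));

  return { daily, topCallers };
}
```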

Day 14: Frontend: Activity view with filters + search + CSV export. Usage view with recharts line+bar.

Day 15: Tests: 10k synthetic runs, usage query under 200ms p95, activity pagination works, CSV export under 1s for 1000 rows.

**Week 4 — Versions + rollback**

Day 16: `app_versions` table + capture git SHA on deploy + diff summary between consecutive versions.

Day 17: `/api/apps/:slug/versions` listing + `/api/apps/:slug/versions/:sha/rollback`. Rollback shells out to `git checkout` in the app's code_path, rebuilds Docker image, restarts runner.

Day 18: Frontend: timeline component with deploy events, click to expand diff viewer (react-diff-view or shiki-diff).

Day 19: Tests: rollback to previous version works, rollback to a version that's been git-gc'd fails gracefully, concurrent deploy+rollback is serialized.
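
One way to get the "concurrent deploy+rollback is serialized" behavior is a per-app promise queue. The sketch below is an assumption about how that could be done in a single server process (which matches the single-image OSS setup); `PerAppQueue` is a hypothetical name, and it would not coordinate across multiple processes.

```typescript
// Runs tasks for the same app strictly one after another; different apps run independently.
export class PerAppQueue {
  // Per app, a promise that settles when the most recently queued task has finished.
  private readonly tails = new Map<string, Promise<void>>();

  run<T>(appSlug: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(appSlug) ?? Promise.resolve();
    // Start this task only after the previous one for the same app has settled.
    const result = previous.then(() => task());
    // The stored tail swallows errors so one failed task does not block later ones.
    const tail = result.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(appSlug, tail);
    return result;
  }
}
```

A deploy handler and a rollback handler would both wrap their work in `queue.run(slug, ...)`, so a rollback requested mid-deploy waits for the deploy to finish rather than racing it on the same checkout.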

Day 20: Safety: rollback button is behind a `confirm` dialog showing what will change.

**Week 5 — Schedules + webhooks**

Day 21: `app_schedules` table. `node-cron` worker started from the `boot()` function in `apps/server/src/index.ts`, which loads all enabled schedules on boot.

Day 22: `/api/apps/:slug/schedules` CRUD. UI: cron expression input with human-readable preview ("every day at 9am"), next-run preview.

Day 23: `app_webhooks` + `webhook_deliveries` tables. Outbound delivery worker with exponential backoff: retry delays of 1s, 5s, 25s, 125s, 625s (longest single wait ~10 min, ~13 min across all five retries).
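
The retry schedule is a geometric series with base 1s and factor 5. A small sketch that generates it and decides when to dead-letter; the function names are hypothetical and the defaults simply restate the numbers in this plan.

```typescript
// Delay before retry i (0-based) is baseMs * factor^i.
export function retryDelaysMs(retries = 5, baseMs = 1000, factor = 5): number[] {
  return Array.from({ length: retries }, (_, i) => baseMs * factor ** i);
}

// Given how many retries have already been made, returns the wait before the
// next one, or null when retries are exhausted and the delivery should be dead-lettered.
export function nextRetryDelayMs(retriesSoFar: number, delays: number[] = retryDelaysMs()): number | null {
  return delays[retriesSoFar] ?? null;
}
```

With the defaults, `retryDelaysMs()` yields 1000, 5000, 25000, 125000, and 625000 ms, which sum to 781000 ms, so a receiver that stays down for more than about 13 minutes ends up in the dead-letter state.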

Day 24: `/api/apps/:slug/webhooks` CRUD. UI: endpoint URL, events (`run.succeeded`, `run.failed`, `run.started`), secret for HMAC signing, delivery log.
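
HMAC signing of webhook payloads can be sketched with Node's built-in `crypto` module. The plan only says "secret for HMAC signing", so SHA-256, hex encoding, and the function names below are assumptions, as is whatever header the signature would travel in.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sender side: sign the exact request body bytes with the webhook's secret.
export function signPayload(secret: string, body: string): string {
  return createHmac("sha256", secret).update(body, "utf8").digest("hex");
}

// Receiver side: recompute the signature and compare in constant time.
export function verifySignature(secret: string, body: string, signatureHex: string): boolean {
  const expected = Buffer.from(signPayload(secret, body), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws if the lengths differ, so check length first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The receiver needs to verify against the raw body string as received, before any JSON parsing and re-serialization, since a change in key order or whitespace changes the signature.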

Day 25: Tests: schedule fires on time (within 1s drift), webhook retries 5 times then dead-letters, HMAC signature verified on receiving end.

**Week 6 — App memory + secrets UI + custom domain**

Day 26: `app_memory` table. `/api/apps/:slug/memory/:user_id` CRUD. UI: read-only view of memory per user, editable for creator debugging.

Day 27: Secrets UI: list existing secrets (masked), create new, delete. Audit log row per access.

Day 28: Custom domain: `app_domains` table. `/api/apps/:slug/domain` + verification flow (creator adds TXT record, we poll DNS).
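
The TXT check at the heart of the verification flow can be kept as a pure function, with the DNS lookup done by the poller. In this sketch the `floom-verify=` prefix and both function names are hypothetical; the `string[][]` input matches what `resolveTxt` from `node:dns/promises` returns (one array of chunks per TXT record).

```typescript
// One entry per TXT record; each record is an array of string chunks.
type TxtRecords = string[][];

// Hypothetical record format the creator is asked to add to their DNS zone.
export function expectedTxtValue(verificationToken: string): string {
  return `floom-verify=${verificationToken}`;
}

export function isDomainVerified(records: TxtRecords, verificationToken: string): boolean {
  const expected = expectedTxtValue(verificationToken);
  // Long TXT values can be split into several chunks, so join before comparing.
  return records.some((chunks) => chunks.join("").trim() === expected);
}
```

The poller would call `resolveTxt(domain)`, pass the result to `isDomainVerified`, and set `verified_at` on the `app_domains` row when it returns true. Treating lookup errors such as NXDOMAIN as "not yet verified" rather than as failures is a design choice left open here.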

Day 29: Frontend domain flow: "Add domain" input, show TXT record to add, verify button, verification status indicator.

Day 30: For OSS self-host, custom domain is BYO: document how to point a CNAME at the self-hosted instance + configure Caddy/Nginx. No automated SSL in OSS (that's a Phase 3 cloud feature).

**Week 7 — Feedback inbox + reviews**

Day 31: `feedback_tickets` + `feedback_replies` tables. POST `/api/apps/:slug/feedback` from users (captures their identity + context: input hash, run_id, error trace if any).

Day 32: `/api/apps/:slug/feedback` GET for creators, with filters (category, status, urgency).

Day 33: Frontend feedback inbox matching `#feedback-inbox` wireframe. Ticket detail view, reply form, mark resolved.

Day 34: `app_reviews` table. Post-run "rate this app" prompt. Reviews feed into `/p/:slug` star rating aggregation.

Day 35: Email notifications on new ticket + on reply (via Resend in cloud mode, Nodemailer in OSS mode).

**Week 8 — Empty/loading/error states + Docker rebuild + self-host testing**

Day 36-37: Every screen audited for 3 states: empty (no data yet), loading (skeleton), error (API failure). Port the `#empty-states` wireframe section faithfully.

Day 38: Rebuild Docker image `v0.2.0`, verify boot on clean Hetzner box, run end-to-end flow: deploy one app, invite one collaborator, create schedule, attach webhook, rollback.

Day 39: Self-host dogfooding: Federico deploys a fresh instance on a Hetzner CX22 (4€/month VPS) and uses it to deploy OpenPaper. Tests the full "someone who is not me" path.

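Day 40: Documentation: `docs/SELF_HOST.md` rewrite with the new feature matrix, `docs/UPGRADING.md` for v0.1.0 → v0.2.0 migration (the schema changes substantially, so a fresh install is recommended; the in-place upgrade that preserves existing apps and runs is what the migration tests cover).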

### Testing protocol

**Technical (Vitest + Playwright + Docker CI)**

| Layer | Coverage | Tool |
|---|---|---|
| Unit tests | 80% line coverage on `services/*.ts`, 100% on middleware | Vitest |
| Integration tests | every API endpoint (hub, parse, pick, thread, run, mcp, apps, auth, feedback, reviews, schedules, webhooks, domains, versions, memory, access) | Vitest + supertest |
| E2E tests | full flows: sign up → deploy app → invite collaborator → schedule run → see usage → rollback | Playwright |
| Docker image tests | `docker run` on Ubuntu 22.04, Debian 12, Alpine 3.20, verify boot and `curl /api/health` | GitHub Actions matrix |
| Migration tests | v0.1.0 → v0.2.0 upgrade preserves existing apps and runs | Vitest + fixture DB |
| MCP integration test | Claude Desktop connects to `/mcp/app/openpaper`, lists tools, calls one | Scripted via Claude Desktop config + assertion |
| Load test | 100 concurrent runs, 1000 schedules, 10k webhooks/day | `autocannon` + `oha` |

**Product QA — Maria walk (as user)**

1. Signs up on a fresh self-host with email + password
2. Browses the store, finds OpenPaper
3. Installs it (single click, now works)
4. Runs OpenPaper via Chat UI, gets a result
5. Rates the app 5 stars, writes a short review
6. Opens her `/me` dashboard, sees the run in history
7. Schedules OpenPaper to run every Monday morning
8. Reports a bug via the report button, gets an email confirmation

**Product QA — Jannik walk (as creator)**

1. Signs up with Google OAuth
2. Clicks "Deploy an app", points CLI at `github.com/floomhq/flyfast`
3. Watches the deploy log, sees the app appear in his dashboard
4. Opens `/p/flyfast/dashboard`, sees 13 sections populated
5. Configures "Who can use it": public, anyone, rate limit 100/hour
6. Adds API key for his Cursor integration
7. Creates a schedule: run every Monday at 9am with a set of 3 routes
8. Adds a webhook to a Slack Incoming Webhook URL (receives `run.succeeded`)
9. Runs the app once, sees the activity log, sees the usage chart update
10. Pushes a new version of FlyFast, sees it appear in the Versions timeline
11. Rolls back to previous version after noticing a regression
12. Opens Feedback inbox, finds Maria's bug report, replies to her, marks resolved
13. Sees a new review on his app, responds to it

**Product QA — Yash walk (developer via MCP)**

1. Self-hosts Floom via `docker run` on his own box
2. Configures Claude Desktop to point at `http://localhost:3051/mcp/app/flyfast`
3. Asks Claude "find me a cheap flight to Tokyo next week"
4. Claude discovers the tools via MCP handshake, calls `search_flights`, returns results
5. Checks the activity log, sees his call logged as `caller_type='mcp_client'`
6. Generates an API key scoped to `search_flights` only
7. Tests that `list_airports` is blocked with 403
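
Steps 6 and 7 of the Yash walk hinge on a per-tool scope check. A minimal sketch of that check follows, assuming a hypothetical `ApiKey` record with an `allowedTools` list; the field names and the `"*"` wildcard are illustrative, not the shipped schema.

```typescript
// Hypothetical API key record; field names are assumptions for illustration.
interface ApiKey {
  id: string;
  // Tool names this key may call; "*" is an assumed wildcard for "every tool".
  allowedTools: string[];
  revoked: boolean;
}

type AuthDecision =
  | { status: 200 }
  | { status: 401; reason: string }
  | { status: 403; reason: string };

// Decide whether a key may call a given tool.
// 401 = no usable key, 403 = valid key without scope for this tool.
function authorizeToolCall(key: ApiKey | undefined, tool: string): AuthDecision {
  if (!key || key.revoked) {
    return { status: 401, reason: "missing or revoked API key" };
  }
  if (key.allowedTools.includes("*") || key.allowedTools.includes(tool)) {
    return { status: 200 };
  }
  return { status: 403, reason: `key ${key.id} is not scoped to ${tool}` };
}
```

With a key scoped to `search_flights` only, a call to `list_airports` resolves to 403, which is the behavior step 7 asserts.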

**Unhappy paths**

| Scenario | Expected behavior | Test |
|---|---|---|
| SQLite DB unavailable (Postgres in Cloud) | `/api/health` returns 503 and no other requests are served | delete or lock the SQLite file, then check the health endpoint |
| Disk full | Runs return 507, dashboard shows warning banner | `dd` to fill disk, verify behavior |
| Malformed OpenAPI spec on boot | App flagged as `status='invalid'` in DB, other apps boot fine | ingest a broken YAML |
| Runtime crash mid-run | Run marked `status='crashed'` with stack trace in `error_type` field | kill docker container mid-run |
| Schedule hits a failing app | First failure fires webhook, second failure pages creator, third failure disables schedule | force 3 consecutive failures |
| Webhook endpoint returns 500 | Retry 5 times with backoff, then dead-letter | mock webhook endpoint returning 500 |
| API key revoked mid-call | In-progress call completes, next call 401 | revoke key during run |
| Session expired | Redirect to /signin with return URL | mock expired session |
| Rate limit exceeded | 429 with Retry-After header | hammer an endpoint |
| Custom domain DNS misconfigured | Verification fails after 5 polls, shows clear error + troubleshooting link | misconfigure DNS |
| Binary file output from runner | Served with correct content-type, download link generated | run an app that outputs PDF |
| Large JSON (100MB) from runner | Streams response, no memory blow-up | run an app that returns massive JSON |
| UTF-8 in inputs (emojis, Arabic) | Passes through correctly, rendered correctly in activity log | run with `"input": "سلام 🌍"` |
| 0 apps installed state | Empty dashboard with "Install your first app" CTA | fresh install |
| 1000 apps installed | Dashboard paginates, search is fast | seed 1000 apps |
| Concurrent runs of same app | Each gets its own container, no state bleeding | fire 20 concurrent runs |
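
The "Schedule hits a failing app" row describes a three-step escalation. It can be sketched as a small pure state transition, which keeps the rule easy to unit-test. The type and function names are hypothetical, and the assumption that any success resets the streak follows from the "3 consecutive failures" wording in the test column.

```typescript
type EscalationAction = "fire_webhook" | "page_creator" | "disable_schedule";

interface ScheduleState {
  consecutiveFailures: number;
  enabled: boolean;
}

interface Transition {
  state: ScheduleState;
  action: EscalationAction | null;
}

// Apply one run outcome and return the new state plus the action to take.
function recordRunOutcome(state: ScheduleState, succeeded: boolean): Transition {
  if (succeeded) {
    // Assumption: a success resets the consecutive-failure streak.
    return { state: { ...state, consecutiveFailures: 0 }, action: null };
  }
  const failures = state.consecutiveFailures + 1;
  if (failures === 1) {
    return { state: { ...state, consecutiveFailures: failures }, action: "fire_webhook" };
  }
  if (failures === 2) {
    return { state: { ...state, consecutiveFailures: failures }, action: "page_creator" };
  }
  // Third (or later) consecutive failure: disable the schedule.
  return { state: { consecutiveFailures: failures, enabled: false }, action: "disable_schedule" };
}
```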

**Security**

| Attack surface | Test |
|---|---|
| SQL injection | Drizzle parameterizes everything; also run `sqlmap` against `/api/hub?q=` |
| XSS in reviews / feedback / app descriptions | Render via DOMPurify, Playwright test injects `<script>` |
| CSRF on mutation endpoints | Better Auth sets SameSite cookies, assertion test |
| Auth bypass | Manual review of every middleware; `zap-baseline` scan |
| Privilege escalation | User A cannot read User B's app memory, tested via integration |
| Secret leakage in logs | Grep logs for secrets patterns, regex redaction in log-stream.ts |
| Webhook HMAC timing attack | Use `crypto.timingSafeEqual` for verification |
| OAuth state parameter | Better Auth handles this; still test with tampered state |
| Docker escape | Use `docker run --security-opt=no-new-privileges` and seccomp profile |
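
For the webhook HMAC row, a minimal sketch of constant-time signature verification with Node's `crypto.timingSafeEqual`. HMAC-SHA256 and hex encoding are assumptions here; the roadmap does not specify the signature scheme.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the hex HMAC-SHA256 signature of a payload (assumed scheme).
function signPayload(secret: string, payload: string): string {
  return createHmac("sha256", secret).update(payload).digest("hex");
}

// Compare the expected and received signatures in constant time.
function verifySignature(secret: string, payload: string, receivedHex: string): boolean {
  const expected = Buffer.from(signPayload(secret, payload), "hex");
  const received = Buffer.from(receivedHex, "hex");
  // timingSafeEqual throws if lengths differ, so reject mismatched lengths first.
  if (expected.length !== received.length) return false;
  return timingSafeEqual(expected, received);
}
```

The length check leaks only the signature length, which is public for a fixed hash, so it does not reintroduce the timing side channel.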

**Performance**

| Metric | Target |
|---|---|
| `/api/hub` response time | p95 < 100ms |
| `/api/:slug/run` dispatch | p95 < 500ms (not including runner time) |
| `/mcp/app/:slug/tools/list` | p95 < 100ms |
| Docker image size | < 500MB compressed |
| Cold boot (docker start → /api/health 200) | < 10s |
| Memory footprint idle | < 400MB RSS |
| 100 concurrent runs | no 5xx, p95 latency under 2s |
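
The p95 targets above need a consistent definition when asserted in CI. One possible sketch uses the nearest-rank percentile; the helper names are hypothetical, and load tools such as `autocannon` report their own percentiles, which may use a different interpolation.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// True when the p95 of the samples is strictly under the target, e.g. 100 ms
// for /api/hub or 500 ms for run dispatch.
function meetsP95Target(samplesMs: number[], targetMs: number): boolean {
  return percentile(samplesMs, 95) < targetMs;
}
```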

**Chaos**

Run 3 chaos scenarios before declaring Phase 2 done:

1. Kill the runner container mid-run, verify a new container spins up and the run is marked `crashed` with a useful error
2. Fill the disk to 95%, verify new runs are rejected with a clear error, existing runs don't corrupt the DB
3. DNS failure to `ghcr.io`, verify boot still works from cached image

### ICP walkthrough

**Maria**: After Phase 2, on a self-hosted instance she can sign up, browse, install, run, rate, schedule, report bugs, and see her run history. She trusts it because every app has the production layer. What's broken: she still needs someone to self-host it for her (she won't do it herself). What's missing: the cloud version, so she doesn't need to beg a sysadmin. What delights her: `/me` dashboard is Notion-clean, schedules just work, reporting a bug feels like Intercom.

**Jannik**: After Phase 2, he can fully ship an app to production on his own infra. Auth, access, rate limits, logs, schedules, webhooks, versions, rollback, secrets, feedback inbox — all real. He stops worrying about "what if this blows up" because Floom wraps the safety net around his vibe-coded code. What's broken: custom domains need him to configure Caddy manually. What's missing: managed SSL (Phase 3). What delights him: the rollback button. When his Monday-morning deploy breaks he can roll back in 30 seconds.

**Yash**: After Phase 2, the MCP integration is production-grade. API keys scoped per-tool, rate limits, audit log per key. He can expose Floom apps to his entire team via one MCP endpoint. What's broken: nothing major. What's missing: Composio OAuth (Phase 4) so he can connect his team's Gmail and Notion to Floom apps without a hand-built OAuth flow. What delights him: the per-tool permission matrix (he wants `search_flights` exposed but `charge_card` blocked).

### Success criteria

- `docker run ghcr.io/floomhq/floom-monorepo:v0.2.0` boots the full stack on a clean Ubuntu 22.04 box in under 30 seconds
- All Vitest unit + integration tests green; Playwright E2E green
- One non-Federico person (candidate: Jannik, Gourav, or a f.inc engineer) successfully self-hosts and deploys one of their own apps end-to-end without help
- The 13 dashboard sections from the v8 wireframe are real and populated with real data
- `docs/SELF_HOST.md` walks someone through fresh install in 10 minutes

### Time estimate

**8 calendar weeks, 200 focused hours**. Solo founder working 25 hours/week = exactly 8 weeks with zero slack. Realistic: 9-10 weeks if any week loses to meetings or illness. This is the biggest risk on the entire roadmap.

### Key risks for this phase

1. **Auth library instability**: Auth.js has been folded into Better Auth, so the API could shift. Pin the version, read the changelog weekly. **Impact: High, likelihood: Low**.
2. **Schema migration complexity**: Moving from inline `db.exec` to Drizzle-kit is a refactor in itself. Budget 2 days just for that. **Impact: Medium, likelihood: High**.
3. **Docker image bloat**: Adding Better Auth + Drizzle + Resend + node-cron adds ~40MB. Target stays under 500MB. **Impact: Low, likelihood: Medium**.
4. **Custom domain SSL in OSS**: Users will demand it. Decision: OSS docs show how to use Caddy's automatic HTTPS; don't build our own. **Impact: Medium, likelihood: High**.
5. **Scope creep from Jannik / Gourav feedback**: They will want billing, SSO, SAML, etc. Say "Phase 3 cloud only" and hold the line.
6. **Federico context-switching to other projects**: OpenPaper, FlyFast, Rocketlist, SignalDash, bulk.run all compete for attention. Phase 2 is the single biggest block of focused work. Protect it with a `WORKPLAN-20260501-floom-phase2.md` and refuse other work.

---

## 5 · Phase 3 · Cloud multi-tenant wrap

### Objective

Wrap the same Docker core with cloud multi-tenancy: workspaces, users, SSO, immutable audit logs, managed Postgres, managed custom domains, Stripe partner app billing, log streaming, advanced analytics. This is `floom.dev` as a real platform.

### What exists before this phase

All of Phase 2 (the full Docker OSS core with auth, access, logs, versions, schedules, webhooks, memory, vault, domain BYO, feedback, reviews).

### Deliverables

1. **Workspaces / orgs**: multi-user containers, roles (admin / editor / viewer), invite flow, transfer ownership
2. **SSO**: SAML via Better Auth SAML plugin, start with Google Workspace, add Okta + Microsoft Entra later
3. **Immutable audit log**: append-only, exportable as CSV/JSON, retention 12 months for Cloud Pro, unlimited for Enterprise
4. **Managed custom domains**: wildcard SSL via Cloudflare for SaaS, creator points CNAME at `routes.floom.dev`, cert auto-provisioned via Cloudflare SSL-for-SaaS API
5. **Creator onboarding**: first-time creator flow (post-signup), empty dashboard state with "Deploy your first app" CTA + video
6. **User onboarding**: first app install flow, welcome tour, connected tools setup prompt
7. **Stripe partner app billing**: the Stripe app in the store handles "charge users for your app", Floom does not handle billing directly (per positioning memo), but Floom Cloud itself bills creators for platform usage
8. **Advanced analytics**: cohorts (new user → returning user → power user), funnels (install → first run → 10th run), CSV export
9. **Log streaming**: outbound destinations (Datadog, S3, Splunk, plain webhook), configured per workspace
10. **External secrets**: Vault and AWS KMS integration via per-workspace credentials
11. **Managed Postgres**: Cloud instances use Neon or Supabase Postgres, not SQLite. Migration path from SQLite self-host → Postgres cloud
12. **Rate limiting + fair-use enforcement**: token bucket in Redis (Upstash), per-workspace quotas, overage billing via Stripe
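
Deliverable 12 names a token bucket. The algorithm can be sketched in memory as below; in Cloud the bucket state would live in Redis rather than a `Map`, and the class and method names here are hypothetical. The caller passes the clock in, which keeps the logic deterministic under test.

```typescript
interface Bucket {
  tokens: number;
  lastRefillMs: number;
}

type TakeResult =
  | { allowed: true }
  | { allowed: false; retryAfterSeconds: number };

class TokenBucketLimiter {
  private readonly buckets = new Map<string, Bucket>();
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Try to spend one token for `key` (e.g. a workspace ID) at time `nowMs`.
  take(key: string, nowMs: number): TakeResult {
    const bucket = this.buckets.get(key) ?? { tokens: this.capacity, lastRefillMs: nowMs };
    // Refill proportionally to elapsed time, capped at capacity.
    const elapsedSeconds = Math.max(0, nowMs - bucket.lastRefillMs) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.lastRefillMs = nowMs;
    this.buckets.set(key, bucket);
    if (bucket.tokens >= 1) {
      bucket.tokens -= 1;
      return { allowed: true };
    }
    // Seconds until one full token is available; suitable for a Retry-After header.
    const retryAfterSeconds = Math.ceil((1 - bucket.tokens) / this.refillPerSecond);
    return { allowed: false, retryAfterSeconds };
  }
}
```

A limit of 100 runs per hour would map to `new TokenBucketLimiter(100, 100 / 3600)`. A rejected call carries the value for the `Retry-After` header on the 429 response.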

### Tech stack decisions

| Choice | Alternative weighed | Why |
|---|---|---|
| **Vercel** (frontend) | Cloudflare Pages, Netlify, self-host | Already decided in Phase 1, Federico's fastest path |
| **GCP Cloud Run** (backend) | AWS Lambda, Fly.io, Kubernetes | Matches the existing OpenPaper backend pattern (same GCP project `core-planet-486109-r5`), Federico has auth workflow figured out (gcloud token from Mac), Cloud Run scales to zero, HTTPS + custom domains free |
| **Neon** (Postgres) | Supabase, RDS, Crunchy, self-host | Branching support for preview envs is killer, free tier 0.5GB, serverless scales to zero, compatible with Drizzle. Supabase is equally fine but Neon's branching aligns with per-PR preview deploys |
| **Cloudflare SSL-for-SaaS** (managed domains) | Let's Encrypt + Caddy, AWS ACM | Cloudflare handles the DNS + cert + edge routing in one API. Creators point `CNAME app.their-domain.com → routes.floom.dev`, we provision the cert automatically. Costs $0.05/cert/month, trivial |
| **Upstash Redis** (rate limiting, session cache) | Redis Cloud, ElastiCache | Serverless, REST API, free tier 10k commands/day, no connection pool hell on Cloud Run |
| **Stripe** (billing) | Paddle, LemonSqueezy | Federico has Stripe working in Rocketlist and OpenPaper; muscle memory matters |
| **Inngest** (background jobs, scheduled runs in Cloud) | BullMQ, Trigger.dev, Temporal | For Cloud, schedules and webhooks need a durable runner that survives Cloud Run container cycling. Inngest is the most Next.js-native, has a generous free tier, and wraps around our existing SQLite-backed cron for OSS |
| **PostHog** (product analytics) | Mixpanel, Amplitude | Already decided in Phase 1 |
| **Sentry** (error tracking) | Rollbar, Bugsnag | Industry default, Next.js integration one-click |
| **Plain** (support inbox) | Intercom, Zendesk, HelpScout | Developer-friendly, free tier, API-first, and Federico already plans to use it for the waitlist |

### Week-by-week build steps

**Realistic estimate: 6 calendar weeks, 150 focused hours**.

**Week 1 — Workspaces + roles + invites**

- Schema: `workspaces`, `workspace_members`, `workspace_invites`, `workspace_audit_log`
- `/api/workspaces/*` CRUD + `/api/workspaces/:id/invite` with tokenized email links
- Frontend: workspace switcher in nav, settings → members page, invite flow
- Migration: self-host single-user mode → cloud multi-workspace mode (existing users become workspace admins)

**Week 2 — SSO (Google Workspace first)**

- Better Auth SAML plugin, Google Workspace as first provider
- Workspace admin configures SAML metadata URL + certificate
- Sign-in flow: email domain detection → SSO redirect
- Test with a real Google Workspace domain (floom.dev itself)
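
The "email domain detection → SSO redirect" step can be sketched as a lookup from the email's domain to a workspace's SSO entry point. The `SsoDirectory` shape and the fallback to the password form are assumptions; the actual redirect would be issued by the auth library.

```typescript
// Hypothetical mapping from a verified email domain to a workspace's SSO entry point.
type SsoDirectory = Map<string, { workspaceId: string; ssoUrl: string }>;

type SignInRoute =
  | { kind: "sso"; workspaceId: string; redirectTo: string }
  | { kind: "password" };

// Pick the sign-in route for an email address.
function routeSignIn(email: string, directory: SsoDirectory): SignInRoute {
  const at = email.lastIndexOf("@");
  // Malformed addresses fall through to the normal form, which validates them.
  if (at <= 0 || at === email.length - 1) return { kind: "password" };
  const domain = email.slice(at + 1).trim().toLowerCase();
  const entry = directory.get(domain);
  if (!entry) return { kind: "password" };
  return { kind: "sso", workspaceId: entry.workspaceId, redirectTo: entry.ssoUrl };
}
```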

**Week 3 — Managed custom domains + audit log**

- Cloudflare SSL-for-SaaS API integration
- `/api/workspaces/:id/domains` with verification + provisioning
- Cert auto-renewal via Cloudflare
- Immutable `workspace_audit_log` table, append-only (no UPDATE, no DELETE — enforced at the DB level)
- `/api/workspaces/:id/audit?from=&to=&actor=&action=` with CSV export

**Week 4 — Advanced analytics + log streaming**

- Cohort analysis (new → returning → power user) via SQL materialized views refreshed hourly
- Funnel builder (install → first run → 10th run) via a simple event table
- Log streaming: per-workspace destinations (Datadog HTTP endpoint, S3 bucket, Splunk HEC, generic webhook), delivered via Inngest background jobs
- Frontend: Analytics view with cohort + funnel tabs, CSV export
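
The install → first run → 10th run funnel reduces to counting users per stage from the event table. A minimal in-memory sketch follows, assuming a hypothetical `AppEvent` shape and that only users with an install event enter the funnel; the production version would be a SQL query over the same events.

```typescript
// Hypothetical event row; the real event table may carry more columns.
interface AppEvent {
  userId: string;
  type: "install" | "run";
}

interface FunnelCounts {
  installed: number;
  firstRun: number;
  tenthRun: number;
}

// Count how many installing users reached each funnel stage.
function computeFunnel(events: AppEvent[]): FunnelCounts {
  const installed = new Set<string>();
  const runsByUser = new Map<string, number>();
  for (const event of events) {
    if (event.type === "install") {
      installed.add(event.userId);
    } else {
      runsByUser.set(event.userId, (runsByUser.get(event.userId) ?? 0) + 1);
    }
  }
  let firstRun = 0;
  let tenthRun = 0;
  installed.forEach((userId) => {
    const runs = runsByUser.get(userId) ?? 0;
    if (runs >= 1) firstRun += 1;
    if (runs >= 10) tenthRun += 1;
  });
  return { installed: installed.size, firstRun, tenthRun };
}
```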

**Week 5 — Stripe billing (platform + partner app)**

- Platform billing: Floom Cloud charges workspaces (Cloud Free, Pro $29/mo, Team $99/mo, Enterprise custom). Stripe subscriptions, usage-based overage for runs > included quota.
- Partner app billing: the "Stripe" app in the store lets creators charge their users per-run or per-subscription. Creator connects their own Stripe account via Stripe Connect, Floom takes a 5% platform fee.
- Frontend: `/settings/billing` page, invoice history, usage charts, upgrade flow
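
The 5% platform fee on partner app billing is easiest to reason about in integer cents. A minimal sketch follows; rounding to the nearest cent is an assumption, and the payment processor's own fees are ignored.

```typescript
// 5% platform fee expressed in basis points (1 bps = 0.01%).
const PLATFORM_FEE_BPS = 500;

// Split a charge into Floom's platform fee and the creator's share, in cents.
function splitCharge(amountCents: number): { platformFeeCents: number; creatorCents: number } {
  if (!Number.isInteger(amountCents) || amountCents < 0) {
    throw new RangeError("amountCents must be a non-negative integer");
  }
  // Assumption: round the fee to the nearest cent.
  const platformFeeCents = Math.round((amountCents * PLATFORM_FEE_BPS) / 10000);
  return { platformFeeCents, creatorCents: amountCents - platformFeeCents };
}
```

For a $10/month subscription, `splitCharge(1000)` yields a 50-cent fee and 950 cents for the creator.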

**Week 6 — External secrets + load testing + launch**

- External secrets plugins: Vault (HCP Vault), AWS KMS. Creator configures integration per workspace.
- Load test: simulate 100 workspaces, 1000 apps total, 10k runs/day. Assert p95 run dispatch < 500ms, no OOM.
- Security review: pay a third party (Trail of Bits or equivalent) for 2-day pentest. $5k budget.
- SOC 2 prep checklist: access controls documented, audit logs enabled by default for Enterprise, encryption at rest (Neon handles), encryption in transit (HTTPS everywhere), incident response runbook.
- Soft launch to 10 beta workspaces (friends of Federico + f.inc cohort)

### Testing protocol

**Technical**

| Check | Target |
|---|---|
| Load test | 100 concurrent users, 10k apps, 1M runs/day simulated without 5xx |
| p95 run dispatch latency | < 500ms |
| p95 `/api/hub` | < 200ms (including Postgres query) |
| Cloud Run cold start | < 3s |
| SSO round-trip | < 2s end-to-end |
| Audit log append | < 10ms |
| Custom domain provisioning | cert issued within 60s of verification |
| Uptime | 99.9% target for 30 days pre-launch |
| Security scan | pass third-party pentest with zero critical findings |
| SOC 2 prep | controls documented, 70% of Type 1 requirements met |

**Product QA — Maria walk**

1. Invited to a workspace by her Ops team lead, clicks email invite, signs in with Google Workspace SSO
2. Sees the shared FlyFast install in her workspace
3. Runs it, gets results
4. Her run appears in the workspace audit log (visible to the admin)
5. Attempts to modify FlyFast settings: blocked because she's a viewer
6. Installs a personal app in her `/me` dashboard
7. Cancels her personal account, data is deleted within 24h (GDPR)

**Product QA — Jannik walk**

1. Signs up as a creator via `floom.dev`
2. Goes through the first-time onboarding wizard (video + "Deploy your first app" CTA)
3. Connects his GitHub account, selects `jannik/tuyo` repo, clicks Deploy
4. Watches deploy logs stream live (via Inngest-backed status updates)
5. App is live at `tuyo.jannik.workspace.floom.dev`
6. Adds a custom domain `tuyo.jannik.dev`, configures the CNAME, cert provisions in 60s
7. Invites his 2 cofounders, enables SSO for his workspace
8. Sets per-tool rate limits on his app
9. Hooks up Stripe Connect, lists his app for $10/month, sees the first subscription come in
10. Views advanced analytics: 20 installs, 120 runs, $0.50 in fees, 1 churn

**Product QA — Enterprise IT admin walk**

1. Contacts sales via `/enterprise`
2. Signs a 12-month contract, onboarded to Cloud Enterprise
3. Configures SAML via Google Workspace (Okta support comes after Phase 3), enforces SSO-only login
4. Exports audit log for last 30 days as CSV
5. Configures log streaming to their Datadog instance
6. Sets up external secrets via HCP Vault
7. Runs a compliance report showing all runs, all callers, all IP addresses, all outcomes

**Unhappy paths**

| Scenario | Expected behavior |
|---|---|
| SSO misconfigured (bad cert) | Clear error message, admin can fall back to email login to fix |
| Audit log table growing too large | Partitioned by month, old partitions archived to S3 |
| Stripe webhook delivery delay | Platform handles idempotency via webhook signature + event ID dedup |
| Custom domain DNS misconfig after cert issued | Cert stays valid, traffic fails with clear error page |
| Workspace name collision | Unique constraint on slug, UI prompts for alternative |
| Seat limit hit | Block new invites, show upgrade CTA |
| Trial expired | Workspace moves to read-only for 14 days, then archived |
| Payment failed | Grace period 7 days, dunning emails, downgrade to free tier |
| Data retention policy triggered | Old runs archived to cold storage at 90 days for free, 365 days for Pro |
| User leaves workspace | Personal runs preserved, workspace runs remain for audit |
| Creator offboards | Apps paused, users notified 30 days in advance |
| Workspace deletion with active apps | 30-day soft delete, recovery flow, then hard delete |
| Workspace merger (acquisition) | Manual admin flow, documented, done via support |

**Security**

- Third-party pentest (2 days, $5k budget)
- OWASP Top 10 checklist
- Dependency audit via `pnpm audit --audit-level=high`
- Secret scanning via `gitleaks` in CI
- Row-level security on Neon for multi-tenant queries
- Encryption at rest (Neon) + in transit (TLS 1.3)

### ICP walkthrough

**Maria**: Cloud Phase 3 is where Maria becomes a real user. She can join her company's workspace without asking IT to self-host anything, SSO works, audit log reassures her security team, and she pays $0 because her company has a Cloud Pro subscription. What delights her: it feels like Notion or Linear — polished, fast, no friction. What's missing: the tools she cares about (Gmail, Google Sheets, Slack) aren't yet connectable (that's Phase 4).

**Jannik**: Cloud Phase 3 is where Jannik monetizes. He points his custom domain, lists his app on the store with a $10/month price via the Stripe partner app, sees real users installing, sees the revenue line tick up. The advanced analytics (cohorts, funnels) help him understand his users for the first time. What delights him: managed SSL and managed domains are invisible — he forgets they're a hard problem. What's missing: co-marketing support from Floom to drive traffic to his app listing.

**Enterprise IT admin**: Cloud Phase 3 is where enterprise becomes possible. SAML, audit, log streaming, data retention, external secrets. What delights her: the compliance story is real. What's missing: SOC 2 Type 2, full SAML support for Okta/Entra/OneLogin (not just Google Workspace), HIPAA BAA.

### Success criteria

- 10 beta workspaces onboarded, 3 of them paying
- 0 security incidents during beta
- 99.9% uptime for 30 days pre-launch
- Third-party pentest passed with zero critical findings
- SOC 2 Type 1 controls 70% documented

### Time estimate

**6 calendar weeks, 150 focused hours**. Can compress to 5 weeks if Federico is heads-down in SF after f.inc starts. More likely slips to 7-8 weeks because SSO + audit + billing + load testing each have surprise complexity.

### Key risks for this phase

1. **Billing complexity**: Stripe Connect + platform fee + usage-based overage = 3 weeks of work hidden inside a "1 week" bullet. **Impact: High, likelihood: High**. Mitigation: use Stripe's `Subscriptions` + `Metered Billing` primitives as-is, don't invent anything.
2. **Cloudflare SSL-for-SaaS gotchas**: cert provisioning can fail silently. Mitigation: verify every cert with `openssl s_client` in the verify endpoint.
3. **Neon branching for preview envs**: great idea but untested at scale. Mitigation: start with one dev branch, one prod branch, add preview-branch-per-PR only after Phase 3 ships.
4. **Inngest vs in-process cron**: running both means two sources of truth for schedules. Mitigation: OSS uses node-cron, Cloud uses Inngest, schemas are identical, code swaps via a driver pattern.
5. **SOC 2 prep cost and time**: if Federico pushes for Type 2 before launch, 6 months added. Mitigation: Type 1 only for v1.0, Type 2 after.
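
The driver pattern in risk 4 can be sketched as one interface with a driver per deployment mode. All names below are hypothetical, and the in-memory driver stands in for both the node-cron and Inngest adapters so that the sketch runs without either dependency.

```typescript
interface ScheduleSpec {
  id: string;
  appSlug: string;
  cron: string;
}

// The only surface the rest of the server would depend on.
interface SchedulerDriver {
  readonly name: string;
  register(spec: ScheduleSpec): void;
  unregister(id: string): void;
  list(): ScheduleSpec[];
}

// Stand-in driver; real adapters would wrap node-cron (OSS) or Inngest (Cloud).
class InMemoryDriver implements SchedulerDriver {
  readonly name: string;
  private readonly specs = new Map<string, ScheduleSpec>();

  constructor(name: string) {
    this.name = name;
  }

  register(spec: ScheduleSpec): void {
    this.specs.set(spec.id, spec);
  }

  unregister(id: string): void {
    this.specs.delete(id);
  }

  list(): ScheduleSpec[] {
    return Array.from(this.specs.values());
  }
}

type DeployMode = "oss" | "cloud";

// Select the driver once at boot; schedule rows keep the same schema in both modes.
function pickDriver(mode: DeployMode, drivers: Record<DeployMode, SchedulerDriver>): SchedulerDriver {
  return drivers[mode];
}
```

Because the schedule rows are identical in both modes, the database stays the single source of truth and each driver only mirrors it into its own runner.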

---

## 6 · Phase 4 · Composio integration

### Objective

Replace custom OAuth wiring with Composio for the "Connect a tool" on-ramp. Unlocks 1000+ integrations including Gmail, Notion, Google Sheets, Slack, Stripe, Shopify, HubSpot, Calendar, Linear, Figma, Airtable, GitHub, Sentry, Vercel.

### What exists before this phase

All of Phase 3 (Cloud multi-tenant platform). Phase 2 OSS does NOT get Composio — self-hosters bring their own OAuth credentials. Composio is a Cloud-only feature.

### Research findings on Composio (as of 2026-04-14)

- **Model**: hosted service, agent platform for tool integration + authentication + execution
- **Scale**: 1000+ toolkits, fully managed OAuth, token refresh, lifecycle management, white-labeling available
- **SDK**: TypeScript SDK supported, works with Claude Agent SDK, Anthropic SDK, Vercel AI SDK, Mastra, LangChain, LlamaIndex
- **Auth modes**: OAuth, API keys, custom auth flows; inline auth triggered by user intent
- **Pricing**:
  - Free: 20K tool calls/month, $0
  - Ridiculously Cheap: 200K calls, $29/month + $0.299 per 1K overage
  - Serious Business: 2M calls, $229/month + $0.249 per 1K overage
  - Enterprise: custom (SOC-2, dedicated SLA, VPC/on-prem)
- **Positioning**: "Just-in-time tool calls, secure delegated auth, sandboxed environments, parallel execution across 1,000+ apps"
- **Unknown**: rate limits on free tier, self-hosting availability (appears hosted-only), exact OAuth flow UX

### Deliverables

1. **Composio SDK** integrated into `apps/server/src/services/composio.ts`
2. **Creator-side UI**: "Connect a tool" section in dashboard, select from Composio's catalog, configure per-app
3. **User-side UI**: "Authorize X" flow when a user tries to use an app that needs e.g. their Gmail
4. **Token vault**: store Composio connection IDs per user per app in `user_connections` table; actual tokens live in Composio
5. **Refresh flow**: Composio handles refresh automatically; we handle expired-token error handling
6. **Revocation**: user can revoke a connection from their `/me/connections` page
7. **First 15 integrations enabled**: Gmail, Google Sheets, Google Calendar, Slack, Notion, Stripe, HubSpot, Shopify, Airtable, Linear, GitHub, Figma, Sentry, Vercel, X/Twitter

### Tech stack decisions

| Choice | Alternative weighed | Why |
|---|---|---|
| **Composio** | Nango, Pipedream Connect, Merge.dev, custom OAuth | Composio has the broadest tool catalog, TypeScript SDK, agent-native positioning. Alternatives: Nango is 100% open-source self-hostable (could replace Composio if pricing explodes); Pipedream is more workflow-oriented; Merge is B2B-focused (HRIS, CRM). Composio's free tier (20K calls/month) is enough for beta. Keep Nango as fallback if Composio changes pricing. |
| **User-side connection flow** | Deep link into Composio hosted UI | Composio's hosted auth UI is polished and handles edge cases (consent screens, scopes). White-labeling is available on Enterprise but not needed for v1.0 |

### Week-by-week build steps

**Realistic estimate: 2 calendar weeks, 40 focused hours**.

**Week 1 — SDK integration + 3 pilot tools**

Day 1: Sign up for Composio, read docs, get API key, integrate SDK
```bash
cd /root/floom-monorepo/apps/server
pnpm add @composio/core @composio/vercel
```
Day 2: `apps/server/src/services/composio.ts` — thin wrapper around Composio SDK: `listToolkits()`, `createConnection(user_id, toolkit)`, `executeAction(connection_id, action, params)`

Day 3: Schema: `user_connections` table (id, user_id, app_id, toolkit, composio_connection_id, status, created_at, last_used_at)

Day 4: Creator-side: in `/p/:slug/dashboard`, add "Connect a tool" section. Creator picks Gmail, we create a Composio integration scoped to their app.

Day 5: User-side: when a user runs an app that requires Gmail, we check `user_connections`. If none, we redirect to Composio's auth URL, handle the callback, store the connection ID.
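
The Day 5 decision (use the stored connection, or send the user to authorize) can be sketched as a pure function. `ConnectionProvider` is a hypothetical seam and not the Composio SDK's real API. The row shape is a subset of the Day 3 columns, and the `status` values are assumptions.

```typescript
// Subset of the `user_connections` columns from Day 3; status values are assumed.
interface UserConnection {
  userId: string;
  appId: string;
  toolkit: string;
  composioConnectionId: string;
  status: "active" | "revoked" | "expired";
}

// Hypothetical seam over the integration vendor; NOT the Composio SDK's API.
interface ConnectionProvider {
  authUrlFor(userId: string, toolkit: string): string;
}

type Resolution =
  | { kind: "ready"; connectionId: string }
  | { kind: "needs_auth"; redirectUrl: string };

// Reuse an active connection if one exists, otherwise ask the user to authorize.
function resolveConnection(
  rows: UserConnection[],
  provider: ConnectionProvider,
  userId: string,
  appId: string,
  toolkit: string,
): Resolution {
  const row = rows.find(
    (r) => r.userId === userId && r.appId === appId && r.toolkit === toolkit && r.status === "active",
  );
  if (row) return { kind: "ready", connectionId: row.composioConnectionId };
  return { kind: "needs_auth", redirectUrl: provider.authUrlFor(userId, toolkit) };
}
```

Keeping the vendor behind a narrow interface like this is also what would make a later swap to another provider a contained change.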

**Week 2 — Full catalog + polish + launch**

Day 6: Enable the rest of the 15-integration list, skipping any already enabled as Week 1 pilots (Sheets, Calendar, Slack, Notion, Stripe, HubSpot, Shopify, Airtable, Linear, GitHub, Figma, Sentry, Vercel, X)

Day 7: `/me/connections` page: list all connections, show last used, revoke flow

Day 8: Error handling: token expired, permission denied, Composio down, provider rate limit hit

Day 9: Launch the "Connect a tool" feature, announce on `floom.dev` changelog, DM 10 beta creators

Day 10: Monitor for issues, respond to any OAuth flow bugs, update docs

### Testing protocol

**Technical**

- Each of the 15 OAuth flows tested end-to-end manually
- Token refresh: force a 4-hour wait, verify Composio refreshes transparently
- Token revocation: user revokes Gmail connection, next call returns 401 with "reconnect" prompt
- Expired token mid-run: Composio SDK auto-refreshes, run completes
- Permission denied: user grants read-only but app requests write, clear error
- Composio down (mock): app shows "tool temporarily unavailable", not a 500
- Provider rate limit: Gmail API rate limit hit, backoff + user notification

**Product QA — Maria walk**

1. Installs a Floom app that needs Gmail ("Email Drafts Assistant")
2. First run prompts "Connect your Gmail"
3. Clicks through, lands on Google consent screen (via Composio)
4. Grants permission, redirected back to Floom
5. Run completes, she sees her Gmail drafts
6. Second run: no re-auth prompt, just works
7. Goes to `/me/connections`, sees Gmail listed with "Last used 2 minutes ago"
8. Revokes, tries to run again, gets "Please reconnect Gmail" prompt

**Product QA — Jannik walk**

1. Builds a new app that uses Notion API
2. In dashboard, "Connect a tool" → "Notion" → Composio SDK generates the connection scope
3. Publishes the app
4. Maria installs it, goes through Notion OAuth, works
5. Jannik sees in his analytics: 50% of installs completed the Notion OAuth, 50% dropped off
6. Adds a pre-OAuth explainer screen to improve completion rate

**Unhappy paths**

| Scenario | Expected behavior |
|---|---|
| Composio API down | Circuit breaker trips after 3 failures, app shows "tool temporarily unavailable" with ETA, queued retry via Inngest |
| OAuth provider (e.g. Google) down | User sees provider error, can retry; Composio logs the failure |
| Token revoked by user (from provider side, not Floom) | Next call 401, prompt to reconnect |
| Token expired, Composio refresh fails | Fall back to user-prompt to reconnect, don't fail the run silently |
| Permission scope mismatch | Clear error: "This app needs read+write on your Gmail, you granted read-only. Reconnect with full scope." |
| Rate limit on provider side | Exponential backoff, user-visible "Gmail is rate-limiting, retry in 60s" |
| User connects same tool twice | Composio dedupes by account; we show "already connected" |
| Multi-user app with different tool connections | Each user's connection isolated; app runner uses the calling user's connection, not the creator's |
| Composio free tier exceeded (20K calls/month) | Alert Federico by email at 80%, hard block at 100% with a clear upgrade prompt, minimal fallback to direct OAuth for critical tools |
| User cancels mid-OAuth flow | Connection not created, clean cancel state, can retry |
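
The "Composio API down" row calls for a circuit breaker that trips after 3 failures. A minimal sketch with an injectable clock follows. The 30-second cooldown and the half-open probe are assumptions, since the table only fixes the failure threshold.

```typescript
type BreakerState = "closed" | "open" | "half_open";

class CircuitBreaker {
  private failures = 0;
  private openedAtMs: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;

  // threshold = 3 comes from the table; the cooldown value is an assumption.
  constructor(threshold = 3, cooldownMs = 30000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  state(nowMs: number): BreakerState {
    if (this.openedAtMs === null) return "closed";
    return nowMs - this.openedAtMs >= this.cooldownMs ? "half_open" : "open";
  }

  // While open, callers should show "tool temporarily unavailable" instead of calling out.
  canAttempt(nowMs: number): boolean {
    return this.state(nowMs) !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAtMs = null;
  }

  recordFailure(nowMs: number): void {
    this.failures += 1;
    // Trips at the threshold; a failed half-open probe re-opens the breaker.
    if (this.failures >= this.threshold) this.openedAtMs = nowMs;
  }
}
```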

**Edge cases**

- User tries to install an app that uses 5 tools: prompts all 5 OAuths in a wizard, can authorize incrementally, partial failures allowed
- Creator updates their app to need an additional tool: existing users prompted on next run ("This app needs a new permission: Sheets. Authorize?")
- User has two Gmail accounts, wants to connect both: Composio supports multiple connections per toolkit per user, we let user pick which one at run time

### ICP walkthrough

**Maria**: Composio is the moment Maria's trust flips from "cool demo" to "daily tool." She connects her Gmail, Slack, Sheets, and now Floom apps are genuinely useful. What delights her: the OAuth consent screen is familiar (Google's real consent, not a sketchy redirect). What's broken: the first time she runs a multi-tool app she has to authorize 3 times in a row, which feels heavy. What's missing: a "bulk authorize" flow.

**Jannik**: Composio removes the #1 reason his apps don't ship: OAuth boilerplate. He was spending 2-3 days per integration writing OAuth flows; now he ships a new integration in 20 minutes. What delights him: the Composio SDK handles refresh, error mapping, and rate limit backoff for him. What's broken: Composio pricing means he can't offer unlimited free use to his users. What's missing: self-hostable Composio for when Phase 2 users want the same benefit.

**Yash**: Composio is orthogonal to MCP. He can wire a Floom app into Claude Desktop, and when Claude asks for his Gmail, Composio handles the OAuth. The MCP layer doesn't need to know Composio exists. What delights him: no OAuth code in his MCP integration. What's broken: Composio is Cloud-only, so his self-hosted Floom instance can't use it. (Decision: Phase 5 documents how to use Nango as the OSS alternative.)

### Success criteria

- 15 tools connectable from `floom.dev`
- 10 beta creators using at least one integration in a real app
- 0 credential leaks in Sentry or audit logs
- Composio API usage < 50% of free tier in first 30 days

### Time estimate

**2 calendar weeks, 40 focused hours**. This is the easiest phase. Composio does the heavy lifting. The risk is pricing: if beta usage explodes past 20K calls/month, Federico needs to move to the $29/month tier immediately.

### Key risks for this phase

1. **Composio pricing changes**: startup pricing is volatile. Mitigation: keep the Composio wrapper thin, document the Nango fallback path.
2. **Hosted-only restriction**: self-hosters can't use Composio. Mitigation: OSS users see a "Cloud feature" notice and link to the Nango OSS docs.
3. **Rate limits on free tier**: 20K calls/month = ~650/day. Beta can blow through this fast. Mitigation: budget for $29/month immediately, treat it as COGS.
4. **White-label not available on lower tiers**: OAuth consent shows "Composio" branding, which is weird for users. Mitigation: accept it for v1.0, upgrade to Enterprise for white-label when revenue justifies.
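
The free-tier arithmetic behind risk 3 can be sketched as follows. The 20,000 calls/month quota and the 80% upgrade trigger come from this roadmap; the 30-day month, the linear projection, and every function name are illustrative assumptions, not part of Composio's API.

```typescript
// Free-tier budget sketch. The quota is the figure quoted in this roadmap;
// the 30-day month and all names below are hypothetical.
const FREE_TIER_CALLS_PER_MONTH = 20_000;
const DAYS_PER_MONTH = 30;

/** Average calls per day that would exactly exhaust the monthly quota. */
function dailyBudget(monthlyQuota: number, days: number = DAYS_PER_MONTH): number {
  return monthlyQuota / days;
}

/** Linear projection of month-end usage from the usage observed so far. */
function projectedMonthlyUsage(
  callsSoFar: number,
  dayOfMonth: number,
  days: number = DAYS_PER_MONTH,
): number {
  if (dayOfMonth < 1 || dayOfMonth > days) {
    throw new RangeError("dayOfMonth out of range");
  }
  return (callsSoFar / dayOfMonth) * days;
}

/** True when projected usage reaches the given share of the quota (e.g. 0.8). */
function shouldUpgrade(callsSoFar: number, dayOfMonth: number, threshold: number): boolean {
  return projectedMonthlyUsage(callsSoFar, dayOfMonth) >= threshold * FREE_TIER_CALLS_PER_MONTH;
}
```

With these assumptions, `dailyBudget(20_000)` is roughly 667 calls per day, and 6,000 calls by day 10 projects to 18,000 for the month, which is past the 80% trigger.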

---

## 7 · Phase 5 · Public v1.0 launch

### Objective

Public launch where Floom is a real consumer product with real creators, real users, real installs, real testimonials, real press, real revenue.

### What exists before this phase

All of Phases 1-4: landing, OSS core, cloud platform, OAuth integrations. Everything works end-to-end.

### Deliverables

1. **Real creators onboarded** (not seeded): minimum 10 real creators with real apps, not Federico's own. Target list: Jannik (Tuyo), Gourav (OpenPaper), 2-3 from f.inc cohort, 3-5 from YC batch mates, 2-3 from vibecoder Twitter community.
2. **Real testimonials**: 5 filmed testimonials (30-60s each), 10 written quotes
3. **GitHub launch post**: `floomhq/floom-monorepo` goes public, pinned README, stars > 500 before launch day via soft pre-launch
4. **Show HN post**: scheduled for Tuesday 9am PT, drafted + reviewed, co-founder alert to Gourav + Jannik to upvote early
5. **LinkedIn announcement**: Federico's personal account (300K impressions baseline), 3-slide carousel, video, and follow-up thread
6. **Twitter/X thread**: 12-tweet thread with screenshots and video, scheduled Tuesday 9am PT synchronized with HN post
7. **f.inc demo day**: 3-minute live demo of "install an app, it works, here's the platform"
8. **Press outreach**: targeted list of 20 journalists at TechCrunch, The Information, Stratechery, Latent Space, Lenny's Newsletter, Ben's Bites, Bankless. Personal emails (not PR agency).
9. **SEO-optimized docs**: 30+ pages, all indexed, sitemap, internal linking, keyword targets (AI app store, MCP production, vibecoder deploy, open source agent platform)
10. **Onboarding video**: 3-minute product tour filmed in Screen Studio
11. **Case studies**: 3 creators with full write-ups (before Floom, after Floom, metrics)
12. **Trust signals on landing**: GitHub star count widget, "Built by YC f.inc" badge (pending f.inc approval), real logos for beta creators
13. **Monitoring**: Sentry, PostHog, Vercel Analytics, Datadog, Better Stack for uptime, OpsGenie for on-call
14. **Support**: Plain inbox, docs search (Algolia), community Discord, help@floom.dev

### Tech stack decisions

| Choice | Alternative weighed | Why |
|---|---|---|
| **Plain** (support) | Intercom, HelpScout, Zendesk | Developer-friendly, API-first, the waitlist already flows here |
| **Discord** (community) | Slack, Circle, Mighty Networks | Discord is where vibecoders live, free, infinite channels |
| **Better Stack** (uptime) | Pingdom, UptimeRobot, StatusCake | Free tier, pretty status page, Slack alerts |
| **OpsGenie** (on-call) | PagerDuty, Squadcast | Atlassian stack, free for 5 users, phone call escalation |
| **Algolia DocSearch** (docs search) | Typesense, Meilisearch | Free for open-source docs, instant search, fast to wire |

### Week-by-week build steps

**Realistic estimate: 1.5 calendar weeks, 40 focused hours**.

**Week 1 — Content + trust signals + operations**

Day 1: Real creator onboarding blitz. Federico DMs 15 people from his network. Target: 10 committed creators with apps ready to list.

Day 2: Film 5 testimonials (via Riverside or Loom). Write 10 text quotes.

Day 3: Shoot 3-minute onboarding video in Screen Studio: open `floom.dev` → browse store → install an app → run it → see dashboard → rollback a version.

Day 4: Write 3 case studies (Jannik / Gourav / one f.inc creator). Each 800 words + 4 screenshots.

Day 5: SEO audit all docs pages, fix Lighthouse SEO warnings, submit sitemap to Google Search Console + Bing Webmaster Tools.

Day 6: Wire monitoring. Sentry + PostHog + Vercel Analytics + Datadog + Better Stack. Verify an alert fires to OpsGenie on 5xx.

Day 7: Create Discord server, 5 channels (#welcome, #showcase, #help, #building, #bugs), seed with 20 invites to friends.

**Week 2 — Launch day mechanics**

Day 8: Draft + review Show HN post, LinkedIn carousel, Twitter thread, press emails. Send to 3 reviewers (Gourav, Jannik, one marketer friend) for critique.

Day 9: Pre-launch warmup. Soft-publish the GitHub repo (`public` but unannounced). Post a teaser on personal LinkedIn ("Tomorrow I'm launching the thing I've been building"). 15 hours before HN post, notify friends to star the repo.

Day 10: **Launch day**. 9am PT: post on HN, publish LinkedIn carousel, tweet the thread, send 20 press emails. 10am PT: respond to every HN comment for the first 4 hours (highest ranking opportunity). 12pm PT: post in Discord, in f.inc Slack, in YC W26 group chat. 2pm PT: LinkedIn post update with screenshots of engagement. 4pm PT: retrospective + next-day plan.

Day 11: Incident response. Expect something to break. Have rollback plan ready. Federico + one backup (Gourav or Jannik) on-call for 24h.

Day 12: Post-launch: thank-you email to every waitlist signup, first-week retention metrics in PostHog, list of top 5 feedback themes, plan for v1.1.

### Testing protocol

**Technical**

- Uptime > 99.9% during launch day (Better Stack)
- No critical Sentry errors in first 24h
- p95 latency < 1s globally (Vercel + Cloud Run)
- Docs search returns relevant results for top 20 queries
- Every link on landing verified (lychee)
- Social share previews (OG images) tested on Twitter, LinkedIn, Telegram, WhatsApp, Slack, Discord

**Product QA**

- 10 real creators can each onboard from scratch in <10 minutes (measured)
- 5 users (recruited from waitlist) each successfully install and run one app in <5 minutes
- Onboarding video retention > 60% (PostHog session recording)
- Waitlist → signup conversion > 20% in first 7 days

**Unhappy paths**

- Traffic spike 100x baseline: Vercel + Cloud Run handle it via autoscale
- Top HN comment is "this is just n8n/Langfuse/X": Federico has a prepared counter in CLAUDE.md-style debate notes
- A creator's app fails publicly during launch: rollback feature works, Federico publicly demonstrates it as a trust signal
- Sentry floods with a known bug: scripted auto-reply "We see it, fix shipping in 1h"
- Composio down during launch: fallback to the 5 apps that don't need OAuth
- Postgres connection pool exhausted: Neon has autoscale, Inngest absorbs spikes

**Security**

- Final dependency audit: `pnpm audit --audit-level=high` returns 0
- Secrets scan: `gitleaks detect --source . --log-opts="HEAD"` returns 0
- Rate limiting enforced: 100 req/min/IP on all public endpoints
- No admin endpoints exposed without auth
- CSP headers set to prevent XSS
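
The 100 req/min/IP rule above could be checked against something like the fixed-window counter below. This is a minimal in-memory sketch, not the implementation in `apps/server`; the class name and the caller-supplied clock are assumptions, and a multi-instance cloud deployment would need a shared store (the decision log names Upstash for that).

```typescript
// Fixed-window rate limiter sketch: at most `limit` requests per `windowMs` per key.
// In-memory only, so counts are per process. All names are hypothetical.
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, { windowStart: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Returns true if the request is allowed, false if it should receive a 429. */
  allow(key: string, nowMs: number): boolean {
    const entry = this.windows.get(key);
    // No window yet, or the previous window has expired: start a fresh one.
    if (entry === undefined || nowMs - entry.windowStart >= this.windowMs) {
      this.windows.set(key, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (entry.count < this.limit) {
      entry.count += 1;
      return true;
    }
    return false;
  }
}
```

A fixed window is the simplest option and can admit short bursts at window boundaries; a sliding window or token bucket would smooth that at the cost of more state.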

### ICP walkthrough

**Maria**: Launch day, she finds Floom via Ben's Bites or a Twitter thread, lands on `floom.dev`, sees the clean pitch, clicks into OpenPaper, signs up with Google, installs, runs a real query, gets results, tweets about it. **The test**: can she do this entire flow in 5 minutes without confusion?

**Jannik**: Launch day, he's featured as one of the 10 real creators on the front page. His app tuyo.jannik.dev sees 100 installs in the first 24h, $50 in first-day revenue via Stripe Connect. **The test**: is the creator share of revenue obvious and trustworthy?

**Yash**: Launch day, developer audience finds the GitHub repo, stars it (target > 2000 stars week 1), clones and runs the Docker image, wires it into Claude Desktop, tweets about the MCP integration. **The test**: is the open-source story clear and well-documented?

### Success criteria

- 100 real creators onboarded in first 30 days
- 10 paying creators (Cloud Pro or Cloud Team)
- 1,000 real end users
- 10,000 total app installs (the sum across all creators)
- 0 P0 incidents in first 30 days
- GitHub stars > 2,000 week 1
- Show HN: top 10 on launch day
- LinkedIn announcement: > 300K impressions
- Press: 3+ articles written

### Time estimate

**1.5 calendar weeks, 40 focused hours** for launch prep, then 2 weeks of high-attention incident response post-launch.

### Key risks for this phase

1. **Bad launch timing**: launching during a slow news week is gold; launching the day of an OpenAI event is death. Check tech calendar.
2. **Creator supply doesn't materialize**: if Federico can't get 10 real creators, launch is premature. Hard gate.
3. **Composio outage during launch**: most demos depend on OAuth. Mitigation: have 3 demo apps that do not require Composio, ready as fallback.
4. **HN downvoted**: risk if the title is wrong. Mitigation: draft 5 title candidates, A/B test via friends.
5. **Burnout immediately post-launch**: Federico is a solo founder, already stretched, f.inc programming starts in SF. Mitigation: schedule one recovery week post-launch, no new features.

---

## 8 · Cross-cutting: testing framework

The single testing stack used across every phase. Consistency matters more than coverage.

### Unit tests

- **Vitest** for every new function, target 80% line coverage on `services/*.ts`, 100% on middleware
- **Property-based testing** via `fast-check` for parser + manifest validation (untrusted input)
- Run in CI on every PR; block merge on failure

```bash
pnpm --filter @floom/server test
pnpm --filter @floom/server test:coverage
```

### Integration tests

- Every Hono API endpoint tested with `supertest`-style calls via Hono's built-in `app.request()` or the `testClient` helper from `hono/testing`
- DB reset between tests (drop + recreate SQLite in-memory)
- Auth flows tested with real Better Auth instance, not mocks

### E2E tests

- **Playwright** for full flows
- Headless Chromium by default, one weekly run in Firefox + WebKit
- Run on every PR against a Docker Compose stack (server + web + a mock OAuth provider)
- Visual regression via Playwright screenshots (pixel-perfect diffs)

### Product QA

A living **QA checklist** per ICP per phase, stored in `/tmp/floom-wireframes/QA-CHECKLIST.md`. Each row has a status (Untested / Passed / Failed). Federico walks through it personally before declaring a phase done.

Structure (for each phase):

| ICP | Flow | Steps | Expected | Actual | Status |
|---|---|---|---|---|---|
| Maria | Install an app | Open → Browse → Install → Run | Result appears in 10s | | |

20 questions per ICP × 3 ICPs = 60 rows per phase; 60 rows × 5 phases = 300 rows total.

### Accessibility

- **WCAG 2.2 AA minimum** enforced via `axe-core` in Playwright tests
- Keyboard navigation: every interactive element reachable via Tab
- Screen reader: VoiceOver walkthrough on every screen before phase ship
- Color contrast: 4.5:1 for text, 3:1 for UI elements (Floom palette passes because of the high-contrast `--text` on `--bg`)
- Reduced motion: `prefers-reduced-motion` respected on animations
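
The contrast thresholds above can be verified numerically. The sketch below follows the WCAG 2.x relative-luminance and contrast-ratio definitions; the function names are illustrative, and the colors in the usage note are placeholders, not the actual values of `--text` and `--bg`.

```typescript
// WCAG 2.x contrast ratio between two sRGB colors written as "#rrggbb".
function relativeLuminance(hex: string): number {
  const match = /^#?([0-9a-f]{6})$/i.exec(hex);
  if (match === null) throw new Error(`expected #rrggbb, got ${hex}`);
  const value = parseInt(match[1], 16);
  const channels = [(value >> 16) & 0xff, (value >> 8) & 0xff, value & 0xff];
  // Linearize each 8-bit channel as defined by WCAG.
  const [r, g, b] = channels.map((c) => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

/** Ratio in the range 1..21; the argument order does not matter. */
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

/** AA thresholds from the list above: 4.5:1 for text, 3:1 for UI elements. */
function passesAA(foreground: string, background: string, kind: "text" | "ui"): boolean {
  return contrastRatio(foreground, background) >= (kind === "text" ? 4.5 : 3);
}
```

For example, `passesAA("#767676", "#ffffff", "text")` holds, while the slightly lighter `#777777` on white falls just short of 4.5:1 for text but still clears the 3:1 bar for UI elements.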

### Security

- **Static analysis**: `eslint-plugin-security`, `semgrep` with OWASP rules
- **Dependency audit**: `pnpm audit --audit-level=high` on every PR
- **Secret scanning**: `gitleaks detect --source . --log-opts="HEAD"` on every push (already in Federico's pre-push hook)
- **Manual review** of every auth code change by Federico + one reviewer (Gourav or Jannik)
- **Penetration test**: third-party, before Phase 3 cloud launch, $5k budget
- **Security headers**: CSP, HSTS, X-Content-Type-Options, X-Frame-Options, Referrer-Policy, Permissions-Policy
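
A baseline for the six headers in the last bullet might look like the sketch below. The header names are the ones listed above; the specific policy values are illustrative assumptions (the CSP in particular would need tuning per route), and both function names are hypothetical.

```typescript
// Baseline values for the six security headers listed above.
// The policies are placeholders to be tightened or relaxed per route.
function securityHeaders(): Record<string, string> {
  return {
    "Content-Security-Policy": "default-src 'self'; frame-ancestors 'none'",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
  };
}

/** Names of required headers missing from a response's header map (case-insensitive). */
function missingSecurityHeaders(responseHeaders: Record<string, string>): string[] {
  const present = new Set(Object.keys(responseHeaders).map((h) => h.toLowerCase()));
  return Object.keys(securityHeaders()).filter((h) => !present.has(h.toLowerCase()));
}
```

The second helper is the kind of check an integration test could run against every route's response, so a dropped header fails CI rather than being found on launch day.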

### Performance

- **Lighthouse**: >= 90 on every landing page route
- **Core Web Vitals**: LCP < 2.5s, INP < 200ms, CLS < 0.1
- **Backend p95**: `/api/hub` < 200ms, `/api/:slug/run` dispatch < 500ms, `/mcp/app/:slug/tools/list` < 100ms
- **Error rate**: < 1% of requests
- **Uptime**: 99.9% post-launch
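
Two of the targets above reduce to simple arithmetic: a p95 over latency samples and the downtime allowed by an uptime percentage. The sketch below uses the nearest-rank percentile definition, which is one of several in common use and may differ slightly from what a monitoring vendor reports; the function names are illustrative.

```typescript
/** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

/** Seconds of downtime permitted by an uptime target (e.g. 0.999) over a period. */
function downtimeBudgetSeconds(uptimeTarget: number, periodDays: number): number {
  return (1 - uptimeTarget) * periodDays * 24 * 60 * 60;
}
```

Under these definitions a 99.9% target allows about 86 seconds of downtime in a single day and about 43 minutes over 30 days, which is why one bad deploy can consume the whole budget.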

### Chaos

Scheduled chaos runs (once per phase):

| Test | Expected |
|---|---|
| Kill runner container mid-run | Run marked `crashed`, no DB corruption, new container boots |
| Fill disk to 95% | New runs rejected with clear error, existing runs don't corrupt |
| DNS failure to `ghcr.io` | Boot from cached image |
| Postgres connection pool exhausted (cloud) | Retry with backoff, temporary 503 to clients, Neon autoscale absorbs the load |
| Composio API 503 (Phase 4+) | Circuit breaker, fallback, user-visible "temporarily unavailable" |
| Inngest delay > 5 minutes | Fallback to in-process retry, alert fires |
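
The "circuit breaker, fallback" expectation in the Composio row can be sketched as a small state machine. This is an assumption about how it might be built, not the existing code; the class name, the thresholds, and the caller-supplied clock are all hypothetical. The caller wraps each upstream request: ask `canRequest`, make the call, then report the outcome.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Minimal circuit breaker: opens after N consecutive failures, then lets
// trial calls through once the cooldown has elapsed.
class CircuitBreaker {
  private state: BreakerState = "closed";
  private consecutiveFailures = 0;
  private openedAtMs = 0;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;

  constructor(failureThreshold: number, cooldownMs: number) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  /** Should the next upstream call be attempted at time nowMs? */
  canRequest(nowMs: number): boolean {
    if (this.state === "open" && nowMs - this.openedAtMs >= this.cooldownMs) {
      this.state = "half-open"; // cooldown elapsed: allow trial calls
    }
    return this.state !== "open";
  }

  /** Report a successful upstream call: close the breaker and reset the count. */
  recordSuccess(): void {
    this.state = "closed";
    this.consecutiveFailures = 0;
  }

  /** Report a failed upstream call (e.g. a 503) observed at time nowMs. */
  recordFailure(nowMs: number): void {
    this.consecutiveFailures += 1;
    if (this.state === "half-open" || this.consecutiveFailures >= this.failureThreshold) {
      this.state = "open";
      this.openedAtMs = nowMs;
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}
```

While `canRequest` returns false, the caller skips the upstream call and shows the "temporarily unavailable" message instead, which keeps a Composio outage from tying up request handlers.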

---

## 9 · Cross-cutting: ICP walkthrough checklist

For each phase, run this 20-question walkthrough per ICP before declaring the phase done.

### Maria (biz user, ops at a 30-person SaaS)

1. Can she find `floom.dev` via a search / referral / link?
2. Does the hero explain what Floom is in under 10 seconds?
3. Can she browse the store without signing up?
4. Can she click into an app and see screenshots + reviews + install CTA?
5. Does the "Production-safe · powered by Floom" badge reassure her?
6. Can she install an app in one click (Phase 2+)?
7. Can she run an app without reading docs (Phase 2+)?
8. Does the result render nicely (markdown, cards, files)?
9. Can she re-run with different inputs without friction?
10. Can she see her run history in `/me` (Phase 2+)?
11. Can she schedule an app to run daily (Phase 2+)?
12. Can she connect her Gmail / Slack / Sheets (Phase 4)?
13. Can she rate and review an app (Phase 2+)?
14. Can she report a bug that reaches the creator's inbox (Phase 2+)?
15. Can she join a shared workspace and see her team's installed apps (Phase 3)?
16. Does the SSO flow work with her Google Workspace account (Phase 3)?
17. Is her data encrypted + audited per her security team's requirements (Phase 3)?
18. Can she cancel her account and delete her data (GDPR)?
19. Does the mobile experience match desktop for reading + running?
20. If something breaks, does she get a clear error + way to contact support?

### Jannik (vibecoder, Cursor + Claude Code)

1. Does the creator CTA feel equal to the user CTA on the home page?
2. Can he see the 13 production-layer guarantees (auth, access, logs, etc.)?
3. Can he deploy an app from his GitHub repo in <5 minutes (Phase 2+)?
4. Does the deploy log stream live?
5. Can he see his app's URL + install command as soon as deploy finishes?
6. Can he configure auth (who can use it) in <2 minutes?
7. Can he generate an API key scoped to a single tool?
8. Can he set a rate limit that actually enforces?
9. Can he see activity logs filtered by caller + status?
10. Can he see usage charts (runs/day, success rate, avg duration)?
11. Can he schedule his app to run with a cron expression?
12. Can he hook up a webhook to Slack and verify delivery?
13. Can he add a custom domain (BYO DNS in OSS, managed in Cloud)?
14. Can he roll back to the previous version in 30 seconds?
15. Can he see his feedback inbox + respond to users (Phase 2+)?
16. Can he see reviews on his app + respond to them?
17. Can he list his app for a price and take payment (Phase 3 via Stripe Connect)?
18. Does advanced analytics (cohorts + funnels) help him understand users (Phase 3)?
19. Can he see audit logs per caller for compliance (Phase 3)?
20. Can he connect his app to Composio tools (Gmail, Notion) in 5 minutes (Phase 4)?

### Yash (developer via MCP)

1. Can he find the open-source repo on GitHub in 30 seconds?
2. Does the README explain self-host in <1 minute?
3. Can he `docker run` the image and get a working MCP endpoint?
4. Does Claude Desktop connect to `/mcp/app/:slug` successfully?
5. Can he list tools via MCP handshake?
6. Can he call a tool and get a result?
7. Does the MCP integration work with streaming output?
8. Can he generate an API key via CLI (Phase 2+)?
9. Can he scope the API key to specific tools?
10. Can he see audit logs per API key (Phase 2+)?
11. Can he configure per-tool rate limits?
12. Can he self-host with Postgres instead of SQLite (Phase 2+)?
13. Can he integrate Floom into Cursor, Cline, Claude Desktop, Continue.dev?
14. Can he write an app in Python 3.12 and deploy via Docker?
15. Can he write an app in Node 20 and deploy via Docker?
16. Can he write a proxied app (OpenAPI spec) in <5 minutes?
17. Does the MCP response schema match what Claude expects (stable `tools/list` format)?
18. Can he contribute a PR (contributing guide, code of conduct, issue templates)?
19. Can he read the protocol spec and understand the full surface?
20. Does the Apache 2.0 license let him ship to his employer (Phase 2 OSS) without legal review?

---

## 10 · Dependency graph

```
         Phase 1 (Landing + waitlist)
                    |
                    v
         Phase 2 (Docker OSS core)
                    |
                    v
         Phase 3 (Cloud multi-tenant wrap)
                    |
                    v
         Phase 4 (Composio OAuth)
                    |
                    v
         Phase 5 (Public v1.0 launch)
```

**Explicit dependency rules:**

- **Phase 1 has no dependencies.** Can ship tomorrow in principle. Blocked only by DNS ownership of `floom.dev`.
- **Phase 2 depends on Phase 1's waitlist** because launch-day demand must be captured before the product is ready. It also depends on the existing `floom-monorepo` v0.1.0 Docker image as the starting point.
- **Phase 3 depends on Phase 2's Docker core** because the cloud wraps the OSS primitives. If Phase 2's auth, access, schedules, webhooks aren't production-ready, cloud just duplicates broken features.
- **Phase 4 depends on Phase 3's cloud** because Composio is Cloud-only (hosted, paid). OSS users get Nango as an alternative, documented in `docs/self-host-oauth.md`.
- **Phase 5 depends on all prior phases**. Launch without real creators (Phase 2+) or real users (Phase 3+) or working OAuth (Phase 4) is a half-launch that burns the narrative.
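
The rules above form a small directed acyclic graph, and a valid build order is any topological sort of it. The sketch below encodes the five rules as data and derives the order with a depth-first search; the names are illustrative, and the existing Docker image dependency of Phase 2 is left out because it is not a phase.

```typescript
// Phase dependency graph from the rules above: each phase lists its prerequisites.
const phaseDependencies: Record<string, string[]> = {
  "Phase 1": [],
  "Phase 2": ["Phase 1"],
  "Phase 3": ["Phase 2"],
  "Phase 4": ["Phase 3"],
  "Phase 5": ["Phase 1", "Phase 2", "Phase 3", "Phase 4"],
};

/** Depth-first topological sort; throws on a cycle or an unknown prerequisite. */
function buildOrder(deps: Record<string, string[]>): string[] {
  const order: string[] = [];
  const visitState = new Map<string, "visiting" | "done">();

  const visit = (node: string): void => {
    const seen = visitState.get(node);
    if (seen === "done") return;
    if (seen === "visiting") throw new Error(`dependency cycle at ${node}`);
    const prerequisites = deps[node];
    if (prerequisites === undefined) throw new Error(`unknown phase ${node}`);
    visitState.set(node, "visiting");
    for (const prerequisite of prerequisites) visit(prerequisite);
    visitState.set(node, "done");
    order.push(node); // emitted only after all prerequisites
  };

  for (const node of Object.keys(deps)) visit(node);
  return order;
}
```

Because the graph is a single chain, the only valid order is Phase 1 through Phase 5; the value of encoding it is that adding a new edge that creates a cycle fails loudly.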

**Parallelizable work within phases:**
- Phase 1: frontend (Day 1-6) and backend wiring (Day 7) are parallel if a second engineer joins
- Phase 2: schedules/webhooks (week 5) can be pulled forward and built in parallel by a second engineer, but only once auth (week 1) has landed, since auth blocks every other Phase 2 feature
- Phase 3: SSO and billing are parallelizable; avoid if solo
- Phase 4: all integrations after the first 3 are parallelizable across sub-tasks

**Non-parallelizable blockers:**
- Auth (Phase 2 week 1) blocks all subsequent Phase 2 features
- Drizzle migration (Phase 2 week 1 day 2) blocks all schema changes
- Cloudflare SSL-for-SaaS (Phase 3 week 3) blocks custom domain feature
- Stripe Connect onboarding (Phase 3 week 5) blocks billing launch

---

## 11 · Critical risks + mitigation

| # | Risk | Impact (1-5) | Likelihood (1-5) | Mitigation |
|---|---|---|---|---|
| 1 | **Engineering capacity**: solo founder, f.inc + SF move eating calendar | 5 | 5 | Phase 2 is the dominant block, protect it with a `WORKPLAN-FLOOM-PHASE2.md`, refuse OpenPaper/FlyFast feature work during weeks 4-12, Ralph-loop the hard parts |
| 2 | **Vercel Workflow or similar eats the creator side**: durable workflows become commoditized | 4 | 3 | Floom's moat is the install layer + multi-tenant store, NOT the runtime. Double down on the store / distribution story, treat the runtime as interchangeable |
| 3 | **Composio changes pricing or deprecates** | 3 | 3 | Keep the Composio wrapper thin (one `services/composio.ts` file), document a Nango-based fallback, plan to migrate within 1 week if forced |
| 4 | **OpenAPI spec is too narrow for real AI apps** (agents want state, streaming, long-running tools) | 4 | 2 | Support native manifests (v2.0 multi-action format) alongside OpenAPI, the existing runtime already handles both |
| 5 | **Supabase / Neon scaling at Cloud phase** | 3 | 2 | Pick Neon for Postgres (serverless scales to zero, branching for previews), fall back to managed RDS if needed. Avoid Supabase at the data layer (keep it for waitlist only) |
| 6 | **Custom domains SSL failures** (Phase 3) | 4 | 3 | Cloudflare SSL-for-SaaS is the mitigation, but monitor cert provisioning, have a manual override, on-call paging for cert expirations |
| 7 | **Creator supply doesn't materialize** (< 10 real creators at Phase 5 launch) | 5 | 4 | Start building creator pipeline in Phase 2, DM 3 creators per week, pay $500 to the first 10 creators to port their apps, create a "founding creator" program |
| 8 | **User demand doesn't materialize** (cold-start marketplace problem) | 5 | 3 | Focus Phase 5 launch on creator narrative (not user narrative), let users pull in organically, use Federico's 300K LinkedIn baseline as the demand engine |
| 9 | **Security incident in early days** (a creator's app leaks data via Floom) | 5 | 2 | Security review in Phase 2 week 8, pentest in Phase 3 week 6, bug bounty open from Phase 5, incident response runbook ready, insurance policy active |
| 10 | **Founder burnout / depression during 4-month build** | 5 | 4 | One full recovery week between Phase 2 and Phase 3, no-meeting days twice per week, one day off per weekend minimum. Federico has a history of burnout post-SCAILE, so this risk is real |
| 11 | **`floom.dev` domain ownership disputed** with Vlad's version | 4 | 3 | Clarify with Vlad before Phase 1 Day 10, fall back to `app.floom.dev` or `run.floom.dev` if needed |
| 12 | **Better Auth instability** (recently absorbed Auth.js) | 3 | 2 | Pin version, read changelog weekly, have a Lucia-style-fork backup plan documented |
| 13 | **Docker image size grows past 1GB** | 2 | 3 | Multi-stage build, Alpine base, prune dev deps, audit weekly |
| 14 | **SOC 2 scope creep** delays cloud launch | 4 | 3 | Type 1 only for v1.0, Type 2 post-launch, accept enterprise customers can wait |
| 15 | **f.inc programming eats Phase 3 entirely** | 4 | 4 | Negotiate a part-time schedule with f.inc, or accept Phase 3 slips to June-July 2026 |

Top 3 existential risks: **capacity (#1)**, **creator supply (#7)**, **burnout (#10)**. These kill the roadmap more than any technical decision.
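
One way to sanity-check that ranking is to score each row as impact × likelihood. The roadmap does not say how the top three were chosen, so the product score is an assumption; the impact and likelihood values are copied from the table, the labels are shortened, and ties keep table order.

```typescript
interface Risk {
  id: number;
  label: string;
  impact: number;
  likelihood: number;
}

// Impact and likelihood copied from the risk table; labels abbreviated.
const risks: Risk[] = [
  { id: 1, label: "Engineering capacity", impact: 5, likelihood: 5 },
  { id: 2, label: "Runtime commoditized", impact: 4, likelihood: 3 },
  { id: 3, label: "Composio pricing/deprecation", impact: 3, likelihood: 3 },
  { id: 4, label: "OpenAPI too narrow", impact: 4, likelihood: 2 },
  { id: 5, label: "Postgres scaling", impact: 3, likelihood: 2 },
  { id: 6, label: "Custom domain SSL failures", impact: 4, likelihood: 3 },
  { id: 7, label: "Creator supply", impact: 5, likelihood: 4 },
  { id: 8, label: "User demand", impact: 5, likelihood: 3 },
  { id: 9, label: "Security incident", impact: 5, likelihood: 2 },
  { id: 10, label: "Founder burnout", impact: 5, likelihood: 4 },
  { id: 11, label: "Domain ownership", impact: 4, likelihood: 3 },
  { id: 12, label: "Better Auth instability", impact: 3, likelihood: 2 },
  { id: 13, label: "Docker image size", impact: 2, likelihood: 3 },
  { id: 14, label: "SOC 2 scope creep", impact: 4, likelihood: 3 },
  { id: 15, label: "f.inc eats Phase 3", impact: 4, likelihood: 4 },
];

/** Simple product score (assumed, not stated in the roadmap). */
function riskScore(risk: Risk): number {
  return risk.impact * risk.likelihood;
}

/** Highest-scoring risks first; Array.prototype.sort is stable, so ties keep table order. */
function topRisks(all: Risk[], count: number): Risk[] {
  return [...all].sort((a, b) => riskScore(b) - riskScore(a)).slice(0, count);
}
```

Under this scoring, `topRisks(risks, 3)` returns risks 1, 7, and 10, which matches the three called out above; risk 15 is the next highest.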

---

## 12 · Decision log (open questions per phase)

### Before Phase 1

| # | Decision | Options | Recommended | Deadline |
|---|---|---|---|---|
| 1.1 | Which domain for landing | `floom.dev` (Vlad conflict), `app.floom.dev`, new domain | Clarify with Vlad; if stuck, use `app.floom.dev` | Day 1 |
| 1.2 | Waitlist DB: Supabase vs Airtable vs Plain | Supabase (recommended), Airtable, Plain | Supabase (reusable later) | Day 6 |
| 1.3 | Analytics stack: PostHog vs Mixpanel vs Vercel only | PostHog (free, OSS, recommended) | PostHog + Vercel Analytics | Day 8 |
| 1.4 | Docs framework: Nextra vs next-mdx-remote vs Mintlify | next-mdx-remote (lightest), Nextra (richest), Mintlify (hosted) | next-mdx-remote | Day 5 |
| 1.5 | Hero copy: agent-first vs production-first | "Production layer for vibe-coded AI apps" vs "World's first app store for agents" | Run both via the positioning + launch narrative memos | Day 2 |

### Before Phase 2

| # | Decision | Options | Recommended | Deadline |
|---|---|---|---|---|
| 2.1 | Auth library: Better Auth vs Auth.js v5 vs Lucia | Better Auth (only viable option) | Better Auth | Phase 2 Day 1 |
| 2.2 | ORM: Drizzle vs Prisma vs raw SQL | Drizzle (best fit with Hono + Better Auth) | Drizzle | Phase 2 Day 1 |
| 2.3 | Schedules engine: node-cron vs BullMQ vs Inngest | node-cron (OSS), Inngest (Cloud only) | Both, via driver | Phase 2 Week 5 |
| 2.4 | Email: Resend vs Postmark vs BYO SMTP | Resend (cloud) + Nodemailer (self-host) | Both | Phase 2 Week 7 |
| 2.5 | Migrations: drizzle-kit vs hand-rolled | drizzle-kit | drizzle-kit | Phase 2 Day 2 |
| 2.6 | Testing framework: Vitest vs Jest | Vitest (already in repo, faster) | Vitest | Phase 2 Day 1 |
| 2.7 | E2E: Playwright vs Cypress | Playwright (standard) | Playwright | Phase 2 Day 10 |
| 2.8 | OSS license | MIT vs Apache 2.0 | Apache 2.0 (patent grant, enterprise-friendly) — note: current README says MIT, change before Phase 2 public cut | Before Phase 2 launch |

### Before Phase 3

| # | Decision | Options | Recommended | Deadline |
|---|---|---|---|---|
| 3.1 | Hosting: Vercel + GCP Cloud Run vs Fly.io everything vs Railway | Vercel + Cloud Run (matches OpenPaper) | Vercel + Cloud Run | Phase 3 Day 1 |
| 3.2 | Managed Postgres: Neon vs Supabase vs RDS | Neon (branching, serverless) | Neon | Phase 3 Day 1 |
| 3.3 | Workflow engine: Inngest vs Trigger.dev vs Temporal | Inngest (Next.js-native, free tier) | Inngest | Phase 3 Week 5 |
| 3.4 | SSO: which first? Google Workspace vs Okta vs Microsoft | Google Workspace (easiest, dogfood) | Google Workspace | Phase 3 Week 2 |
| 3.5 | Managed domains: Cloudflare SSL-for-SaaS vs Let's Encrypt + Caddy | Cloudflare (pay-as-you-go) | Cloudflare | Phase 3 Week 3 |
| 3.6 | Billing: Stripe vs Paddle vs LemonSqueezy | Stripe (muscle memory) | Stripe | Phase 3 Week 5 |
| 3.7 | Rate limiter store: Upstash Redis vs in-memory | Upstash (multi-instance cloud) | Upstash | Phase 3 Week 1 |
| 3.8 | SOC 2: Type 1 only vs Type 2 before launch | Type 1 only | Type 1 only | Phase 3 Week 6 |
| 3.9 | Third-party pentest vendor | Trail of Bits, Doyensec, Hacken, freelance via Cobalt.io | Cobalt.io or solo consultant at $5k | Phase 3 Week 6 |

### Before Phase 4

| # | Decision | Options | Recommended | Deadline |
|---|---|---|---|---|
| 4.1 | OAuth platform: Composio vs Nango vs Pipedream | Composio (Cloud), Nango (OSS fallback) | Both, via driver | Phase 4 Day 1 |
| 4.2 | Composio pricing tier | Free (20K calls), Ridiculously Cheap ($29), Serious ($229) | Start at Free, upgrade to $29 at 80% | Phase 4 Day 1 |
| 4.3 | Integration priority order | Gmail first, Notion next? | Gmail + Slack + Sheets + Notion + Stripe first | Phase 4 Day 1 |
| 4.4 | White-label OAuth screens | Composio Enterprise ($$$) vs accept Composio branding | Accept Composio branding for v1.0 | Phase 4 Day 1 |

### Before Phase 5

| # | Decision | Options | Recommended | Deadline |
|---|---|---|---|---|
| 5.1 | Launch day: Show HN vs Product Hunt vs LinkedIn first | Show HN morning + LinkedIn same day | Show HN first | Phase 5 Week 1 |
| 5.2 | Press strategy: PR agency vs solo outreach | Solo outreach (Federico's network) | Solo outreach | Phase 5 Week 1 |
| 5.3 | Community platform: Discord vs Slack vs Circle | Discord (vibecoder audience) | Discord | Phase 5 Week 1 |
| 5.4 | Support inbox: Plain vs Intercom vs email | Plain (already used for waitlist) | Plain | Phase 5 Week 1 |
| 5.5 | Launch price (Cloud Pro tier) | $9, $19, $29, $49 | $29/month (matches Composio, matches Rocketlist-tier SaaS) | Phase 5 Week 1 |
| 5.6 | Paid creator program: bounty per listing? | $500 bounty for first 10 creators | Yes, $5k total budget | Phase 5 Week 1 |
| 5.7 | YC / f.inc demo day alignment | Launch before, during, or after demo day | Align with f.inc demo day for press leverage | Phase 5 Week 1 |

---

## 13 · Launch-day checklist

50+ items. Federico works through this from T-minus 7 days to T+7 and does a final pass the day before launch. Every line is either ☑ or blocked.

### T-minus 7 days

- [ ] All 17 landing page routes Lighthouse >= 90
- [ ] All Playwright E2E tests green on main
- [ ] All Vitest unit + integration tests green on main
- [ ] All Docker image tests green (Ubuntu, Debian, Alpine)
- [ ] `ghcr.io/floomhq/floom-monorepo:v1.0.0` tagged and pushed
- [ ] Neon Postgres prod instance provisioned, backup verified
- [ ] Supabase waitlist project frozen (read-only after launch)
- [ ] Vercel production deployment verified with real DNS
- [ ] Cloud Run backend deployment verified with real DNS
- [ ] SSL certs verified on all domains (`openssl s_client`)
- [ ] Upstash Redis prod instance provisioned, quotas set
- [ ] Sentry project configured, release tracking enabled
- [ ] PostHog project configured, cohorts defined
- [ ] Vercel Analytics dashboards configured
- [ ] Datadog monitors configured for Cloud Run + Neon
- [ ] Better Stack status page public at `status.floom.dev`
- [ ] OpsGenie on-call rotation set (Federico primary, Gourav backup)

### T-minus 3 days

- [ ] 10 real creators confirmed + onboarded
- [ ] 10 creator apps live and tested on production
- [ ] 5 testimonial videos filmed + edited
- [ ] 3 case study pages published
- [ ] Onboarding video embedded on landing
- [ ] Show HN post draft reviewed by 3 people
- [ ] LinkedIn post + carousel reviewed
- [ ] Twitter thread + visuals reviewed
- [ ] Press emails drafted (20 journalists)
- [ ] Demo video uploaded + linked
- [ ] Pricing page final + Stripe products created
- [ ] Stripe webhooks tested end-to-end
- [ ] Composio free tier usage < 30% (buffer for launch spike)
- [ ] All 15 Composio integrations tested end-to-end
- [ ] `docs/SELF_HOST.md` verified on a fresh Ubuntu box

### T-minus 1 day

- [ ] `gitleaks detect` clean
- [ ] `pnpm audit --audit-level=high` clean
- [ ] Final dependency audit via Socket.dev
- [ ] Rollback plan documented (`docs/ROLLBACK.md`)
- [ ] Incident response runbook reviewed
- [ ] Security headers verified (CSP, HSTS, etc.) via `securityheaders.com`
- [ ] Rate limiters verified under load
- [ ] Uptime checks pinging every 30s from 5 regions
- [ ] On-call phone ringer tested (OpsGenie call routing)
- [ ] Federico's personal phone charger + backup Mac near desk
- [ ] Gourav / Jannik pinged and confirmed as on-call backup
- [ ] Coffee, snacks, water stocked (real, this matters for a 14-hour launch day)
- [ ] Sleep 8 hours the night before

### Launch day (T+0)

**0600 PT**: Federico wakes up, checks uptime + Sentry, coffee

**0800 PT**: Pre-launch ping to friends, ask them to be ready to upvote at 0900 PT

**0900 PT**: Post on Show HN, publish LinkedIn carousel, tweet Twitter thread, send 20 press emails

**0905 PT**: Ping Discord + Slack + f.inc + YC groups

**0915 PT**: Monitor HN position, respond to first comments with care

**1000-1400 PT**: Respond to every HN comment within 15 minutes. Respond to every Twitter reply. Respond to every LinkedIn comment. This is THE window for HN ranking.

**1400 PT**: LinkedIn update post with launch metrics (HN position, stars, signups)

**1600 PT**: Check press: any journalist replies? Follow up with any who showed interest.

**1800 PT**: First daily retro: what worked, what broke, what's next.

**2000 PT**: Sleep 8 hours, do NOT stay up refreshing HN.

### T+1, T+2, T+7

- [ ] Respond to every new signup personally within 24h for first 3 days
- [ ] Publish "Day 1 retro" post on LinkedIn
- [ ] Track Sentry for P0 bugs, ship hotfixes within 4h
- [ ] Daily standup with self: what's the #1 thing today
- [ ] Track metrics: signups, activations, runs, GitHub stars
- [ ] Week-1 retro post + v1.1 roadmap announcement

---

## 14 · Appendix: tech stack rationale

Full defense of every stack choice, in one place, for when Federico (or a future hire) asks "why this and not that."

### Landing (Phase 1)

| Choice | Alternatives | 3-sentence defense |
|---|---|---|
| Next.js 15 App Router | Astro, Remix, pure HTML | Federico ships fastest in Next.js because of muscle memory and the Vercel integration; App Router handles the 17 static routes with ISR for free; shadcn is first-class in Next.js. |
| Tailwind 4 | Tailwind 3, vanilla CSS, CSS modules | v4 is GA, faster builds via Lightning CSS, simpler config, and the v8 wireframe already uses Tailwind utilities. |
| shadcn/ui | Radix bare, HeadlessUI, DaisyUI | shadcn gives ownership of components as copy-pasted source, which matters because we'll customize aggressively to match the Floom design tokens. |
| Supabase (waitlist) | Airtable, Google Sheets, Plain | SQL beats spreadsheets for segmentation later, Federico already uses Supabase, free tier is generous, and we can reuse the project for Phase 2 auth if we pivot. |
| Vercel hosting | Cloudflare Pages, Netlify, AX41 | Preview deploys per PR are free, edge routing is global, Federico has the CLI workflow memorized. |

### OSS core (Phase 2)

| Choice | Alternatives | 3-sentence defense |
|---|---|---|
| Better Auth | Auth.js v5 (merged into Better Auth), Lucia (deprecated), Clerk (hosted-only), WorkOS (enterprise price) | Only library that supports SQLite + Postgres, email + OAuth + magic link + SAML + passkeys, Hono + Next.js adapters, and self-hosting without a SaaS lock-in. Auth.js merged into Better Auth as of 2026, which means the community is converging. |
| Drizzle ORM | Prisma, Kysely, raw better-sqlite3 | Best Hono + SQLite + Postgres story, TypeScript-native schema, readable SQL migrations, and Federico already uses it in Rocketlist. Prisma bloats Docker images with its separate client generation step. |
| node-cron (OSS schedules) | BullMQ + Redis, Temporal, Inngest | Adding Redis breaks the "single docker run" promise; node-cron runs in-process, needs no store beyond the existing SQLite database, and fires runs via the existing runner. For Cloud we swap to Inngest via a driver. |
| Zod | Yup, Valibot, io-ts | Already in the repo, Hono native, Better Auth uses it, Drizzle-zod bridges schema to validators. |
| Vitest | Jest, Node test runner | Fastest, pnpm-native, ESM-native, Federico already uses it. |
| Playwright | Cypress | Federico's default, supports Hono + Vite + Docker runner flows, free and maintained by Microsoft. |
| Git-as-versions | Custom blob store, S3 snapshots | Apps deploy from Git, so versions ARE git SHAs; rollback is `git checkout`; zero new storage to build or back up. |
| drizzle-kit migrations | Hand-rolled SQL in db.ts | The current inline `db.exec()` pattern does not scale past 13 tables; drizzle-kit generates migrations from the schema, the generated SQL is committed to git, and pending migrations are applied on startup. |
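
The node-cron row says the Cloud build swaps to Inngest "via a driver." A minimal sketch of what that seam could look like, under stated assumptions: `ScheduleDriver`, `inProcessDriver`, `hostedDriver`, and `pickDriver` are hypothetical names, not identifiers from the Floom repo; the in-process driver uses `setInterval` as a stand-in for node-cron; and the hosted driver only records jobs where a real one would register them with Inngest.

```typescript
// Hypothetical sketch: none of these names come from the Floom codebase.
type RunFn = (appSlug: string) => void;

interface ScheduleDriver {
  name: string;
  // Registers a recurring run and returns a function that cancels it.
  schedule(appSlug: string, everyMs: number): () => void;
}

// OSS driver: the timer lives in the server process, so no Redis or
// external queue is needed. setInterval stands in for node-cron here.
function inProcessDriver(run: RunFn): ScheduleDriver {
  return {
    name: "in-process",
    schedule(appSlug: string, everyMs: number): () => void {
      const timer = setInterval(() => run(appSlug), everyMs);
      return () => clearInterval(timer);
    },
  };
}

// Cloud stand-in: records the jobs it would hand to a hosted scheduler.
function hostedDriver(registered: string[]): ScheduleDriver {
  return {
    name: "hosted",
    schedule(appSlug: string, _everyMs: number): () => void {
      registered.push(appSlug);
      return () => {
        const index = registered.indexOf(appSlug);
        if (index >= 0) registered.splice(index, 1);
      };
    },
  };
}

// Callers depend only on ScheduleDriver, so swapping modes is one argument.
function pickDriver(
  mode: "oss" | "cloud",
  run: RunFn,
  registered: string[] = [],
): ScheduleDriver {
  return mode === "cloud" ? hostedDriver(registered) : inProcessDriver(run);
}

const driver = pickDriver("oss", (slug) => console.log(`run ${slug}`));
const cancel = driver.schedule("daily-report", 60_000);
cancel(); // stop the timer so this sketch exits immediately
```

The point of the seam is that route handlers and the runner never import node-cron or Inngest directly, which keeps the "single docker run" image free of Cloud-only dependencies.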

### Cloud (Phase 3)

| Choice | Alternatives | 3-sentence defense |
|---|---|---|
| Vercel (frontend) | Cloudflare Pages, Netlify, self-host | Decided in Phase 1, no reason to switch. Preview deploys, edge routing, analytics all built in. |
| GCP Cloud Run (backend) | AWS Lambda, Fly.io, K8s | Matches OpenPaper's deployment pattern (same project `core-planet-486109-r5`), Federico has the gcloud workflow memorized, scales to zero, HTTPS + custom domains free. |
| Neon (Postgres) | Supabase Postgres, RDS, Crunchy | Branching support for per-PR preview DBs is a killer feature, serverless scales to zero, free tier 0.5GB. Supabase is fine but Neon's branching better aligns with Vercel preview deploys. |
| Cloudflare SSL-for-SaaS | Let's Encrypt + Caddy, AWS ACM | One API call provisions cert + DNS + edge routing, costs $0.05/cert/month, handles wildcard SSL automatically. Building this ourselves is at least 2 weeks of work we don't have. |
| Upstash Redis | Redis Cloud, ElastiCache | Serverless + REST API means no connection pool hell on Cloud Run, free tier 10k commands/day. |
| Stripe | Paddle, LemonSqueezy | Federico has it working in Rocketlist and OpenPaper, Stripe Connect handles the partner app billing for creators, muscle memory matters under launch pressure. |
| Inngest | BullMQ, Trigger.dev, Temporal | Next.js-native, durable against Cloud Run container cycling, generous free tier, works alongside the OSS node-cron driver. |
| PostHog | Mixpanel, Amplitude | Decided in Phase 1, free OSS tier, product analytics + session replay + feature flags in one product. |
| Sentry | Rollbar, Bugsnag | Industry default, Next.js integration is one click, release tracking is built in. |
| Plain | Intercom, HelpScout | Developer-friendly, API-first, Federico already uses it for waitlist, and support inbox transitions from waitlist naturally. |
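
The "one API call" in the Cloudflare SSL-for-SaaS row refers to creating a custom hostname on the zone. A sketch of how that request could be assembled, assuming Cloudflare's custom hostnames endpoint with HTTP-based domain validation; the helper name, zone ID, token, and hostname are placeholders, not values from this roadmap.

```typescript
// Hypothetical helper: builds the request without sending it, so the
// shape can be inspected or unit-tested offline.
interface CustomHostnameRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildCustomHostnameRequest(
  zoneId: string,
  apiToken: string,
  hostname: string,
): CustomHostnameRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/zones/${zoneId}/custom_hostnames`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    // "dv" requests a domain-validated certificate; "http" asks Cloudflare
    // to validate over HTTP once the creator's DNS points at the zone.
    body: JSON.stringify({ hostname, ssl: { method: "http", type: "dv" } }),
  };
}

const request = buildCustomHostnameRequest(
  "ZONE_ID_PLACEHOLDER",
  "API_TOKEN_PLACEHOLDER",
  "app.creator-domain.example",
);
console.log(request.url);
// To send it: fetch(request.url, { method: request.method,
//   headers: request.headers, body: request.body })
```

Keeping request construction separate from the network call makes it cheap to test the custom-domain flow before a real Cloudflare zone exists.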

### OAuth (Phase 4)

| Choice | Alternatives | 3-sentence defense |
|---|---|---|
| Composio | Nango, Pipedream Connect, Merge.dev, custom | 1000+ tools, TypeScript SDK, free tier 20K calls/month, managed OAuth + token refresh + rate limiting. Alternatives: Nango is open-source self-hostable and stays as OSS fallback; Pipedream is more workflow-focused; Merge targets HRIS/CRM. |
| Nango (OSS fallback) | — | For self-hosters who can't use Composio, Nango is the only OSS alternative with comparable integration coverage. Documented in `docs/self-host-oauth.md`. |

### Launch (Phase 5)

| Choice | Alternatives | 3-sentence defense |
|---|---|---|
| Discord | Slack, Circle | Where vibecoders live, infinite free channels, easy invite links. |
| Better Stack | Pingdom, UptimeRobot | Free tier, pretty status page, Slack alerts, public status page on subdomain. |
| Opsgenie | PagerDuty, Squadcast | Part of the Atlassian stack, free for 5 users, phone call escalation to Federico's number. |
| Algolia DocSearch | Typesense, Meilisearch | Free for open-source docs, instant search, about 2 hours to wire up. |
| Screen Studio | Loom, Camtasia | Mac-native, highest quality demo videos, automatic zoom + mouse trail, worth the one-time cost. |

---

## Closing notes

This roadmap is designed to ship. Each phase has a clean entry point and a clean exit criterion. Each phase has realistic time estimates that factor in solo-founder constraints (25 focused hours per week, ~35% calendar loss to meetings, investor chatter, f.inc programming, the SF move). Each phase has an ICP walkthrough template so Federico can QA without guessing.

If any phase slips by more than 50%, stop and re-plan. Do not push through. Re-read the workplan, reassess, and either pivot or flag it.

The hardest single block of work is Phase 2 (8 weeks, 200 hours of focused dev). If Phase 2 doesn't ship cleanly, Phase 3 has nothing to wrap. Protect Phase 2 with a dedicated workplan, no scope creep, and weekly check-ins against the week-by-week build steps in Section 4.
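
The hour budget and the 50% re-plan rule above reduce to simple arithmetic. A minimal sketch, assuming slip is measured in calendar weeks against the planned duration (the roadmap does not specify the unit); the function names are hypothetical.

```typescript
// From the roadmap: 25 focused dev hours per week.
const FOCUSED_HOURS_PER_WEEK = 25;

// Planned focused hours for a phase of the given length.
function plannedHours(weeks: number): number {
  return weeks * FOCUSED_HOURS_PER_WEEK;
}

// True when the projected duration exceeds the plan by more than 50%.
function shouldReplan(plannedWeeks: number, projectedWeeks: number): boolean {
  const slip = (projectedWeeks - plannedWeeks) / plannedWeeks;
  return slip > 0.5;
}

console.log(plannedHours(8)); // 200, the Phase 2 budget
console.log(shouldReplan(8, 12)); // false: exactly 50% over, not more
console.log(shouldReplan(8, 13)); // true: 62.5% over
```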

**The most important rule**: the goal is not perfect. The goal is shipped, used by real users, with real money changing hands. If Federico is iterating on wireframes in week 18, something has gone wrong. If Federico has 10 real creators and 100 real users by end of Phase 5, the roadmap succeeded regardless of what the product looks like.

Ship it.