GPUS-IT Priorities¶
Classification: CONFIDENTIAL — Internal Use Only
Document: `priorities/gpus-it-priorities.md` · v1.13 · 2026-05-21 · GPUS-IT
Owner: Rajesh Chhetry · Review cadence: weekly (Fridays) + on every new initiative
Purpose¶
Living tracker of active and queued initiatives for the GPUS-IT infrastructure program. This document supersedes scattered priority notes across memory and Cowork; the canonical order of work lives here.
Rules of engagement:
- Every initiative lists its gating dependencies — don't start work that is gated on unfinished prerequisites.
- Every new asset added by an initiative must ship with its matching IR runbook (`rb-00N-*.md`), DR procedure update in `drp.md`, and red/blue drill entry in `tabletop-playbooks.md` + `blue-team-drills.md`. Infra docs without runbooks are not considered "done".
- Status values: `Planned` · `In Progress` · `Blocked` · `Done` · `Deferred`
- When an item completes, move it to Completed with the completion date and delete it from the active queue.
Sequencing¶
The order is deliberate: finish forms, then Meraki, then WDC foundation.
- Forms portal — Phase 2 SPA → SQLi drill → Phase 3 HappyFox → Phase 1.5 migration → Phase 1.6 refactor. Everything else is gated until forms is production-complete.
- Meraki — Sedita fix + Meraki SSO, then fold Meraki into status/SOC/MkDocs coverage.
- WDC foundation — ESXi + Synology NAS inventory, cleanup, monitoring, and documentation. Deferred intentionally: the hypervisor + NAS are stable enough today that chasing forms-portal momentum is higher-value.
Everything else (SOC Ticketing, Vendor Access, MySQL decom, status site automation) slots in after WDC foundation unless a specific gating dependency inverts the order.
Summary¶
| Tier | Initiative | Status | Target |
|---|---|---|---|
| T1 | Phase 2 React + Okta (forms frontend SPA) | In Progress | 2026-05 |
| T1 | SQLi tabletop + blue/red drill against forms portal | Planned | 2026-05 |
| T1 | Phase 3 HappyFox integration (forms backend) | Planned | 2026-05 |
| T1 | Forms portal Phase 1.5 — legacy data migration | Planned | 2026-06 |
| T1 | Forms portal Phase 1.6 — ON CONFLICT refactor | Planned | 2026-06 |
| T1 | Okta cleanup — remove localhost redirect URIs | Planned | 2026-05 |
| T2 | Meraki cleanup — Sedita site + Meraki SSO | Planned | 2026-06 |
| T2 | Meraki integration — status/SOC/MkDocs coverage | Planned | 2026-06 |
| T3 | WDC foundation — ESXi inventory & cleanup | Planned | 2026-07 |
| T3 | WDC foundation — Synology NAS inventory & cleanup | Planned | 2026-07 |
| T3 | GCP terraform tree has no VCS — dedicated session: secrets review of tfvars/tfstate, .gitignore design, git init, push to private remote (CSR), import state (supersedes WDC VPN/route drift framing) | Filed | 2026-06 |
| T3 | VPN cold-start packet loss — diagnosis + fix (Mac → GCP private subnet path drops on cold start; 2nd incident in 7 days) | Filed | 2026-06 |
| T3 | IAP-to-MAPLE 4003 backend-fail — investigate IAP edge or VM-side firewall (alternate access path when direct SSH unavailable) | Filed | 2026-06 |
| T3 | Cloud Scheduler missed-tick on cold-start (sweep.clean missed 05:45 UTC 2026-05-21; investigate scheduler retry config vs worker min-instances tradeoff) | Filed | 2026-06 |
| T4 | SOC Ticketing tab | Planned | 2026-08 |
| T4 | Vendor Access Portal (replace SFTP) | Planned | 2026-08 |
| T5 | Status site automation — Phase A (live Cloud Run list) | Planned | 2026-09 |
| T5 | MySQL decommission | Planned | 2026-09 |
| T5 | Status site automation — Phase B (BigQuery billing export) | Planned | 2026-10 |
| T5 | Status site automation — Phase C (per-service cost attribution) | Planned | 2026-11 |
| T1.7 | Forms portal frontend — client-side validation must block submit | Completed | 2026-05 |
| T1 | α ClamAV scan worker — Commit 1 (Phase 2.5(b.2)) | Completed | 2026-05 |
| T1 | α ClamAV scan worker — Commit 2 (Slack + GCS tag + DLQ + sweep) | Completed | 2026-05 |
| T1 | α ClamAV scan worker — Commit 3 (hardening + hygiene, 8 items) | Filed | 2026-06 |
T1 — Finish forms¶
Phase 2 React + Okta (forms frontend SPA)¶
- Status: Phase 2 frontend SPA: COMPLETE 2026-04-27. Live on `forms.greenpeace.us`. Backend rev `gpus-forms-backend-00033-bfx`, frontend rev `gpus-forms-frontend-00003-74f`. 28 forms render with editorial typography. Auth pipeline (Okta OIDC PKCE → JWKS → `/api/forms`) fully verified end-to-end.
- Goal: Ship `gpus-forms-frontend` as React SPA with Okta OIDC PKCE auth, calling `gpus-forms-backend` with a validated JWT bearer.
- Gating: None — `gpus-forms-backend` is live on Cloud Run, `forms.greenpeace.us` resolves, Okta Production cutover complete (2026-04-23).
- Deliverables:
    - `forms-frontend/` in `gpus-infra-portals` repo
    - JWT validation middleware on `gpus-forms-backend` (and on status / security / soc backends — Phase 2 is org-wide, not forms-only)
    - Cloud Run deploy + Cloud Build trigger wired
- Asset docs required: Update `iar.md` with forms-frontend service; no new IR runbook needed (covered by existing portal runbooks); blue-team drill entry for "forged/expired JWT rejected" in `blue-team-drills.md`.
- Phase 2 follow-up: FieldRenderer pulldown lookup — COMPLETE 2026-04-28 (commit `5512d8e`). Phase 1's `/api/forms/<id>` serializer now returns `id`/`label`/`pulldown_id` (matching `contract.ts`). New endpoint `GET /api/pulldowns/<name>` added with `PulldownResponse` shape. Backend rev `gpus-forms-backend-00034-8tn`.
Phase 2.5 — Phase 2 backend wire-up¶
- Discovered 2026-04-28: the Phase 2 SPA has been live but Phase 2's three submission endpoints (`POST /api/submissions`, `POST .../attachments`, `POST .../submit`) are unwired stubs in `routes_phase2.py`. They return fake UUIDs and a hardcoded `'TBD@greenpeace.us'` literal. Submitting a form via the SPA appears to work to the user but persists nothing — no DB write, no GCS upload, no email/HappyFox call. The well-formed routing model exists (`actions` table populated by the MySQL migrator with per-form `action_type` + `destination` + `template_id`) but no submit-time code reads it. There is also no mailer module in the backend at all.
- Implementation scope:
    - Wire `create_submission` (`routes_phase2.py:102`) to write to the `submissions` table with KMS envelope encryption for non-searchable fields, returning real UUIDs. Pattern reference: `routes/submissions.py:37` (Phase 1's working submit handler).
        - STATUS: COMPLETE 2026-04-30
            - Commit 1 SHIPPED 2026-04-29: `07cb75c` (`auth_v2` `User.username` via `preferred_username` claim, rev `gpus-forms-backend-00036-89b`).
            - Commit 2 SHIPPED 2026-04-30: `10bcd0d` (design doc to `mkdocs-portal/docs/architecture/forms-phase2.5a-design.md`).
            - Commit 3 SHIPPED 2026-04-30: `83f85cb` (`routes_phase2.py` `create_submission` wire-up, rev `gpus-forms-backend-00038-5kg`).
            - Verified end-to-end 2026-04-30: API contract response shape, DB persistence (encrypted fields + audit log), MAPLE-side inspection of test submission `0bac9326-4968-477b-8d10-d7d6f457e2a8`.
            - Phase 2 SPA submissions now ACTUALLY persist (was theater since Phase 2 cutover).
    - Wire `upload_attachments` (`routes_phase2.py:128`) to GCS bucket `gpus-forms-attachments` with ClamAV scan and attachment row inserts.
        - STATUS: COMPLETE 2026-05-08 (β closeout — see `architecture/forms-phase2.5b-cleanup-closeout.md`)
            - 2.5(b) handler shipped 2026-05-08: `e17dacb` (rev `gpus-forms-backend-00041-mcf`), three-layer verification PASS on test submission `999cf0cc-…` (GCS object + attachments row + audit_log row, all consistent to the microsecond).
            - 2.5(b.cleanup) shipped 2026-05-08: `28964c0` (rev `gpus-forms-backend-00043-d9q`), 4-source MIME/size truth (prod env vars, `Config`, module-level shadows, `_schema.yaml` hint) collapsed to `Config.ATTACHMENT_MAX_BYTES` + `Config.ATTACHMENT_ALLOWED_MIME` as single source. Production env vars `MAX_UPLOAD_BYTES` + `ALLOWED_MIME_TYPES` removed.
            - Schema migration `forms-backend/schema/002_add_submission_deleted_action.sql` documents `submission_deleted` audit_action enum value (added in production during β step 8 to enable orphan-intent cleanup).
            - ClamAV scan handled in Phase 2.5(b.2) — see dedicated section below, COMPLETED 2026-05-19. At β time all attachments landed with `clamav_status='pending'`; partial index `idx_attachments_clamav` was already in place for the scanner.
            - Verification gap accepted: end-to-end positive case on rev `00043-d9q` blocked by T1.7 SPA bug (silent attachment drop). Pre-cleanup positive cases (5c57b2e6 docx, 62d2fc73 xlsx) plus mechanical-substitution diff correctness accepted as evidence of backend behavior. Rationale in closeout doc.
    - Wire `finalize_submission` (`routes_phase2.py:151`) to:
        - `SELECT actions FROM actions WHERE form_id = <...> ORDER BY action_order`
        - For each action, branch by `action_type`:
            - `happyfox_template` → render via `templates` table, POST to HappyFox API (secrets already in Secret Manager: `gpus-forms-happyfox-api-key`, `gpus-forms-happyfox-auth-code`)
            - `email_template` / `email_raw` → render template, SMTP send via Postfix on MAPLE (currently no SMTP client in backend)
        - Aggregate `routing_result` into `submissions.routing_result` JSONB.
    - Build template render layer (Jinja2 against `templates(id, body)`).
    - Build HappyFox API client (configures the integration that SOC dashboard memory #20 noted as "not configured").
    - Decide policy on `email_template` / `email_raw` actions: keep, drop, or redirect through HappyFox? (Per Rajesh: HappyFox API is the operational current path because Google's stricter email policies broke the email-to-HappyFox flow.)
- Estimated effort: multi-session work. Sequence likely (a)→(b)→(c) with HappyFox API client built alongside (c).
- Gating: completes the "forms portal actually works" milestone. After Phase 2.5 lands, the SQLi tabletop drill (already queued) is unblocked because submissions actually persist.
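The `finalize_submission` branching described above can be sketched as a small dispatch table. This is a minimal illustration under stated assumptions, not the backend's code: `route_submission`, `send_happyfox`, `send_email`, and the result-entry shape are hypothetical names, the handlers are stubs that make no network calls, and the action dicts only mirror the `actions` columns named in this section.

```python
from typing import Any, Callable, Dict, List

Action = Dict[str, Any]   # hypothetical row shape: action_type, destination, template_id, action_order
Outcome = Dict[str, Any]


def send_happyfox(action: Action, payload: Dict[str, Any]) -> Outcome:
    """Stub: a real handler would render the template and POST to the HappyFox API."""
    return {"ok": True, "channel": "happyfox", "destination": action["destination"]}


def send_email(action: Action, payload: Dict[str, Any]) -> Outcome:
    """Stub: a real handler would render the template and hand off to SMTP."""
    return {"ok": True, "channel": "email", "destination": action["destination"]}


# One handler per action_type; both email variants share a stub here.
HANDLERS: Dict[str, Callable[[Action, Dict[str, Any]], Outcome]] = {
    "happyfox_template": send_happyfox,
    "email_template": send_email,
    "email_raw": send_email,
}


def route_submission(actions: List[Action], payload: Dict[str, Any]) -> List[Outcome]:
    """Run every action in action_order and collect one result entry per action."""
    results: List[Outcome] = []
    for action in sorted(actions, key=lambda a: a["action_order"]):
        handler = HANDLERS.get(action["action_type"])
        if handler is None:
            outcome: Outcome = {"ok": False, "error": "unknown action_type"}
        else:
            try:
                outcome = handler(action, payload)
            except Exception as exc:  # one failed action must not hide the others
                outcome = {"ok": False, "error": str(exc)}
        results.append({"action_type": action["action_type"], **outcome})
    return results
```

The returned list is plain JSON-serializable data, so it could be stored as the aggregated `routing_result` value. Recording failures per action instead of raising is a design choice of this sketch; the real handler may prefer to abort on the first failure.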
Phase 2.5(b.2) — α ClamAV scan worker¶
- Status: Filed 2026-05-19. COMPLETED 2026-05-19 (5 migrations 004-008 shipped — enum + SQL user + table grants + tight scanner RLS + USING widen + sequence USAGE; pipeline verified end-to-end: 07c3c9e0 fixture + 4a0f5bd5 real upload both clean with audit rows; worker rev `gpus-forms-clamav-worker-00002-dwx` on Cloud Run, 2Gi, min-instances=0).
- What shipped: new Cloud Run service `gpus-forms-clamav-worker` (Pub/Sub-push on `gpus-forms-attachments` `OBJECT_FINALIZE` → claim → `clamscan` → verdict to `attachments` + `audit_log`). Scoped Terraform (SA + IAM + topic + DLQ + OIDC push sub; VPN drift untouched, see T3 row). CSR Cloud Build triggers (push + weekly sigrefresh). Design doc `alpha-clamav-worker.md` v1.2.
- α.1 / Commit-2 follow-ups: shipped 2026-05-20 in Commit 2 — see `### Phase 2.5(b.2) — Commit 2` immediately below.
Phase 2.5(b.2) — Commit 2 (Slack + GCS tag + DLQ + sweep)¶
- Status: COMPLETED 2026-05-20 (worker rev `gpus-forms-clamav-worker-00010-gr7` on `:c9c2016`; migrations 009 + 010 shipped; Terraform 6 resources applied in `clamav-worker.tf`; EICAR test verified the full infected pipeline end-to-end at 16:45 UTC).
- What shipped:
    - Slack-post for infected verdicts — placeholder-tolerant fetch from Secret Manager (`gpus-forms-clamav-slack-webhook`), lazy module-global cache, cold-start dance documented; verified `class=wired` on first call during EICAR.
    - GCS quarantine metadata tag (`quarantined=true`, `quarantine_reason=<sig>`, `quarantined_at=<iso8601>`) — best-effort with try/except per C4 ordering (tag first, Slack second; audit row is the system of record).
    - `/dlq-alert` route + DLQ push subscription on `gpus-forms-attachment-uploaded-dlq` — closes the design §10 gap surfaced at Commit 1 GATE 4. Always returns 200 (no DLQ-of-DLQ).
    - `/sweep-stuck` route + Cloud Scheduler tick every 15 min — closes α.1 stuck-`scanning` recovery. Atomic `UPDATE … RETURNING`, one audit row per flipped id with `revision_at_sweep` from `K_REVISION` for deploy correlation, Slack only on flips > 0.
    - Cloud Scheduler API enabled on `gpus-infra`; `clamav-scheduler@gpus-infra.iam` SA wired with `run.invoker` on the worker (resource-scoped) + `cloudbuild.builds.editor` on the project.
    - Weekly sigrefresh scheduler job (Sun 02:00 UTC) fires the existing `gpus-forms-clamav-worker-sigrefresh` build trigger — closes the README's "weekly auto-refresh — NOT yet wired" follow-up from Commit 1.
- Verification: EICAR test at 16:45 UTC. Full pipeline green:
    - `event.received` → `claim.ok` → `clamscan` → `INFECTED signature=Eicar-Test-Signature` → `quarantine.tag.ok` → `slack.webhook.resolved class=wired` (first fetch — lazy init confirmed) → Slack post landed in `#us-soc-alerts` → audit row `attachment_scanned_infected` written.
    - All 5 verification checks (worker log, GCS metadata, attachments row, audit_log row, Slack channel visual) passed.
    - Bonus signal: the `*/15` sweep tick fired naturally at 16:45:03 UTC during the test window, logging `sweep.clean count=0` — gate 2d wiring proven live end-to-end without manual stimulus.
    - Test fixture (1 submission + 1 attachment + 1 audit row + 1 GCS object) cleaned up post-validation; attachments now back at pre-test state (`clean=2, pending=3`).
- α.1 / Commit 3 follow-ups (hardening / hygiene — NOT Commit-2 blockers):
    - Tighten `cloudbuild.builds.editor` to per-trigger IAM on just `gpus-forms-clamav-worker-sigrefresh` (currently project-wide — over-broad blast radius).
    - Rename `SLACK_PLACEHOLDER_PREFIX_OK` → `SLACK_VALID_PREFIX` (constant name misleads; logic is correct).
    - Normalize `_sweep_stuck` audit INSERT to use the `_AUDIT_ACTION` dict (currently hardcodes the enum value as a string literal; PG casts implicitly so it works, but inconsistent with `_record`).
    - Sigrefresh build trigger needs `--included-files=gpus-forms-clamav-worker/**` filter — currently fires on every push to `main` regardless of path, racing with the main trigger and burning a build slot.
    - `forms-backend/schema/` 002 ordinal collision (`002_rls.sql` + `002_add_submission_deleted_action.sql`) — rename the second to a higher ordinal.
    - Backfill drain: 3 remaining `pending` attachments (real user uploads pre-dating Commit-1 worker deployment) still need scanning — carried over from Commit 1; safe to drain via the established byte-correct re-fire pattern, OR let `/sweep-stuck` pick them up if their `uploaded_at` is in the stuck window.
    - Commit the `~/terraform/gpus-infra/terraform/clamav-worker.tf` Commit-2 additions to that repo (applied to GCP but the .tf change is local-only).
- T3 candidates filed during this arc (separate from Commit 3):
    - VPN cold-start packet loss — 2nd incident in 7 days; routing to private-IP Cloud SQL (10.34.0.0/24, Private Services Access) failed mid-session.
    - IAP-to-MAPLE 4003 backend-fail — only matters as a backup access path when direct SSH is also down, but should still resolve.
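The "tag first, Slack second; audit row is the system of record" ordering from the What shipped list can be illustrated with the following sketch. It is not the worker's code: `handle_infected` and its three injected callables are hypothetical, and writing the audit row last is an assumption taken from the order of the EICAR log lines quoted above.

```python
from datetime import datetime, timezone
from typing import Callable, Dict


def handle_infected(
    object_name: str,
    signature: str,
    tag_object: Callable[[str, Dict[str, str]], None],
    post_slack: Callable[[str], None],
    write_audit: Callable[[str, str, str], None],
) -> Dict[str, bool]:
    """Best-effort quarantine tag, then best-effort Slack post, then the audit row."""
    outcome = {"tagged": False, "slack_posted": False}
    metadata = {
        "quarantined": "true",
        "quarantine_reason": signature,
        "quarantined_at": datetime.now(timezone.utc).isoformat(),
    }
    try:
        tag_object(object_name, metadata)
        outcome["tagged"] = True
    except Exception:
        pass  # best-effort: a real worker would log the failure here
    try:
        post_slack(f"Infected upload {object_name}: {signature}")
        outcome["slack_posted"] = True
    except Exception:
        pass  # a Slack outage must not block the audit write
    # The audit row is not wrapped in try/except: if it fails, the caller should see it.
    write_audit("attachment_scanned_infected", object_name, signature)
    return outcome
```

Passing the side effects in as callables keeps the ordering testable without GCS, Slack, or a database.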
Phase 2.5(a) design doc correction¶
- Status: CLOSED 2026-05-07 via `2de57c9`. Four corrections applied to `mkdocs-portal/docs/architecture/forms-phase2.5a-design.md` to match shipped code (commit `83f85cb`):
    - `_audit_v2` helper signature (kwargs + body)
    - allow-list deny audit action (`"auth_failure"` not `"submission_denied"`)
    - `SubmissionField` attachment (`session.flush` + `submission_id`, not relationship)
    - second `_audit_v2` call site (consistency fix)

    Plus a process-note appendix appended to the doc capturing the read-first-discipline lesson.
- Discovered: 2026-04-30 during Phase 2.5(a) Commit 3 implementation.
- Context: The design doc at `mkdocs-portal/docs/architecture/forms-phase2.5a-design.md` was committed in `10bcd0d` and contained 3 bugs in the helper specs that Code caught before Commit 3 shipped:
    1. `_audit_v2` helper signature — design said `actor=actor` kwarg; correct is `actor_username` + 5 other Phase 1 fields (`target_id`, `target_type`, `details`, `request_id`, `success`).
    2. Allow-list deny audit action — design said `"submission_denied"`; that value isn't in the `audit_action` ENUM. Correct is `"auth_failure"` with `details={"reason": "not_in_allow_list"}` and `success=False` — matches Phase 1's existing pattern.
    3. `SubmissionField` attachment pattern — design used `row.submission = submission` (assumes ORM relationship that doesn't exist). Correct is Phase 1's `session.flush()` + `submission_id=submission.id` pattern.
Forms Phase 2 UX gaps¶
- Discovered: 2026-04-29 PM during Phase 2.5(a) browser regression check.
- Status: T1.5 sub-section is now FULLY CLOSED — items a/b/c/d all closed. Section can stay in the doc as a closed-finding record or be moved to a "Recently completed" archive at section author's discretion. Item (a) CLOSED 2026-05-07 via `f31ec5f`. Items (b), (c), (d) CLOSED 2026-04-30 via that day's γ phase commits.
- Priority order within sub-section: a (pulldown — user-visible blocker for form submission accuracy) → d (font — visible polish) → b, c (cosmetic).
- Items:

    a. Pulldown regression — yes/no booleans — CLOSED 2026-05-07 via `f31ec5f`.

    Root cause: Flask `<string>` route converter (the default) does not match forward slash. Pulldown names containing `/` (e.g. "Grant Funded ? (Yes/No)", "Bargain Type (In/Out)", "Shipping Label/Box Required?") failed to route — Cloud Run URL processing decoded `%2F` back to `/` before route matching, splitting the path into multiple segments that no route consumed. Result: 404 → SPA renders "Could not load options".

    Fix: change the `@bp.get` decorator from `"/pulldowns/<name>"` to `"/pulldowns/<path:name>"` — a one-line change.

    Diagnosis chain used DB inspection (via MAPLE access pattern established 2026-04-30): hypothesis 1 (missing pulldown rows) was falsified by data showing all referenced `pulldown_name`s exist in the `pulldowns` table. Pattern noticed in failing data: all 7 failing names contained `/`.

    b. COST CENTER duplicate field — CLOSED 2026-04-30 via `6d9231a` (resolved as a side effect of the `NO_DATA_TYPES` filter — duplicate was a divider/instructions row with that label).

    c. NOTESDIVIDER label leak — CLOSED 2026-04-30 via `6d9231a` (forms-backend `routes/forms.py` filters `NO_DATA_TYPES` from form-detail serializer).

    d. Font darkness / contrast — CLOSED 2026-04-30 via `f5c79e0` + `e58dabc` (forms-frontend `.field-label` color `#4a4a44` — WCAG AA at ~9.6:1 contrast over `--surface-warm`).
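The routing failure in item (a) can be reproduced without Flask. The sketch below is a standard-library model, not Flask itself: the two regexes are simplified stand-ins for the default string converter (no `/` allowed) and the `path` converter (slashes allowed), and `match` is a hypothetical helper that decodes the URL before matching, as the root-cause note describes.

```python
import re
from urllib.parse import quote, unquote

# Simplified stand-ins for the two route rules (an assumption of this sketch).
STRING_ROUTE = re.compile(r"^/pulldowns/(?P<name>[^/]+)$")  # models "/pulldowns/<name>"
PATH_ROUTE = re.compile(r"^/pulldowns/(?P<name>[^/].*)$")   # models "/pulldowns/<path:name>"


def match(route, raw_url_path):
    """Decode percent-escapes first (as happened to %2F), then try the route."""
    decoded = unquote(raw_url_path)
    found = route.match(decoded)
    return found.group("name") if found else None


name = "Bargain Type (In/Out)"
url = "/pulldowns/" + quote(name, safe="")  # the client encodes "/" as %2F

print(match(STRING_ROUTE, url))  # no match: the decoded name spans two path segments
print(match(PATH_ROUTE, url))    # the full name, slash included
```

Names without a slash match both rules, which is consistent with only the slash-containing pulldowns failing.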
Phase 2.1 — Phase 1 Production cutover completion¶
- Goal: Finish the Okta Preview → Production cutover that only partially landed in Phase 1's auth/config layer. Phase 1 was originally built against the Okta Preview tenant; the Production cutover on 2026-04-23 left several stale references and shim layers in place. The 5 items below were surfaced 2026-04-27 during the forms-frontend smoke test and consolidate the remaining cleanup. None are user-visible bugs today; they are maintenance traps.
- Gating: None — Phase 2 SPA is live; cleanup can land at any time.
- Items (priority order):
    - `forms-backend/auth.py` — replace `audience=config.OKTA_AUDIENCE` with `audience=config.OKTA_CLIENT_ID`. Remove the `OKTA_AUDIENCE` Cloud Run env var workaround. Estimated 1 commit, ~5 min.
    - `forms-backend/config.py` — change `OKTA_ISSUER = f"https://{os.environ.get('OKTA_DOMAIN', 'greenpeaceeu.oktapreview.com')}"` to read `OKTA_ISSUER` directly with default `https://greenpeaceeu.okta.com`, matching `auth_v2.py`'s pattern. Remove `OKTA_DOMAIN` env var afterward. Estimated 1 commit, ~10 min.
    - `forms-backend` security headers — remove CSP/HSTS/X-Frame-Options/etc. from Flask middleware. nginx (`forms-frontend/nginx.conf`) is now the sole source for response-time security headers. The duplicate headers don't break anything but create maintenance traps. Estimated 1 commit, ~15 min.
    - `forms-backend` CSP — any CSP that must remain in Flask should remove stale `greenpeaceeu.oktapreview.com` references. Replace with `greenpeaceeu.okta.com`. Stale since the 2026-04-23 Preview→Prod cutover. Estimated 1 commit, ~10 min.
    - Consolidate `forms-backend/auth.py` and `auth_v2.py` into a single module once both are stable. The `_key_for_kid` retry pattern shipped 2026-04-27 in `auth.py` (commit `d0a7867`) carries forward. The v2 module's signin/role pattern carries forward. Defer until after a few weeks of stable production traffic. Estimated 1 PR-sized change.
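The `config.py` change in the second item could look roughly like the sketch below. The function names and the `PRODUCTION_ISSUER` constant are hypothetical; only the two environment variable names and the two default hosts come from the item text.

```python
import os
from typing import Mapping

PRODUCTION_ISSUER = "https://greenpeaceeu.okta.com"


def legacy_issuer(environ: Mapping[str, str]) -> str:
    """Old shape: issuer derived from OKTA_DOMAIN, defaulting to the Preview tenant."""
    return f"https://{environ.get('OKTA_DOMAIN', 'greenpeaceeu.oktapreview.com')}"


def okta_issuer(environ: Mapping[str, str] = os.environ) -> str:
    """New shape: read OKTA_ISSUER directly, defaulting to the Production issuer."""
    return environ.get("OKTA_ISSUER", PRODUCTION_ISSUER)
```

With no variables set, the old shape silently falls back to the Preview tenant while the new one falls back to Production, which is the maintenance trap the item describes.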
SQLi tabletop + blue/red drill against forms portal¶
- Goal: Exercise detection and response for SQL injection against the live forms backend before opening it to a wider user base via the HappyFox integration.
- Gating: Phase 2 React SPA live (so the full user-facing attack surface is exercised, not just the API).
- Components: 60-min tabletop → 90-min blue-team detection drill against live endpoint → 90-min red-team adversarial test.
- Deliverables: Updates to `tabletop-playbooks.md` and `blue-team-drills.md`; Wazuh + Cloud Armor rule tuning if gaps surface; executive summary.
- Estimated effort: 4–5h across 1–2 sessions.
- Owner: Rajesh.
Phase 3 HappyFox integration (forms backend)¶
- Goal: Wire form submissions to auto-create HappyFox tickets per form's routing config.
- Gating: SQLi drill complete (don't amplify an undetected injection into the ticketing system).
- Deliverables: HappyFox client in `gpus-forms-backend`, retry + DLQ on failed ticket creation, ticket ID written back to submission record, admin UI shows ticket link.
- Asset docs required: Update `iar.md` (HappyFox as integrated system); extend forms-portal runbook with "HappyFox API outage" and "ticket creation failure → DLQ drain" procedures; blue-team drill entry for "spoofed HappyFox webhook".
Forms portal Phase 1.5 — legacy data migration¶
- Goal: Migrate existing legacy-form submissions (where applicable) into the new schema.
- Gating: Phase 3 HappyFox integration live (so migrated records route correctly on any post-migration edits).
- Scope: One-time batch import; validate row counts, checksum fields, encrypted columns round-trip correctly; retain legacy source read-only for 90 days post-migration.
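The row-count and checksum validation in the scope line might be sketched as follows. Everything here is illustrative: `row_checksum`, `compare_batches`, and the report shape are hypothetical, rows are plain dicts, and encrypted columns are assumed to have been decrypted before comparison so that the round trip is what gets checked.

```python
import hashlib
from typing import Any, Dict, Iterable, List


def row_checksum(row: Dict[str, Any], fields: Iterable[str]) -> str:
    """Stable SHA-256 over the selected fields, in the order given."""
    digest = hashlib.sha256()
    for name in fields:
        digest.update(repr(row.get(name)).encode("utf-8"))
        digest.update(b"\x1f")  # field separator
    return digest.hexdigest()


def compare_batches(
    legacy: List[Dict[str, Any]],
    migrated: List[Dict[str, Any]],
    key: str,
    fields: Iterable[str],
) -> Dict[str, Any]:
    """Report row counts plus the keys that are missing or differ after migration."""
    fields = list(fields)
    migrated_by_key = {row[key]: row for row in migrated}
    report: Dict[str, Any] = {
        "legacy_count": len(legacy),
        "migrated_count": len(migrated),
        "missing": [],
        "mismatched": [],
    }
    for row in legacy:
        other = migrated_by_key.get(row[key])
        if other is None:
            report["missing"].append(row[key])
        elif row_checksum(row, fields) != row_checksum(other, fields):
            report["mismatched"].append(row[key])
    return report
```

A clean migration under this sketch has equal counts and empty `missing` and `mismatched` lists.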
Forms portal Phase 1.6 — ON CONFLICT refactor¶
- Goal: Refactor upsert logic in `gpus-forms-backend` to use proper `ON CONFLICT` clauses rather than race-prone check-then-insert sequences.
- Gating: Phase 1.5 complete (don't refactor insert paths mid-migration).
- Scope: Submission insert, audit_log append, pulldown cache refresh.
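As an illustration of the target pattern, the sketch below performs a single-statement upsert. It uses Python's built-in `sqlite3` as a stand-in so that it runs anywhere; the `pulldown_cache` table and `upsert_pulldown` helper are hypothetical, and the production database may need different column types or conflict targets.

```python
import sqlite3


def upsert_pulldown(conn: sqlite3.Connection, name: str, options: str) -> None:
    """Insert the row, or update it in the same statement if the key already exists."""
    conn.execute(
        "INSERT INTO pulldown_cache (name, options) VALUES (?, ?) "
        "ON CONFLICT (name) DO UPDATE SET options = excluded.options",
        (name, options),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pulldown_cache (name TEXT PRIMARY KEY, options TEXT NOT NULL)")

upsert_pulldown(conn, "Cost Center", "A,B")    # first call inserts
upsert_pulldown(conn, "Cost Center", "A,B,C")  # second call updates, no duplicate-key error

rows = conn.execute("SELECT name, options FROM pulldown_cache").fetchall()
print(rows)
```

Because the existence check and the write happen in one statement, two concurrent writers cannot both decide that the row is absent, which is the race the check-then-insert form allows.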
Okta cleanup — remove localhost redirect URIs¶
- Goal: Remove dev-convenience localhost redirect URIs from the Okta Production app (`0oavvg1y33wTWFsmP417`) once the Cloud Run deploy of forms-frontend is verified.
- Gating: Phase 2 Cloud Run deploy verified.
- Done-when: Production app redirect URI list contains only `https://*.greenpeace.us/*` entries. Preview tenant (`greenpeaceeu.oktapreview.com`, client `0oadhpjktd5UfCMDm0x7`) retained as dev fallback.
T1.6 — Forms portal SOC / observability integration¶
STATUS: NOT STARTED — gap surfaced 2026-05-08 during Phase 2.5(b) scoping discussion
Priority: P2 — should ship after Phase 2.5 functional work completes (2.5b/c/d/e), before SOC Tickets workstream (T4) starts. T4 will assume forms portal events flow into SOC; this workstream makes that true.
Context:
The forms portal (forms.greenpeace.us, gpus-forms-backend, gpus-forms-frontend, Cloud SQL gpus-forms-db) has been operationally invisible to SOC since Phase 1 cutover. While the rest of GPUS infrastructure (WDC servers SKY/RAIN/SUN/WIND, GCP VMs OAK/MAPLE/CEDAR) emits structured events into Wazuh + ELK + Prometheus and is visible across the 15 tabs of soc.greenpeace.us, forms portal emits zero events into that pipeline.
Existing forms-portal capabilities:
- KMS envelope encryption for sensitive form fields (Phase 2.5a)
- Audit log table in Cloud SQL with actor_username, actor_ip, target_id, request_id, success
- IAM-protected Cloud SQL access via service accounts
- Allow-list authorization on form submission
Observability gaps:
- No SOC dashboard tab — the 15 tabs at soc.greenpeace.us don't include "Forms"
- No Wazuh rule coverage for forms-portal events
- No Cloud Logging → CEDAR/MAPLE pipeline for forms events (audit log lives only in Postgres, not exported)
- No Prometheus metrics emitted from forms-backend
- No alert routing for forms-portal anomalies (failed auth bursts, validation errors, malicious upload attempts)
- No IR runbook (rb-00N-forms-*.md) for forms-portal incidents
- No DR procedure for forms-portal in drp.md
- No red/blue drill in tabletop-playbooks.md
Compliance framing:
- PCI DSS: not applicable (no cardholder data flows through forms portal)
- NIST 800-53 / 800-171: applicable as best-practice — control families AC (Access Control), AU (Audit & Accountability), SC (System & Communications Protection), SI (System & Information Integrity). Audit log + KMS encryption partially satisfy AU/SC; AC + SI need observability work.
- MITRE ATT&CK: not a compliance framework, but the rest of GPUS infrastructure maps detected events to MITRE techniques. Forms portal events should map to the same taxonomy.
- OWASP ASVS: directly relevant to the SPA + API surface; needs explicit verification pass.
- CIS: applies to host hardening (not Cloud Run services directly); covered for forms portal's underlying platform.
- GPUS internal IRP/DRP framework: forms portal needs runbook + DRP + drill coverage matching what the rest of infrastructure has.
Sub-items (provisional scope — refine when work starts):
a. Audit log → Cloud Logging structured events. Forms-backend audit_log table inserts should also emit structured Cloud Logging entries with proper severity. Pipeline: forms-backend → Cloud Logging → log sink → CEDAR (Elastic) for indexing.
b. Wazuh rule additions for forms-portal events. New rule IDs in the 100020+ range (memory entry on Wazuh ruleset). Cover auth_failure (HIGH severity), attachment_rejected (MEDIUM — potential abuse), submission_created (LOW — informational), attachment_uploaded (LOW). Emit MITRE technique tags where applicable.
c. Prometheus metrics from forms-backend. Standard four (request count, latency, error rate, in-flight) plus domain-specific (submissions_per_minute, auth_failure_rate, upload_size_p99, etc.). Scrape via MAPLE.
d. New Forms tab on soc.greenpeace.us. Tab structure: submission volume, auth-failure rate, attachment activity, slowest queries, recent rejections, threat hunting view (filter audit_log by anomaly patterns).
e. IR runbook rb-006-forms-portal-incident.md. Cover scenarios: compromised submitter account, mass-upload abuse, infected attachment in GCS (forward-look at ClamAV scenario), Cloud SQL unavailability, KMS key rotation, DEK compromise.
f. DR procedure for forms-portal in drp.md. Cover Cloud SQL point-in-time recovery, GCS bucket recovery, Cloud Run rollback pattern, Okta tenant outage fallback.
g. Red/blue drill in tabletop-playbooks.md + blue-team-drills.md. Scenario: malicious attachment uploaded by compromised submitter. Validate detection pipeline end-to-end.
h. OWASP ASVS verification pass. Walk the standard against the forms portal surface; document gaps.
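Sub-items (a) and (b) could start from something like the sketch below, which renders an audit event as a one-line JSON log entry; Cloud Run treats JSON lines written to stdout as structured log entries and reads the `severity` and `message` fields. The helper names and the mapping from the HIGH/MEDIUM/LOW levels in (b) to Cloud Logging severities are assumptions for illustration, not an agreed scheme.

```python
import json
import sys
from typing import Any, Dict

# Assumed mapping from the sub-item (b) levels to Cloud Logging severities.
SEVERITY_BY_ACTION = {
    "auth_failure": "ERROR",           # HIGH
    "attachment_rejected": "WARNING",  # MEDIUM
    "submission_created": "INFO",      # LOW
    "attachment_uploaded": "INFO",     # LOW
}


def audit_log_entry(action: str, actor_username: str, request_id: str,
                    success: bool, **details: Any) -> Dict[str, Any]:
    """Build a structured entry mirroring the audit_log columns named above."""
    return {
        "severity": SEVERITY_BY_ACTION.get(action, "DEFAULT"),
        "message": f"forms.audit {action}",
        "audit": {
            "action": action,
            "actor_username": actor_username,
            "request_id": request_id,
            "success": success,
            "details": details,
        },
    }


def render(entry: Dict[str, Any]) -> str:
    """Serialize to a single JSON line, the shape a log agent can parse."""
    return json.dumps(entry, sort_keys=True)


def emit(entry: Dict[str, Any]) -> None:
    sys.stdout.write(render(entry) + "\n")


emit(audit_log_entry("auth_failure", "jdoe", "req-123", False, reason="not_in_allow_list"))
```

Keeping the audit fields under one `audit` key would let a log sink filter on them without colliding with the reserved top-level fields.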
Estimated scope: 3-5 sessions if done as a focused workstream. Could be done incrementally if items a-d are prioritized first (functional observability) and e-h follow (process + verification).
Dependencies:
- Phase 2.5(b)/(c)/(d)/(e) ideally complete first (gives stable surface to instrument)
- Wazuh rule slot range coordination (memory entry ranges)
- Cooperation with T4 SOC Tickets workstream (forms events should auto-ticket)
T1.7 — Forms portal frontend: client-side validation must block submit, not just attachment¶
STATUS: FILED 2026-05-08. COMPLETED 2026-05-14 (commit f175810 shipped 2026-05-12; browser-verified end-to-end 2026-05-14: oversized blocks submit, wrong-MIME blocks submit, clear re-enables, positive case sub 9ac940c8).
Severity: Medium (silent data loss, recipient-visible).
Scope: Frontend only (forms-frontend SPA).
Bug: When client-side validation rejects an attachment (oversized, wrong MIME, or other triggers), the SPA hides the attachment but does not disable the submit button. Submission proceeds to the backend without the attachment. Recipient team sees a complete-looking submission with no attachment and assumes it was intentional.
Known instances (both cleaned up):
- `d20d2ac8-c19d-48df-a0a4-4f833a750e9b` — 2026-05-08 17:15 UTC, csv test, cleaned in β step 8 (audit_log id=10).
- `e93efbc3-4c6e-4b7a-adc9-169d4aedec70` — 2026-05-08 18:09 UTC, docx test post-env-var-removal, cleaned in β closeout (audit_log id=12).
Probable triggers (uncharacterized):
- Server-side rejection signal (oversized, wrong MIME)
- Stale `/api/config` cache showing the old MIME allowlist (frontend may have cached the pre-cleanup 4-MIME list with `text/csv` and without `docx`/`xlsx`)
- Possibly other (see closeout doc `architecture/forms-phase2.5b-cleanup-closeout.md`)
Fix shape (TBD by frontend session):
- Submit button should be disabled while any attachment field has a validation error
- OR submit handler should check for unresolved attachment errors before POSTing to `/submit`
- OR both
Relationship to other T-priorities:
- Filed below T1.6 (forms portal SOC/observability integration)
- Independent of α (ClamAV) and 2.5(c)/(d)/(e) tracks
- Should be addressed before any meaningful production user testing — silent attachment drop produces forensically-confusing audit trails (submission_created with no attachment_uploaded) and recipient-team confusion
Estimated: 1 session.
T2 — Meraki¶
Meraki cleanup — Sedita site + Meraki SSO¶
- Goal: Close out the Meraki P2 follow-ups identified after the P1 inventory (org 395909, 5 networks, 32 devices — completed 2026-04).
- Gating: T1 forms tier complete.
- Scope:
- Sedita site misconfiguration — resolve (specifics to be confirmed at start-of-work)
- Meraki SSO — currently broken, wire to Okta Production
- Deliverables: Fixes verified end-to-end (Okta login → Meraki dashboard for an admin test user); Sedita site returns to expected operational state.
Meraki integration — status/SOC/MkDocs coverage¶
- Goal: Fold Meraki into the same documentation and monitoring posture as the rest of the estate — Meraki currently lacks matching coverage.
- Gating: Meraki cleanup above complete.
- Deliverables:
    - Status site: Meraki org card (device count, online/offline, firmware currency)
    - SOC site: Meraki alerts surfaced (security events, config changes, WAN uplink loss)
    - MkDocs: new `architecture/meraki-network.md` describing org, networks, devices, admin model, SSO posture
    - `iar.md` entries for Meraki org + each site
    - Syslog from Meraki → WIND (so Wazuh indexes Meraki events into CEDAR)
    - IR runbook: `rb-006-meraki-compromise.md` (admin account takeover, rogue config push, AP impersonation)
    - DR procedure: Meraki config backup/restore procedure added to `drp.md` (Meraki backs up config in-cloud, but document how to roll back + how to replace a bricked device)
    - Red/blue drill: tabletop "Meraki admin credentials leaked" in `tabletop-playbooks.md`; blue-team detection drill for "unexpected config change outside change window" in `blue-team-drills.md`
T3 — WDC foundation¶
ESXi inventory & cleanup¶
- Goal: Bring the ESXi hypervisor — currently undocumented — under the same documentation, monitoring, and IR posture as everything else.
- Gating: T1 + T2 complete.
- Deliverables:
    - Inventory: ESXi version, licensing, VMs hosted, networking, hardware health, management plane exposure
    - Confirm or sever ESXi ↔ NAS coupling (decision: is the NAS a datastore, a backup target, or both?)
    - Cleanup: disable unused accounts, rotate admin credentials, enable syslog → WIND
    - Reconfigure: NTP, DNS, timezone, email alerts → `gpus-it-security@greenpeace.org`
    - Monitoring: Prometheus scrape via `vmware_exporter`, Grafana dashboard, Wazuh agent on guest VMs where feasible
    - Status site: ESXi card (version, uptime, VM count, datastore usage)
    - SOC site: ESXi in asset coverage map
    - MkDocs: `architecture/wdc-hypervisor.md`
    - `wdc-hostregistry.csv` entry; `iar.md` entry
    - IR runbook: `rb-007-esxi-compromise.md` (hypervisor takeover, guest escape, management plane breach)
    - DR procedure: ESXi host failure recovery in `drp.md`
    - Red/blue drill: tabletop "ESXi vCenter creds leaked" in `tabletop-playbooks.md`; blue-team detection drill "unexpected VM clone / snapshot export" in `blue-team-drills.md`
- Known risk: ESXi 6.7 is already flagged as EOL in `tracker.md` (VLN-004). Inventory may surface the need to accelerate hypervisor replacement — if so, that becomes its own T-tier item.
Synology NAS inventory & cleanup¶
- Goal: Same as ESXi above, for the Synology NAS.
- Gating: ESXi inventory complete (likely coupled — NAS may be serving as an ESXi datastore, which affects cleanup sequencing).
- Deliverables:
- Inventory: model, firmware, volumes, shares, users, backup targets, relationship to ESXi
- Cleanup: disable unused accounts, rotate admin credentials, enable SNMP + syslog → WIND
- Reconfigure: NTP, DNS, timezone, email alerts
- Monitoring: Prometheus scrape via SNMP, Grafana dashboard
- Status site: Synology card (volume health, SMART, firmware)
- SOC site: in asset coverage map
- MkDocs: `architecture/wdc-nas.md`
- `wdc-hostregistry.csv` entry; `iar.md` entry (classification, owner, retention, criticality)
- IR runbook: `rb-008-nas-compromise.md` (ransomware on shares, credential theft, firmware tampering)
- DR procedure: NAS failure + volume rebuild in `drp.md`
- Red/blue drill: tabletop "Synology admin portal exposed" in `tabletop-playbooks.md`; blue-team detection drill "mass file encryption on shares" in `blue-team-drills.md`
GCP terraform tree has no VCS (supersedes "WDC VPN/route TF state drift")¶
- Goal: Initialize version control on `~/terraform/gpus-infra/terraform/` (currently untracked on rchhetry's Mac, single point of failure) so terraform changes can be reviewed, rolled back, and "what's the source of truth" has an answer.
- Discovered: 2026-05-21, during α ClamAV Commit 2 close-out (A2). `git -C ~/terraform/gpus-infra/terraform status` returned `fatal: not a git repository`, and exhaustive `.git` search across `/Users/rchhetry` confirmed no repo contains these .tf files. The previously-filed "WDC VPN/route Terraform state drift" item assumed a remote repo existed to drift from; the framing was wrong, and the no-VCS problem is the parent.
- Acute risk mitigation (in effect 2026-05-21): `~/Downloads/terraform-snapshots/2026-05-21/` holds md5-verified copies of all 11 .tf files (no tfvars/tfstate/tfplan copied; those need secrets review first). Local-disk redundancy only; not version control.
- Inherited drift (still real, blocked until VCS exists): `terraform plan` against live state shows `google_compute_vpn_tunnel.wdc_tunnel` `local_traffic_selector` forcing replacement; `google_compute_route.onprem_mgmt` + `google_compute_route.onprem_prod` must be replaced as dependents; `google_compute_instance.{cedar,maple,openvas}` in-place updates. Replacing tunnel + routes would tear down the WDC↔GCP site-to-site VPN (SKY/RAIN DNS-DHCP, SUN Prometheus, WIND ELK). Interim mitigation: all clamav-worker Terraform applied `-target`-scoped so the drift is never actioned.
- Deliverables for the dedicated session:
    - Secrets review of `terraform.tfvars` + `terraform.tfstate*` + `tfplan`: enumerate values, identify what must NOT enter VCS.
    - Design `.gitignore`: minimum `terraform.tfvars`, `*.tfstate*`, `tfplan`, `.terraform/`, plus anything from secrets review.
    - `git init` in `~/terraform/gpus-infra/terraform/`; first commit of the 11 .tf files (and any safe-to-commit ancillary files).
    - Decide remote: Cloud Source Repositories (matches existing convention for `gpus-infra-portals`) vs private GitHub.
    - Push initial commit; document the remote in the worker README.
    - With VCS in place: drift triage. Root-cause the VPN tunnel `local_traffic_selector` mismatch, decide reconcile direction (update Terraform to match live, OR plan a maintenance-window apply that recreates tunnel/routes), if recreation: scheduled change window with WDC-connectivity-loss comms.
    - Remove the `-target` workaround note from the clamav-worker README once unscoped apply is safe.
- Gating: independent of WDC inventory work; should run before any further unscoped `terraform apply`. Best-suited to a dedicated session: touching secrets + remote setup + drift reconcile + maintenance window planning needs full attention.
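The "md5-verified copies" step in the acute risk mitigation can be re-checked at any time with a few lines of Python. This is a minimal sketch, not the procedure that was actually run on 2026-05-21; the function names are hypothetical, and it assumes the snapshot directory holds flat copies of the `.tf` files under the same filenames.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_snapshot(source_dir: Path, snapshot_dir: Path, pattern: str = "*.tf") -> dict[str, str]:
    """Compare each file matching `pattern` in source_dir with its snapshot copy.

    Returns {filename: status}, where status is "ok", "missing", or "mismatch".
    The default pattern deliberately excludes tfvars/tfstate/tfplan files.
    """
    report = {}
    for src in sorted(source_dir.glob(pattern)):
        copy = snapshot_dir / src.name
        if not copy.is_file():
            report[src.name] = "missing"
        elif md5_of(src) != md5_of(copy):
            report[src.name] = "mismatch"
        else:
            report[src.name] = "ok"
    return report
```

Called with `Path.home() / "terraform/gpus-infra/terraform"` and `Path.home() / "Downloads/terraform-snapshots/2026-05-21"`, any status other than "ok" means the working tree has changed since the snapshot was taken, which matters because the snapshot is the only redundancy until the repo exists.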
T4 — Queued¶
SOC Ticketing tab¶
- Goal: Replace ad-hoc alert triage with tracked tickets on `soc.greenpeace.us`.
- Gating: T3 complete (stable asset inventory before we wire ticketing to it).
- Sources: Wazuh (level ≥ 10), Prometheus alertmanager, AIDE change alerts, Fail2ban bans, OpenVAS critical/high.
- Dedup: 5 min window on (source, rule_id, host).
- SLA: Critical = 15 min ack / 4 hr resolve · High = 1 hr ack / 24 hr resolve · breach → Slack `#soc-alerts` + email.
- Asset docs required: `iar.md` update; IR runbook `rb-009-soc-ticketing-outage.md`; blue-team drill for "silent alert drop" (ingestion pipeline broken but tickets still showing green).
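The dedup and SLA rules above are small enough to sketch directly. The code below is an illustration only: the `Alert` and `Deduplicator` names are hypothetical, and it assumes the 5-minute window is measured from the alert that opened the ticket (a fixed window) rather than from the most recent duplicate (a sliding one). That choice is not specified above and should be settled at design time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

DEDUP_WINDOW = timedelta(minutes=5)

# (ack deadline, resolve deadline) per severity, taken from the SLA bullet.
SLA = {
    "critical": (timedelta(minutes=15), timedelta(hours=4)),
    "high": (timedelta(hours=1), timedelta(hours=24)),
}


@dataclass(frozen=True)
class Alert:
    source: str
    rule_id: str
    host: str
    severity: str
    seen_at: datetime


class Deduplicator:
    """Suppress alerts repeating a (source, rule_id, host) key within the window."""

    def __init__(self, window: timedelta = DEDUP_WINDOW) -> None:
        self.window = window
        self._last_ticketed: dict[tuple[str, str, str], datetime] = {}

    def should_open_ticket(self, alert: Alert) -> bool:
        key = (alert.source, alert.rule_id, alert.host)
        last = self._last_ticketed.get(key)
        if last is not None and alert.seen_at - last < self.window:
            return False  # duplicate inside the window: fold into the open ticket
        self._last_ticketed[key] = alert.seen_at
        return True


def sla_deadlines(alert: Alert) -> tuple[datetime, datetime]:
    """Return (ack_by, resolve_by) for a ticket opened from this alert."""
    ack, resolve = SLA[alert.severity]
    return alert.seen_at + ack, alert.seen_at + resolve
```

With a fixed window, a rule that fires continuously opens at most one ticket per five minutes per host. A sliding window would keep a noisy rule folded into a single ticket indefinitely, which is quieter but can hide a long-running incident.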
Vendor Access Portal (replace SFTP)¶
- Goal: Zero-trust replacement for the current SFTP vendor drop.
- Gating: SOC Ticketing in place (so vendor-portal anomalies ticket correctly from day one).
- Controls: Vendor IP whitelist via Cloud Armor, signed expiring URLs (max 72h), full audit trail, per-vendor bucket prefixes, ClamAV scan before internal consumption.
- Asset docs required: `iar.md`; IR runbook `rb-010-vendor-portal-abuse.md` (stolen signed URL, vendor account compromise); blue-team drill for "vendor credential used from unexpected geography".
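Two of the controls above, the 72-hour cap on signed URLs and per-vendor prefixes, can be enforced at the point where a URL is issued. The sketch below uses a hand-rolled HMAC purely to show the checks; a real portal would most likely rely on Cloud Storage's own signed-URL mechanism, and every name here (`sign_vendor_url`, the `expires` and `sig` parameters, the path layout) is a hypothetical stand-in.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

MAX_TTL = timedelta(hours=72)  # policy ceiling from the controls list


def sign_vendor_url(path: str, vendor: str, ttl: timedelta, key: bytes, now: datetime) -> str:
    """Return `path` with an expiry timestamp and HMAC-SHA256 signature appended."""
    if ttl <= timedelta(0) or ttl > MAX_TTL:
        raise ValueError("ttl must be positive and at most 72 hours")
    if not path.startswith(f"/{vendor}/"):
        raise ValueError("path is outside this vendor's prefix")
    expires = int((now + ttl).timestamp())
    signature = hmac.new(key, f"{path}\n{expires}".encode(), hashlib.sha256).hexdigest()
    return f"{path}?expires={expires}&sig={signature}"


def verify_vendor_url(url: str, key: bytes, now: datetime) -> bool:
    """Accept the URL only if the signature matches and it has not expired."""
    path, _, query = url.partition("?")
    params = dict(pair.split("=", 1) for pair in query.split("&") if "=" in pair)
    try:
        expires = int(params["expires"])
        signature = params["sig"]
    except (KeyError, ValueError):
        return False
    expected = hmac.new(key, f"{path}\n{expires}".encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(signature.encode(), expected.encode()) and now.timestamp() < expires
```

Rejecting an over-long TTL at issue time, rather than silently clamping it, keeps the audit trail honest: a request for a 7-day link shows up as a refused request instead of a 72-hour link nobody asked for.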
T5 — Backlog¶
Status site automation — Phase A¶
- Goal: Eliminate hardcoded values in `status-site/index.html`.
- Scope: Cloud Run service list via `gcloud run services list` at render time; server count via `servers.py`; cost block labelled "last updated YYYY-MM-DD" (still manual this phase, but honest about staleness).
- Estimated effort: 2h.
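A render-time lookup along the lines of the scope above might look like the following. It is a sketch under assumptions: it presumes `gcloud run services list --format=json` returns a list of objects carrying `metadata.name` and `status.url`, and the function names are hypothetical. Parsing is kept separate from the subprocess call so the parser can be exercised without gcloud installed.

```python
import json
import subprocess
from datetime import date


def parse_services(raw_json: str) -> list[dict[str, str]]:
    """Extract name and URL from `gcloud run services list --format=json` output."""
    services = []
    for item in json.loads(raw_json):
        services.append({
            "name": item.get("metadata", {}).get("name", "unknown"),
            "url": item.get("status", {}).get("url", ""),
        })
    return sorted(services, key=lambda s: s["name"])


def fetch_services(project: str) -> list[dict[str, str]]:
    """Call gcloud at render time; needs an authenticated gcloud on PATH."""
    result = subprocess.run(
        ["gcloud", "run", "services", "list", "--project", project, "--format=json"],
        check=True, capture_output=True, text=True,
    )
    return parse_services(result.stdout)


def cost_label(last_updated: date) -> str:
    """Label for the still-manual cost block, so staleness is visible."""
    return f"last updated {last_updated.isoformat()}"
```

Because `check=True` raises on a non-zero exit, a failed gcloud call stops the render instead of publishing an empty service list that looks like an outage.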
MySQL decommission¶
- Goal: Retire the legacy MySQL instance. Remaining dependencies to be confirmed during inventory.
- Gating: Legacy migration path confirmed (see Forms Phase 1.5 outcome — likely overlap).
- Asset docs required: `iar.md` removal; `drp.md` update to drop MySQL recovery procedure; final backup captured and sealed in Coldline GCS with 7yr retention lock before shutdown.
Status site automation — Phase B¶
- BigQuery billing export, live current-month spend, trend, forecast.
- Requires GPI budget approval for BigQuery storage + query cost (est. <$5/mo).
Status site automation — Phase C¶
- Per-service cost attribution, budget alerts.
- Depends on Phase B (BigQuery export) landing.
Cross-cutting / lessons learned¶
Cloud Run env vars + Cloud Build deploys¶
Cloud Run env vars set out-of-band via `gcloud run services update --update-env-vars` are wiped on every Cloud Build deploy if the `cloudbuild.yaml` uses `--set-env-vars` (destructive replace) instead of `--update-env-vars` (merge). Discovered 2026-04-27 in `forms-backend/cloudbuild.yaml` and fixed in commit 1ca4461. The other three backends (status, soc, security) don't have this bug because their `cloudbuild.yaml` files don't pass any env-var flag.
Lesson: any new backend `cloudbuild.yaml` should either omit env-var flags entirely (preserve) or use `--update-env-vars` (merge). Never use `--set-env-vars` unless the deploy is intentionally the source of truth for ALL env vars.
Cloud SQL access for developer-side diagnostics¶
gpus-forms-db is private-IP only (10.34.0.3). Reaching it requires presence inside the gpus-infra VPC. Tested 2026-04-28:
- Laptop direct: blocked (no VPN to forms-db's service-peering range)
- Cloud Shell + `cloud-sql-python-connector`: blocked (timeout to private IP from Cloud Shell's managed network)
- Cloud Shell + `cloud-sql-proxy --private-ip`: blocked (proxy bound locally fine but the dial to `10.34.0.3:3307` timed out)
Workable paths for developer-side diagnostic queries:
- SSH into MAPLE/OAK/CEDAR (all in `gpus-infra` VPC), run script there. Caveat: that VM's service account principal needs Postgres-side grants (`CLOUD_IAM_USER` + `SELECT`).
- Add VPC peering between Cloud Shell's project network and `gpus-infra` (administrative work, deferred).
Lesson: Phase 2.5 implementation work needs MAPLE-based or peered DB access established as a prerequisite. inspect_actions.py (forms-backend/migrate/, committed 9eecc35) is ready to run from any in-VPC environment.
Memory entries describing "current bugs" age fast¶
Diagnostic memos written during one session describe state at that moment, not current state. The 2026-04-22 count-drift memo described a real bug in auth.py — fixed one day later in commit dd810e0 — but the memo persisted in memory and led to a Phase 2.5(a) design draft that proposed re-fixing the already-fixed bug. Code surfaced the staleness by reading current auth.py before any edits.
Lesson: Read-first discipline includes reading current code, not just current memory. When a memory entry describes a bug, verify the bug still exists by reading the affected module's current state (and grep recent commits for likely fix language).
Design doc accuracy under read-first discipline (2026-04-30)¶
Three design-doc bugs were caught by Code's "STOP and tell me if anything doesn't fit" gate before Commit 3 shipped:
- `_audit` helper signature mismatch with actual `AuditLog` model
- Invented `audit_action` enum value not in schema
- Assumed SQLAlchemy relationship that wasn't declared
All three would have crashed the handler at runtime (or worse, the third would have silently produced submissions with NO field rows).
Lesson: design docs that "mirror Phase 1 patterns" must be drafted from current reads of those patterns, not from memory of earlier reads. The three failures had a common root: the doc described what was remembered of Phase 1's behavior, not what Phase 1 actually does at the line number the design doc claims to mirror. Verifying the source pattern at draft time would have caught all three.
Read-first discipline doesn't end at "read the file once during investigation." It applies again at every implementation moment that references that file's content.
MAPLE access for ad-hoc DB queries (2026-04-30)¶
Established and verified working pattern for ad-hoc Cloud SQL inspection from MAPLE (Phase 2.5 implementation work depends on this for diagnostic queries):
- SSH user: `cloudadmin` (NOT `monitadmin`, which is SUN/WIND only).
- MAPLE has `cloud-sql-proxy` v2 + `psql` pre-installed.
- MAPLE does NOT have `python3.11` or `git`; the Python connector path requires `sudo dnf install`.
- Postgres user `maple-agent@gpus-infra.iam` exists with SELECT grants on `submissions`, `submission_fields`, `audit_log` (no GRANT needed).
Standard one-liner:
```bash
ssh cloudadmin@maple "cloud-sql-proxy --auto-iam-authn --private-ip \
  gpus-infra:us-central1:gpus-forms-db &" && sleep 5 && \
psql "host=127.0.0.1 port=5432 dbname=gpus_forms user=maple-agent@gpus-infra.iam sslmode=disable" \
  -c "<query>"
```
inspect_actions.py at forms-backend/migrate/ can run from MAPLE only if python3.11 + git are installed first. Defer Python install until there's a real need beyond what proxy + psql handles.
Hypothesis falsification via empirical data (2026-05-07)¶
α phase pulldown regression demonstrated a clean hypothesis-disconfirmation chain:
- Symptom: yes/no pulldowns failing "Could not load options"
- Initial hypothesis: missing pulldown rows in DB
- Hypothesis falsified: DB query showed all 7 failing `pulldown_name`s exist in `pulldowns` table with proper `["Yes", "No"]` values
- Pattern in failing data: all 7 names contained `/`
- New hypothesis: Flask string converter doesn't match `/`; URL decoding splits the path
- Fix: change `<name>` to `<path:name>` in route decorator
- Verified: dropdown opens with Yes/No options
Lesson: when a fix idea seems obvious ("add the missing pulldown rows"), check the data first. Shipping the obvious fix without the falsification step would have been a no-op (the rows already exist): the bug would have persisted, with diagnostic time wasted and user trust diminished.
This pattern is reusable: when an outage looks like "data is missing," verify by query before shipping the inverse ("add the data"). The reverse pattern — when data exists but isn't being read — points at a different layer (routing, auth, encoding, serialization).
Completed¶
| Initiative | Completed | Notes |
|---|---|---|
| Forms Portal Phase 2.5(b) — attachment upload wire-up + cleanup | 2026-05-08 | β closed with verification gap acknowledged (T1.7). Commits e17dacb (handler) + 28964c0 (cleanup). Migration 002 documents submission_deleted enum addition. See architecture/forms-phase2.5b-cleanup-closeout.md. |
| Okta Production cutover | 2026-04-23 | Production tenant live; group-based assignment; Preview kept as dev fallback |
| forms.greenpeace.us DNS + TLS | 2026-04-21 | CNAME → ghs.googlehosted.com; managed cert issued |
| Forms Portal Phase 1 (backend) | 2026-04-20 | Cloud SQL PG15, CMEK, IAM auth, AES-256-GCM envelope, RLS 4 roles |
| Meraki P1 inventory | 2026-04 | Org 395909, 5 networks, 32 devices |
| Portal backup cron on SKY | 2026-03 | gpus-portal-backup.sh nightly 02:30 → GCS |
| Okta Preview SSO across 4 portals | 2026-03 | OIDC PKCE, shared gpus-okta-auth.js |
Change log¶
| Version | Date | Author | Change |
|---|---|---|---|
| v1.13 | 2026-05-21 | R. Chhetry / Claude | α ClamAV close-out housekeeping. Sigrefresh build trigger now filters by gpus-forms-clamav-worker/** (A0) — fixes the double-build-on-every-push issue. Backfill drain complete (A1) — 3 fixture attachments (PNG/DOCX/XLSX, all rchhetry β-phase test files from 2026-05-08) re-fired via synthetic Pub/Sub publish, all clean, GCS bytes unchanged, 3 new audit_scanned_clean rows; attachments table now clean=5, pending=0. Design doc alpha-clamav-worker.md → v1.3 (§6 sweep reframed required, §10 DLQ subscription clarified, new §13a 7-point Cloud SQL access spec for new DB-using services). Commit 3 hardening row added to summary. Four new T3 candidates filed: VPN cold-start packet loss, IAP-to-MAPLE 4003 backend-fail, Cloud Scheduler missed-tick on cold-start, and "GCP terraform tree has no VCS" (the no-VCS finding supersedes the previously-filed "WDC VPN/route Terraform state drift" — drift framing was wrong; no remote ever existed to drift from). A2 deferred per the no-VCS surprise; tonight's mitigation is a local snapshot of the 11 .tf files. |
| v1.12 | 2026-05-20 | R. Chhetry / Claude | α ClamAV Commit 2 COMPLETE. Slack alerts + GCS quarantine tag + /dlq-alert + /sweep-stuck + Cloud Scheduler wiring shipped. EICAR test verified full infected pipeline end-to-end to #us-soc-alerts. Worker rev 00010-gr7. Migrations 009 + 010 (audit_action enum + scanner UPDATE/INSERT WITH CHECK widen). Terraform 6 resources applied (DLQ sub + scheduler SA + 2 scheduler jobs). 8 hardening/hygiene items filed for Commit 3. |
| v1.11 | 2026-05-19 | R. Chhetry / Claude | α ClamAV Commit 1 COMPLETE. Migrations 004-008 shipped (enum + IAM user + grants + RLS + sequence USAGE). Pipeline verified end-to-end on T1.7 fixture (07c3c9e0) and real upload (4a0f5bd5). Worker gpus-forms-clamav-worker live (rev 00002-dwx, 2Gi). α.1 follow-ups: stuck-scanning sweep, DLQ subscription + alert, backfill drain of 3 remaining pending rows. |
| v1.10 | 2026-05-14 | R. Chhetry | T1.7 forms portal frontend silent-attachment-drop bug closed. |
| v1.9 | 2026-05-08 | R. Chhetry | β phase: Phase 2.5(b) attachment upload + 2.5(b.cleanup) MIME/size truth consolidation COMPLETE. Three-layer verification PASS on 999cf0cc-…; 4-source-of-truth divergence collapsed to Config.ATTACHMENT_*; production env vars removed; schema migration 002 documents submission_deleted audit_action enum addition. Two orphan-intent submissions cleaned up (d20d2ac8, e93efbc3). New T1.7 filed: frontend silent-attachment-drop bug surfaced during β verification — SPA submit must be gated on attachment validation state. Verification gap on rev 00043-d9q acknowledged in closeout doc. |
| v1.8 | 2026-05-08 | R. Chhetry | New T1.6 workstream filed: forms portal SOC/observability integration. Gap surfaced during Phase 2.5(b) scoping — forms portal has been operationally invisible to SOC since cutover. 8 sub-items spanning logging/metrics/Wazuh/SOC tab/runbook/DRP/drill/ASVS. Sequenced after Phase 2.5 functional work, before T4 SOC Tickets. |
| v1.7 | 2026-05-07 | R. Chhetry | γ phase: design doc correction shipped (2de57c9) — forms-phase2.5a-design.md now matches shipped code. α phase: pulldown regression for yes/no booleans CLOSED (f31ec5f) — Flask route converter fix. T1.5 sub-section now fully closed (all 4 items a/b/c/d). New cross-cutting lesson on hypothesis falsification via empirical data added. |
| v1.6 | 2026-04-30 | R. Chhetry | Phase 2.5(a) COMPLETE — 3 commits (07cb75c, 10bcd0d, 83f85cb), submissions now persist end-to-end (verified API + DB + audit). γ phase shipped same day: font darkness (f5c79e0+e58dabc), NO_DATA_TYPES filter (6d9231a). T1.5 items b/c/d CLOSED; item a (pulldown regression) remains. Cross-cutting lessons added: design doc accuracy, MAPLE access pattern. |
| v1.5 | 2026-04-29 | R. Chhetry | Phase 2.5(a) Commit 1 SHIPPED (07cb75c, auth_v2 User.username via preferred_username). Commits 2+3 paused at design-complete state. New T1.5 sub-section "Forms Phase 2 UX gaps" added covering pulldown regression (yes/no booleans), COST CENTER duplicate, NOTESDIVIDER label leak, and font contrast. |
| v1.4 | 2026-04-28 | R. Chhetry | FieldRenderer pulldown bug FIXED (commit 5512d8e). Phase 2.5 "Phase 2 backend wire-up" promoted as T1 sub-section after discovery that Phase 2 submission stubs persist nothing. Cross-cutting lesson on Cloud SQL access added. |
| v1.3 | 2026-04-27 | R. Chhetry | Forms Phase 2 frontend cutover complete. Phase 2.1 cleanup queue (5 items) added under T1. FieldRenderer pulldown bug added. Cross-cutting Cloud Run env-var lesson documented. |
| v1.1 | 2026-04-24 | R. Chhetry | Re-sequenced: forms (T1) → Meraki (T2) → WDC foundation (T3). WDC items no longer compete with forms momentum. |
| v1.0 | 2026-04-24 | R. Chhetry | Initial draft — consolidated priorities from memory + recent session notes |