Information flows
ADMINISTRATOR

::: danger Restricted
Internal pipeline documentation.
:::
End-to-end data flows for the major operations. Every flow is deterministic-first, AI-advisory-second.
Flow 1 — Vessel onboarding (natural sequence)
The authoritative onboarding sequence per project_tracked_matters_spec:
Add vessel
→ Particulars upload (Ship's Particulars PDF)
→ general_particulars extractor
→ writes vessel_particulars row
→ writes vessel_particulars_provenance entries (extracted)
→ IMO + DWT (Summer Salt Water) propagate to vessels table on approval
→ GA Plan upload
→ general_arrangement extractor
→ writes vessel_equipment_inventory (nested JSON in vessel_particulars.json_blob)
→ writes vessel_tanks
→ writes vessel_particulars_provenance entries
→ Sister-vessel detection (matches on dimensions + flag + class)
→ CL Skeleton wizard (12-step Windows-installer-style)
→ blocked unless 6 mandatory docs present (Particulars · Capacity Plan · GA Plan · Form E/SER · GMDSS · ECDIS)
→ emits per-group XLSX (Groups 2–8)
→ Helper-doc agent (B1 milestone — agent-only, not in UI)
→ Canonical import (POST /api/vessels/:id/components/import, 207 Multi-Status)
→ Dashboard live
Delete-ship preserves RAG
Deleting a vessel removes the vessels row and its assignments but preserves rag_chunks and cl_knowledge_base entries. Only the administrator can delete, and the operation is audit-logged.
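The canonical-import step above answers with 207 Multi-Status so that one bad row does not fail the whole batch. A minimal sketch of how such a per-row receipt could be summarized (the result and receipt shapes here are hypothetical, not the actual API contract):

```typescript
// Each imported component reports its own outcome; the envelope is always
// 207 Multi-Status when outcomes are mixed.
interface ImportResult { index: number; status: 201 | 409 | 422; detail?: string }

function summarize(results: ImportResult[]): { httpStatus: number; created: number; failed: number } {
  const created = results.filter(r => r.status === 201).length;
  return { httpStatus: 207, created, failed: results.length - created };
}
```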
Flow 2 — Source-document upload (wizard)
Operator drops file in Class & Flag Documents wizard
→ POST /api/vessels/:vesselId/source-documents
→ Anti-misuse hardening:
→ Workers AI filename classifier (Llama-4-Scout)
→ magic-byte sniff (catches .jpeg→.pdf spoof and CERT→GA rename)
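A minimal sketch of the magic-byte sniff, assuming PDF/JPEG/PNG are the relevant types (the helper names and the set of signatures are illustrative assumptions):

```typescript
// Verify the file's leading bytes match what its claimed extension promises,
// catching e.g. a JPEG renamed to .pdf before it reaches the extractors.
const MAGIC: Record<string, (b: Uint8Array) => boolean> = {
  pdf: (b) => b[0] === 0x25 && b[1] === 0x50 && b[2] === 0x44 && b[3] === 0x46, // "%PDF"
  jpeg: (b) => b[0] === 0xff && b[1] === 0xd8 && b[2] === 0xff,
  png: (b) => b[0] === 0x89 && b[1] === 0x50 && b[2] === 0x4e && b[3] === 0x47,
};

function sniffMatchesExtension(bytes: Uint8Array, filename: string): boolean {
  const ext = filename.toLowerCase().split(".").pop() ?? "";
  const key = ext === "jpg" ? "jpeg" : ext;
  const check = MAGIC[key];
  return check ? check(bytes) : false; // unknown extensions are rejected
}
```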
→ If prior evidence exists for the same (vessel_id, wizard_slot):
→ archiveOnSupersede(prior) writes source_documents_archive row
with superseded_by_source_document_id pointing at the new doc
→ UPDATE source_documents (idempotent on (vessel_id, wizard_slot))
→ On UPDATE failure: rollback the archive insert (no orphan archive)
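The archive-then-update bookkeeping above can be sketched as follows (the in-memory archive and the `applyUpdate` callback stand in for the real D1 writes; names are hypothetical):

```typescript
// Archive the prior evidence first; if the UPDATE fails, roll the archive
// insert back so no orphan archive row survives.
interface ArchiveRow { docId: number; supersededBySourceDocumentId: number }

function supersede(
  archive: ArchiveRow[],
  priorDocId: number,
  newDocId: number,
  applyUpdate: () => boolean // the idempotent UPDATE on (vessel_id, wizard_slot)
): "replaced" {
  archive.push({ docId: priorDocId, supersededBySourceDocumentId: newDocId });
  if (!applyUpdate()) {
    archive.pop(); // rollback: no orphan archive row
    throw new Error("UPDATE failed; archive insert rolled back");
  }
  return "replaced";
}
```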
→ Receipt response includes:
action ∈ {created, replaced}
wizard_slot
g1_ucs_code
superseded_source_document_id (when applicable)
→ ai-status messenger thread on supersede event (Europe/Sofia 23:00–08:00 quiet-hours guard)
Flow 3 — Run 1 / Run 2 separation
Run 1 — Group 1 cert population. Cert filename prefix → G1 row. Cert-level metadata only (validity date, surveyor, certificate number). Equipment details extracted from inside the cert are routed to vessel_equipment_inventory, NOT the G1 row.
Run 2 — Groups 2–8 component population. CL Skeleton Builder reads:
vessel_particulars + vessel_equipment_inventory + vessel_tanks + Master List
→ emits per-group XLSX
Tech Detail injection priority (highest first):
- Vessel CL "Other Detail" column
- equipment_inventory keyword longest-match
- tank_inventory keyword longest-match
- Empty
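The priority order above is a first-non-empty-wins fallthrough; a minimal sketch (the source shape and field names are hypothetical):

```typescript
// First non-empty source wins, in the documented order; whitespace-only
// values fall through to the next source.
interface TechDetailSources {
  clOtherDetail?: string;         // vessel CL "Other Detail" column
  equipmentKeywordMatch?: string; // longest keyword match from vessel_equipment_inventory
  tankKeywordMatch?: string;      // longest keyword match from vessel_tanks
}

function resolveTechDetail(s: TechDetailSources): string {
  return s.clOtherDetail?.trim()
    || s.equipmentKeywordMatch?.trim()
    || s.tankKeywordMatch?.trim()
    || ""; // Empty — lowest priority
}
```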
Flow 4 — UCS Foundation cascade (A4)
The cascade is what keeps every dependent table consistent when a new UCS Master version is activated.
Affected stores when a Master code changes:
ucs_master_list → renamed | moved | split | merged | deleted | new
↓
vessel_components.code
source_documents.g1_ucs_code
rag_chunks.code (in RAG_DB binding)
cl_builds.code references
cl_knowledge_base.target_code
vessel_particulars_provenance.field_path (when path includes a code)
pms_jobs.code
code_history (audit trail — backwards-compatible reads)
Two-step UX (v2.31.0.34/.35):
Step 1 — POST /api/admin/ucs-foundation/cascade-preview
body: { fromVersionId?, toVersionId, dryRun: true (default) }
→ read-only diff classifies each code as
kept | renamed | moved | split | merged | deleted | new
→ returns per-table impact counts
→ returns sample rows that would be touched
Step 2 — POST /api/admin/ucs-foundation/cascade-apply
body: { toVersionId, dryRun: false, confirmApply: true }
→ guards:
toVersionId must be is_active=1
dryRun:false REQUIRES confirmApply:true
→ atomic D1 batch write
→ quarantine guard: ambiguous splits routed to operator quarantine pick
→ audit_log entries for every row touched (migration 0104)
Backwards-compatible reads
The code_history table preserves old → new code mappings. Reads against old codes return the new row, with a code_history_via=N annotation. This means RAG chunks for old codes still resolve until the next rag-cascade pass (F-5 milestone) rewrites them.
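The old → new resolution above can be sketched as a chain walk over the mapping table, where the hop count becomes the code_history_via=N annotation (row shape and helper name are hypothetical):

```typescript
// Follow old → new mappings until a code with no further mapping is reached.
// via counts the hops; 0 means the code was never remapped.
interface HistoryRow { oldCode: string; newCode: string }

function resolveCode(code: string, history: HistoryRow[]): { code: string; via: number } {
  let current = code;
  let via = 0;
  for (let hop = history.find(h => h.oldCode === current); hop; hop = history.find(h => h.oldCode === current)) {
    current = hop.newCode;
    via++;
    if (via > 32) throw new Error("code_history cycle"); // defensive guard
  }
  return { code: current, via };
}
```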
Flow 5 — KB orphan heal
Trigger: POST /api/admin/kb-orphan-heal { dryRun? }
→ SELECT cl_knowledge_base WHERE quarantined=1
→ For each orphan row:
→ SELECT candidates FROM ucs_master_list
WHERE component_name LIKE '%<token>%'
AND version_id IN (SELECT id FROM ucs_foundation_versions WHERE is_active=1)
→ Score each candidate with Jaccard(orphan.target_code_text, candidate.component_name)
→ If best ≥ 0.72:
UPDATE cl_knowledge_base SET target_code = <new>, quarantined = 0
push to sampleHeals[] (cap 10)
Else:
push to sampleUnhealed[] (cap 10)
→ On D1 error:
push to errorSamples[] (cap 10) with raw e?.message
→ Returns { scanned, healed, unhealed, errors, sampleHeals, sampleUnhealed, errorSamples }
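The scoring step above uses a 0.72 Jaccard threshold; a minimal sketch, assuming whitespace tokenization (the exact tokenizer is an assumption):

```typescript
// Jaccard similarity over lowercased whitespace tokens:
// |A ∩ B| / |A ∪ B|.
function jaccardTokens(a: string, b: string): number {
  const A = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const B = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  if (A.size === 0 && B.size === 0) return 0;
  let inter = 0;
  for (const t of A) if (B.has(t)) inter++;
  return inter / (A.size + B.size - inter);
}

const HEAL_THRESHOLD = 0.72; // from the flow above

function shouldHeal(orphanText: string, candidateName: string): boolean {
  return jaccardTokens(orphanText, candidateName) >= HEAL_THRESHOLD;
}
```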
→ Posts ai-status messenger thread on completion (no dedupe — admin-triggered)
Flow 6 — RAG eval cron
02:00 UTC daily
→ Run a fixed eval set against retrieveHybrid()
→ Compute recall@5 vs rolling 7-day baseline
→ If regression ≥ 5pp:
→ SELECT FROM rag_eval_sentinel WHERE date = today (per-UTC-day dedupe)
→ If no sentinel:
INSERT sentinel row
enqueue Postmark email
createSystemNotification('ai-status', ...)
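The recall@5 computation and the 5-percentage-point threshold can be sketched as follows (the eval-case shape is a hypothetical stand-in for the fixed eval set):

```typescript
// A query counts as a hit when any relevant chunk id appears in the top 5
// retrieved ids; recall@5 is the hit fraction over the eval set.
interface EvalCase { relevantIds: string[]; retrievedIds: string[] }

function recallAt5(cases: EvalCase[]): number {
  const hits = cases.filter(c =>
    c.retrievedIds.slice(0, 5).some(id => c.relevantIds.includes(id))
  ).length;
  return cases.length ? hits / cases.length : 0;
}

// Regression = drop of >= 5 percentage points vs the rolling 7-day baseline.
function isRegression(today: number, baseline: number): boolean {
  return baseline - today >= 0.05;
}
```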
→ If SELECT errors: notify anyway, log [cron:rag-eval]
Flow 7 — Backup & restore
03:00 UTC daily
→ wrangler d1 export → R2 bucket pms (binding BACKUPS) at backups/pms-db-YYYY-MM-DD.sql
→ Vectorize export → R2 at backups/vec-YYYY-MM-DD.ndjson
→ 30-day retention sweep at 04:00 UTC
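The snapshot key naming and the 30-day retention cut-off can be sketched as (key prefix from the flow above; helper names hypothetical):

```typescript
// backups/pms-db-YYYY-MM-DD.sql, dated by the UTC day of the export.
function backupKey(d: Date): string {
  return `backups/pms-db-${d.toISOString().slice(0, 10)}.sql`;
}

// A key is expired when its embedded date is older than the retention window.
function isExpired(key: string, now: Date, retentionDays = 30): boolean {
  const m = key.match(/(\d{4}-\d{2}-\d{2})/);
  if (!m) return false; // unparsable keys are left alone
  const ageMs = now.getTime() - new Date(`${m[1]}T00:00:00Z`).getTime();
  return ageMs > retentionDays * 86_400_000;
}
```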
Restore:
→ POST /api/admin/backups/restore { snapshotKey }
→ reads R2 object → applies via wrangler d1 execute --remote
→ audit_log row: restore_backup
Flow 8 — In-app messenger + email mirror
Source event (e.g. supersede, heal complete, RAG eval regression, draft submitted)
→ createSystemNotification(env, db, { subject, body, recipients, domain })
→ INSERT into message_threads + messages + message_recipients
→ Quiet hours guard: 23:00–08:00 Europe/Sofia
→ suppressed message logged with quiet_hours_skipped: true
→ morning digest at 08:00 sweeps the queue and dispatches
→ enqueueEmailMirror(threadId)
→ resolveRecipientEmails(recipients)
→ Postmark Send Email API (50-recipient batching, sandbox-aware 412 ACK)
→ From: noreply@pmsplanner.com (verified SPF + DKIM + Return-Path)
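The quiet-hours guard above (23:00–08:00 Europe/Sofia) can be sketched with Intl, which handles Sofia's EET/EEST offset for us (the function name is illustrative):

```typescript
// True when the wall-clock hour in Europe/Sofia falls in the wrapping
// 23:00–08:00 window; suppressed messages queue for the 08:00 digest.
function inQuietHours(at: Date): boolean {
  const hour = Number(
    new Intl.DateTimeFormat("en-GB", {
      timeZone: "Europe/Sofia",
      hour: "2-digit",
      hour12: false,
    }).format(at)
  );
  return hour >= 23 || hour < 8;
}
```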
Domains in use:
notification — generic
upload-request — supervisor asks superintendent for a doc
audit — compliance events
general — open thread
ai-status — automatic AI-pipeline events
Flow 9 — Draft → approve
Superintendent: creates row → status = 'draft'
↓
submits → status = 'pending_approval'
↓
Supervisor or Administrator:
approves → status = 'approved' → live in PMS
rejects → status = 'rejected' → returns to superintendent with reason
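The diagram above is a small role-guarded state machine; a minimal sketch (the transition-table shape is an illustration, not the stored schema):

```typescript
// Each action is legal only from one status and for certain roles.
type Status = "draft" | "pending_approval" | "approved" | "rejected";
type Role = "superintendent" | "supervisor" | "administrator";

const TRANSITIONS: Record<string, { from: Status; to: Status; roles: Role[] }> = {
  submit: { from: "draft", to: "pending_approval", roles: ["superintendent"] },
  approve: { from: "pending_approval", to: "approved", roles: ["supervisor", "administrator"] },
  reject: { from: "pending_approval", to: "rejected", roles: ["supervisor", "administrator"] },
};

function transition(status: Status, action: string, role: Role): Status {
  const t = TRANSITIONS[action];
  if (!t) throw new Error(`unknown action ${action}`);
  if (t.from !== status) throw new Error(`cannot ${action} from ${status}`);
  if (!t.roles.includes(role)) throw new Error(`${role} cannot ${action}`);
  return t.to;
}
```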
Batch approve (supervisor): selects N pending rows of one entity type
→ atomic D1 batch UPDATE
→ audit_log entries per row
→ messenger thread to the originating superintendent
Cross-cutting hooks
- Cron (*/10 * * * *): classifier sweep + 03:00 UTC daily backup + 04:00 UTC retention sweep + hourly session GC + 02:00 UTC RAG eval + watchdogs
- Login probe (post-deploy): /api/auth/login with admin / Spb812 — exit 3 on failure
- /api/health (50-byte JSON): public uptime probe; carries version, classifierVersion, runnerVersion
- Migration sentinel: every migration writes a sentinel row in kv_state so re-runs skip; db:migrate is idempotent