VERIFIABLE ENTERPRISE-AI PROOF

The 10-week engagement that paid for the entire platform.

A venture-stage commodity trading and advisory firm operating across multiple regulatory jurisdictions hired a single director and an agent roster. In 10 weeks they shipped 8 distinct projects, including a 40,000+ line production Regulatory Intelligence Platform with hybrid RAG, three-tier model orchestration, and WORM-compliant audit trails. The work product is real. The audit trail is reviewable. This is what the buyer of an enterprise-AI platform actually needs to see.

Book a Technical Walkthrough · See the audit trail →

AT A GLANCE

Single director. 9 specialist agents. 10 weeks.

10-Week Engagement

8 Projects Delivered

40K+ LOC Production Python

9 Domain Agents Deployed

~40 Regulatory Documents

389 Captured Sessions

Human-team baseline

Tech Lead + 2 Senior Backend Engineers + Data Engineer + AI/ML Engineer + Regulatory Analyst + Compliance Officer + QA Engineer + Project Manager.

8-person consulting team · 4–6 months elapsed · ~$300,000–$500,000 all-in cost equivalent.

Agentic delivery

Single director. 9 specialist agents (regulatory expert · portfolio proxy · risk manager · compliance/legal · AI governance · data engineer · technical architect · QA validator · project manager) deployed as code.

250+ director-hours over 10 weeks · API + tooling $2,000–$5,000 for the entire engagement.

THE FIRST FIVE QUESTIONS

What a Head of Engineering will ask in a first call. Pre-answered.

Detailed audit trails, methodology, and source code are reviewable by qualified evaluators under NDA. The headlines are below.

How many of the 40K+ LOC are agent-generated vs human-written?

A detailed human/agent split is reviewable under NDA with the engagement audit trail. Headline: every commit went through HITL gates (Triage Accept → Plan Approval → Verification Accept) before merge. The gating is non-skippable: agent-authored code that did not pass review did not ship. Defect rates on agent-generated code and review time per shift are part of the same audit trail.
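The three-gate sequence can be pictured as a small ordered state check. The sketch below is illustrative only: the gate names come from the text above, while `ChangeRequest`, `approve`, and `can_merge` are hypothetical names and not the platform's actual code.

```python
from dataclasses import dataclass, field
from enum import Enum


class Gate(Enum):
    TRIAGE_ACCEPT = 1
    PLAN_APPROVAL = 2
    VERIFICATION_ACCEPT = 3


# The order named in the text: Triage Accept -> Plan Approval -> Verification Accept.
GATE_ORDER = [Gate.TRIAGE_ACCEPT, Gate.PLAN_APPROVAL, Gate.VERIFICATION_ACCEPT]


@dataclass
class ChangeRequest:
    """Hypothetical record for one agent-authored change."""
    change_id: str
    passed: list = field(default_factory=list)

    def approve(self, gate: Gate, approver: str) -> None:
        # Gates must be approved strictly in order; skipping or repeating raises.
        done = len(self.passed)
        expected = GATE_ORDER[done] if done < len(GATE_ORDER) else None
        if gate is not expected:
            raise ValueError(f"expected {expected}, got {gate}")
        self.passed.append((gate, approver))

    def can_merge(self) -> bool:
        # Mergeable only when every gate has been passed, in order.
        return [g for g, _ in self.passed] == GATE_ORDER
```

Under this sketch, a change that reaches Verification Accept without Plan Approval raises an error instead of merging, which is one way to make a gate non-skippable in code rather than by convention.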

Which agents shipped which projects?

See "Project-by-project breakdown" below. Each project lists the specialist agents that contributed. No advisor-only or no-output agents are listed: the roster reflects builders who shipped code or schemas under HITL approval. Agent definitions are YAML files in the vault, so they can be diffed and rolled back.

What is the human-supervision ratio?

Single director + agent teams across 5+ specialist roles per project. 250+ director-hours over the 10-week engagement, spent on HITL approval, brief authoring, edge-case escalation, and final review. The director did not write the code; the director directed the agents. API + tooling cost fell in the $2,000–$5,000 range for the entire engagement.

How is this auditable / verifiable?

Three layers. (1) Git diff URLs for every project: agent-authored commits carry agent attribution metadata. (2) Vault session telemetry: 389 captured operational sessions including agent prompts, gate decisions, and review notes. (3) Methodology: every project records a human-team baseline at task close and writes it to an efficiency database, so the multiplier numbers are computed, not estimated. All three are available to qualified evaluators under NDA.
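To make "computed, not estimated" concrete, here is a minimal sketch of how a multiplier could be derived from per-task records. The `TaskRecord` fields and the `multiplier` function are assumptions for illustration; the actual efficiency-database schema is not described here, and the sample values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRecord:
    """Hypothetical row in the efficiency database."""
    task_id: str
    baseline_hours: float  # human-team baseline recorded at task close
    actual_hours: float    # hours actually spent under agentic delivery


def multiplier(records):
    """Aggregate multiplier = total baseline hours / total actual hours."""
    baseline = sum(r.baseline_hours for r in records)
    actual = sum(r.actual_hours for r in records)
    if actual <= 0:
        raise ValueError("actual hours must be positive")
    return baseline / actual


# Illustrative values only, not figures from the engagement.
records = [
    TaskRecord("t1", baseline_hours=40.0, actual_hours=4.0),
    TaskRecord("t2", baseline_hours=20.0, actual_hours=6.0),
]
print(multiplier(records))  # 60 / 10 = 6.0
```

The sketch divides totals rather than averaging per-task ratios, so one small task with an extreme ratio cannot dominate the headline number.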

Why is this not just "AI-assisted coding" by another name?

AI-assisted coding accelerates an individual engineer at the keystroke level. PlexiFlexor staffs specialist roles. Across the 10-week engagement, 9 distinct agent identities (regulatory expert, portfolio proxy, risk manager, AI governance, data engineer, technical architect, QA validator, project manager, compliance/legal) operated as the team - each with its own vault context, voice, and approval surface. The director coordinated specialists; the specialists were agents. The closest equivalent is hiring a 6–8 person consulting team, except they live in YAML.

PROJECT-BY-PROJECT

8 projects. Which agents shipped what.

Regulatory Intelligence Platform

4–6 wk · ~40,000 LOC

Production-grade regulatory intelligence platform with hybrid RAG (dense embeddings + sparse retrieval + reranking), three-tier model orchestration, and WORM-compliant audit trails (7-year retention).
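One common way to combine a dense ranking with a sparse one before reranking is reciprocal rank fusion (RRF). The sketch below shows that technique as an assumption; the text does not say which fusion method the platform uses, and the document IDs are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending, then by ID for a deterministic order.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense = ["reg-12", "reg-07", "reg-33"]   # hypothetical dense-embedding order
sparse = ["reg-07", "reg-41", "reg-12"]  # hypothetical sparse (keyword) order
fused = reciprocal_rank_fusion([dense, sparse])
print(fused)  # ['reg-07', 'reg-12', 'reg-41', 'reg-33']
```

In a pipeline like the one described, the top of the fused list would then go to a reranking model. That stage is omitted here.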

Output: Python · ~10 microservices · 6 data-source connectors · state-machine orchestration · HITL gates on material-exposure decisions.
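WORM retention is normally enforced by the storage layer. A complementary application-level technique is a hash-chained log, where each entry commits to the one before it so that edits are detectable. The sketch below illustrates that idea under assumed names (`append_entry`, `verify`); it is not the platform's implementation.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, payload):
    # Serialize with sorted keys so the same payload always hashes the same way.
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log, payload):
    """Append a record whose hash covers the payload and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload,
                "hash": _digest(prev_hash, payload)})


def verify(log):
    """Return True if every entry links to its predecessor and hashes correctly."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects edits, reordering, and removal of entries from the middle. It does not detect truncation of the tail unless the latest hash is also anchored somewhere outside the log.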

Specialist agents: regulatory-expert · compliance-legal · data-engineer · qa-validator

Domain Knowledge Base

2–3 wk

Curated regulatory document corpus across multiple jurisdictions with overlap reasoning, jurisdiction conflicts surfaced, and citation provenance preserved.

Output: ~40 curated regulatory documents · 5 market verticals · jurisdiction-conflict ontology.

Specialist agents: regulatory-expert · compliance-legal · archivarius

Specialist Agent Roster

2 wk

Curated set of specialist agents tuned for trading & advisory operations: regulatory expert, portfolio proxy, risk manager, compliance/legal, AI governance, data engineer, technical architect, QA validator, project manager.

Output: 9 domain-specialized agents · YAML-defined · vault-versioned · simulation-harness validated before production.

Specialist agents: ai-ml-architect · project-manager · qa-validator

Network Intelligence Platform

1–2 wk

Counterparty graph + network intelligence over the firm's active engagements: who knows whom, what conversations are warm, where adjacency creates opportunity.

Output: Graph schema · ingestion from CRM + comms · weekly cockpit with deal-velocity flags.

Specialist agents: data-engineer · analyst

Integrated Conversational Platform

1–2 wk

Authenticated conversational interface for portfolio queries, risk introspection, and regulatory lookups, backed by the firm's own data substrate rather than a third-party LLM given raw fund data.

Output: Web UI + API · session-scoped auth · all queries audit-logged · grounded retrieval.

Specialist agents: ai-ml-architect · data-engineer

P6–P8

Horizon-1 / Horizon-2 platform builds + 19+ initiative specifications

~3 wk

Horizon-1 platform components shipped to production. Horizon-2 designs documented with PRDs, schemas, and cost models. Plus 19+ documented initiative specifications across compliance monitoring, AML-KYC automation, edge analysis, capacity estimation, and conversational investment intelligence.

Output: 19+ initiative specs · production cutover plan · 100 events / 82 task completions / 9 repos active in first month.

Specialist agents: project-manager · business-analyst · solutions-architect

VERIFICATION TRAIL

Don't take the numbers on faith. Verify them.

What separates this case study from a competitor brochure: the audit trail exists, and qualified evaluators can read it. We will walk you through it commit by commit on the technical call.

Git diff URLs

Every agent-authored commit carries agent attribution and HITL gate timestamps. Buyer-grade verification: pick a commit, read the diff, and see which agent authored it, which director approved it, and which gate recorded the approval.
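Git supports "Key: value" trailers at the end of a commit message, which is one plausible carrier for this kind of attribution. The sketch below parses such trailers. The trailer names (`Agent`, `Approved-By`, `Gate`) and the sample commit are hypothetical, since the platform's actual metadata format is not specified here.

```python
def parse_trailers(message):
    """Extract 'Key: value' trailers from the last paragraph of a commit message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return {}  # subject line only: no trailer block
    trailers = {}
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(": ")
        if not sep or " " in key:
            return {}  # last paragraph is prose, not a trailer block
        trailers[key] = value.strip()
    return trailers


# Trailer names and commit text are hypothetical, for illustration only.
message = """Add sanctions-list connector

Normalizes entity names before matching.

Agent: data-engineer
Approved-By: director
Gate: verification-accept
"""
info = parse_trailers(message)
print(info["Agent"], info["Gate"])  # data-engineer verification-accept
```

An evaluator could run a check like this across a repository's history to confirm that every commit names an agent, an approver, and a gate.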

389 captured operational sessions

Session-level telemetry including prompts, gate decisions, review notes, and rejection reasons. The audit trail is complete; redacted exports available under NDA.

Self-rewriting dogfood loop (this site)

plexiflexor.ai itself was rewritten by the platform it sells. On April 27, 2026, an internal integrity-auditor advisor flagged 24 issues across the portfolio. By April 29, 12 had shipped to UAT under HITL approval. The diffs are in the repo. Same loop. Same gates. Same agent roster.

~40 curated regulatory documents · 5 market verticals

Domain Knowledge Base content with citation provenance, jurisdiction-conflict resolution, and overlap reasoning. Available for qualified evaluator inspection under NDA.

Book a 30-min walkthrough.

We will pick three commits at random from the engagement, walk you through the diffs, the agent that authored each, the gate that approved each, and the cost recorded for each. If anything is hand-waved, the call ends. That's the offer.

Request the Walkthrough