— The Architect
Ultimate
AI Command Room
ProxAI runs a fleet of AI agents that create Swarm Intelligence.
Leadership gets full control over the company's AI integration and takes action with the help of this fleet.

Real Leadership Power
Engineers are generating 10x more output because of AI.
In reality, only 1.4x more actual work gets done.
Close the gap with ProxAI agent fleet!
A heavy review and integration day.
Today's Build Focus
- Multi-Agent Plan Reviews: Deployed parallel consensus panels — up to 8 specialized agents — to synthesize feedback on auth guards, backend schemas, and code impacts. Systematically resolved blockers.
- Claude Desktop Integration: Investigated telemetry logs for the new desktop app. No new parser needed, but a Prisma schema migration was mandatory to avoid Postgres constraint violations.
- Reparse Scripts & Integration Tests: Finalized the version drift workflow — built SQL and BullMQ re-trigger scripts, fixed all 11 integration test failures, and hit 100% strict coverage on
reparse-chats.ts.
AI Usage Summary
Drew used the AI as a rigorous architectural sounding board — parallel reviewers to bulletproof plans, alongside uncovering database edge cases and driving test suites to green.
The Infra team closed 7 of 9 sprint items — one carryover (caching layer migration) and one descoped (legacy log aggregation).
Sprint Outcomes
- Pager Volume: Dropped 22% week-over-week, mostly from new auto-mute rules on transient deploy errors and clearer off-hours routing.
- Release Pipeline: Two release candidates queued for Monday staging. Friday-cut policy holding — no hotfixes this week.
- Observability: OpenTelemetry traces live on billing; p95 latency visibility improved across 4 dependencies.
Capacity Notes
Maya was out Tuesday–Thursday for a conference. Brennan covered her on-call; the team kept its commitment by reprioritizing one stretch goal.
Risks for Next Week
- Caching layer migration may slip again without Redis vendor support.
- Open question: can the sprint absorb the optional security training before the audit deadline?
AI Usage Summary
The team leaned on AI for incident postmortems and runbook drafts. AI-assisted authorship hit 47% of new internal docs.
Suggested Claude Code skills:
superpowers— full-cycle dev workflow (brainstorm → plan → TDD → review). Top community install at 94k+ stars.frontend-design— design system + aesthetic philosophy. Fits Sam's recent UI commits.testing-practices— what to test and how. Sam's last 3 PRs landed without unit tests on new modules.
One-evening install + walkthrough closes the surfaced gaps.
Scanned every user's ~/.claude/CLAUDE.md and auto-memory for patterns showing up across the org.
Promoted to shared .claude/rules/:
- "Use
bun, notnpm" — surfaced in 7 users' personal memories. Already convention; codified to stop CI surprises. - "Run
bun run validatebefore opening a PR" — 4 users wrote this to auto-memory after the same CI failure. - "Service files start with
'use server'" — 3 users' auto-memory captured this from blame. Promoted to a rule.
Archived: 2 stale entries on the old npm install workflow (deprecated last quarter).
Net: org memory is 4% smaller and 12% more accurate against the eval suite.
A heavy refactoring day focused on deep architectural cleanup.
Today's Build Focus
- AI Processor Cleanup: Removed legacy AI processor configs, navigated cross-repo permission guards to update the Prisma schema, and bumped package versions to 7.11.4.
- Version Drift Workflow: Investigated deferred clusters in the live parser registry, planned gateway fixes, and audited data migrations.
- Load Tests: Dispatched parallel multi-agent panels to audit test scenarios — decided to keep the suite intact while cleaning up dead docstrings.
- Additional works: Drafted a Gateway team handoff doc, instrumented user-action tracking in the web dashboard, and planned event-log scaling for analytics.
AI Usage Summary
Aaron leaned on the AI's multi-agent capabilities for safe code reviews and impact analysis. Mild friction when automated schema edits got blocked by cross-repo security guards.
Spike Profile
Friday's tokens ran 3.4× the weekly baseline, isolated to the data-extraction service. Spike began 14:22 UTC, peaked 16:50 UTC.
Root Cause
A missing batching guard caused per-row prompts instead of batched-50. Regression from commit a7f2c89 (no batching test in CI).
Mitigation
- Hotfix at 17:10 UTC; spend normal by Saturday.
- Added a
batchSize >= 25invariant to the CI gate. - Postmortem: Maya, due Wednesday.
Signals This Week
Four contributors logging late-night AI activity above baseline for the third straight week:
- Avery — peak 11pm–1am
- Diego — weekend prompts up 4.2×
- Priya — workday past 9pm three nights running
- Tom — commits clustered after 10pm
Actions
- Stand-down nudges sent via Slack DM.
- Managers notified to schedule check-ins.
- Pattern correlates with the platform migration — likely transient, but worth monitoring.
Six features shipped in 72 hours, all with AI-assisted authorship over 60%.
Customer-Facing
- Inline citations in chat responses — every reply links back to source documents. Initial open rate: 38% of users click at least one citation.
- Bulk export for usage reports — the most-requested SMB feature this quarter. 240+ exports run in the first 48 hours.
Internal Tooling
proxai-ops billing audit— CLI for finance to audit per-org spend without engineering.- Sentry routing rules v2 — 60% of low-priority alerts moved to the noisy-bin queue.
- Database snapshot rotation — automated weekly retention with
--dry-run. /who-ownsSlack command — answers "who owns this service?" against the on-call roster.
Quality
Rollback rate held at zero across all six. Two had Day-1 telemetry tweaks; none required reverting.
AI Usage Summary
AI carried boilerplate, migration scripts, and test scaffolding. Business logic and edge cases stayed human-authored — consistent with "AI for substrate, humans for judgment".
Suggested Claude Code skills:
adversarial-review— sub-agent critique with "fresh eyes" until feedback becomes trivial. Catches blind spots Mia's self-review keeps missing.mcp-builder— scaffolds MCP servers from a spec. Mia's been hand-rolling tool-call schemas for two weeks.babysit-pr— monitors PRs through CI, retries flaky tests, resolves conflicts, enables auto-merge.
Matches Mia's review-heavy commit pattern this sprint.
Infra-team memory — patterns scoped to services/ and app/api/ only.
Promoted within team:
- "Always use
nestConnectionFetchV2, notnestConnectionFetch" — 4 of 6 Infra members had this in auto-memory after legacy-fetch bugs. Scoped toservices/. - "Migration files use the
YYYYMMDD_<name>.sqlprefix" — surfaced from 3 Infra postmortems.
Loads only for sessions touching infra paths. Saves ~60 tokens on every non-infra session.
A post-launch bug-fix wave. Tyler focused on smoothing the CLI experience and tidying source integrations.
Today's Build Focus
- CLI Versioning & Releases: Fixed an upgrade parsing bug where same-day multi-releases (
v2026.6.1-1vs-2) compared incorrectly. Verified curl reinstall paths preserve data end-to-end. - CLI UX & Daemon State: Standardized runtime status messages, fixed a false-positive daemon-running report, and patched the timestamp logic in
inferDaemonAlive. Toned down alarming red text indoctor. - Gateway Source Cleanup: Removed the deprecated
gemini-cliintegration and documentedclaude-desktopas its replacement across the internal knowledge base. - Additional works: Audited GitHub Actions to confirm strict Node 24 enforcement.
AI Usage Summary
The AI nailed the counterintuitive version-string splits and the daemon over-reporting root cause. Mild redirect needed when it initially searched the wrong files for CLI status strings.
Where new hires stall:
- Auth setup: 2.3× longer than the rest of onboarding combined. Stalls happen at the SSO provisioning page (manual ticket workflow).
- Local env setup: 1.4× target time, mostly Docker image pulls.
Recommendation: prioritize the auth-page docs revision. Local env can wait for the Docker registry mirror.
Traffic Distribution
The top 10 prompts account for 71% of total agent traffic; the top 3 alone for 38%.
Performance Degradation
Three prompts show degraded p95 latency since last Thursday's model swap (Sonnet 4.5 → 4.6):
summarize-meeting-transcript— +180ms (920ms → 1.10s)extract-action-items— +240msgenerate-onboarding-checklist— +90ms
The other 7 held steady or improved. Net cost-per-call is down 12% despite the regression.
Optimization Candidates
summarize-meeting-transcript— prime caching candidate; 80% of inputs are the same five recurring meetings.extract-action-items— redundant parent-agent context. Trimming to user-message + last-2-bot-messages cuts input tokens ~40%.generate-onboarding-checklist— switch to Haiku; task is structured, no Sonnet-tier reasoning needed.
Pending Decisions
Maya wants confirmation before swapping the onboarding-checklist model — eval accuracy is 96% vs 94% on Haiku. Acceptable tradeoff but needs sign-off.
This Week's Pushback Rate
Agents pushed back on 18% of risky operations, up from 11% — new safety guardrails doing their job.
What got pushed back:
- 47% — schema migrations without an explicit dry-run flag
- 31% — credential rotations outside the maintenance window
- 14% — file deletions in
/prod/paths - 8% — outbound webhooks to unverified domains
Override Rate
8% were overridden — mostly planned migrations where operators had context the agent didn't.
A maintenance day focused on stabilizing test pipelines and internal CLI tooling.
Today's Build Focus
- Coverage Workflow: Overhauled
test:cov— integrated Bun's native--retryfor flaky tests, switched to strict serial execution to prevent coverage corruption, and audited incidental coverage gaps for isolation accuracy. - ProxAI Ops CLI: Refactored the
po gatewaytest architecture to long-running on-demand Docker containers for cross-platform testing. Squashed a Windows path-separator bug hiding CI failures. Cleaner terminal output with color streams and decimal percentages. - Gateway Unit Tests: Tracked down mock scoping and teardown issues in isolated suites, investigated GitHub Actions failures, and bumped the runtime to Bun 1.3.14.
AI Usage Summary
Rachel relied on AI to diagnose obscure cross-platform Windows CI bugs and clarify Bun's parallelization behaviors — confident lockdown of the test architecture.
Two agent responses surfaced customer PII fragments. Both caught by the redaction layer before egress — no data leaked.
Policy review Wednesday on whether to escalate the upstream input filter.
Keys Rotated This Week
- Three keys surfaced in stale repos during the weekly scan.
- Two were scoped to expired projects (no impact).
- The third was a live key in a fork of
proxai-experiments— rotated within 4 minutes.
Incident review opened on the third. Postmortem due Friday.
Cross-user scan picked up two more conventions the codebase has been silently adopting.
Promoted:
- "Prefer named exports over default exports" — 11 users had this in personal memory; the codebase has been 92% named since Q2. Now codified.
- "Use
bun:testforai/mapper/, vitest everywhere else" — 6 users hit this gotcha; promoted to a path-scoped rule.
Removed:
- Old
pnpm installnote — 4 months stale; nobody had it locally anymore.
Net: org CLAUDE.md 9 lines lighter, one fewer onboarding gotcha.
Suggested Claude Code skills:
firecrawl— turn any URL into clean LLM-ready data. Replaces Noah's hand-rolled scraping.pdf-skills— read/write PDFs natively. Fits Noah's document-ingestion pipeline work.cherry-pick-prod— isolated worktree → cherry-pick → conflict resolution → PR from template. Noah leads the on-call rotation.
A few hours of setup unlocks all three.
Trend
Daily AI invocations down 18% from the prior 5-day window. Concentrated in the Analytics team (–47%); other teams held steady or grew slightly.
Possible Causes
- Analytics shipped a major dashboard release Monday — likely deprioritizing AI experimentation.
- Three power users on PTO this week.
- Finance's "AI usage budget" memo went out Wednesday.
Next Step
User survey Monday to confirm root cause and rule out tool-friction regression.
Headline
Six agent loops burned 10.4M tokens this week without producing accepted output. No customer impact, but the spend is fully attributable to retry storms.
Top Offenders
scrape-and-rank-leads— 4.1M tokens. Tuesday's vendor outage caused 200+ retries per item before the circuit-breaker kicked in.generate-sales-followup— 2.7M tokens. Same outage, same retry pattern.extract-pricing-from-pdf— 1.8M tokens. A malformed PDF caused 700+ retries with no successful parse.- 3 smaller offenders — 1.8M tokens collectively, all retry-loop-related.
Mitigation Proposed
- Circuit-breaker config — abort retries after 10 failures from the same upstream within 60s. Patch ready.
- Per-loop token budget — abort runs exceeding 500K tokens without committed output. Patch in design.
- Dead-letter queue — divert post-3-failure parse attempts to manual triage.
Cost Impact
10.4M tokens ≈ $31 direct cost, but the same compute was unavailable to other workloads for hours.
Owner
Brennan owns the circuit-breaker. Maya owns the token budget. Both due Friday.
A maintenance day. Elliot focused on frontend refinements and UI alignments for project tracking.
Today's Build Focus
- Project Tracker & Chat Drawer: Fleshed out the "Conversations" popup metadata — duration, tokens, and platform details on two clean lines. Resolved a UI flickering issue with
keepPreviousDataduring record transitions. - Credential Link & Dialogs: Iterated on a "Copy Credentials" control, pivoting from an underlined link to a clean icon button. Scoped
DialogContentclasses so popups align top-of-window without affecting other pages. - Agent Call Records & Backend: Used the
proxai-opsCLI to query ACR counts by breadcrumb and added a Sentry regression test assertion for loop breaks in the AI Core server. - Additional works: Investigated hardcoded agent text in the landing page's Command Room section.
AI Usage Summary
Elliot used AI as a pair programmer for React component iterations, consistently for UI state solutions and scoped Tailwind styling.
Growth-team memory — patterns scoped to app/(marketing)/ and content/.
Promoted within team:
- "Marketing pages use
<Typography variant='display'>, not custom h1 styling" — surfaced from 5 Growth members. - "Landing copy goes through
lib/copy.ts; never inline in JSX" — 3 members got bit by typos before catching this.
Scoped loading means non-Growth sessions skip these 4 rules entirely.
Set Up in 3 Minutes
From signup to first call in three steps — no code changes, no vendor lock-in.
Create your organization
Sign up and create your organization. Add users and invite them to the organization.
Install the Gateway
Every member runs the installer once, then points it at their gateway key.
The Gateway routes every model call through ProxAI — zero code changes.
curl -fsSL https://github.com/proxai/proxai_gateway/raw/main/install.sh | bashproxai-gateway setup <your-gateway-api-key>You're set and ready
The fleet is live and the command room is populated!
We are The Experts
Why is ProxAI the best tool for taking control of your AI integration?
Command Room Fleet
Small, fast, robust, and autonomous ProxAI agents to build your organization's AI Command Room.

Orkun Duman
CEO of Chunky Tofu Studios, Ex-Google
UC Berkeley
“ProxAI speeded up our development process for word games significantly. Keeping track of which AI models are the best for each task was a time consuming task for us. ProxAI made it easy to switch between models and get the best results.”





Questions
&
Answers
ProxAI is an AI Command Room: a fleet of small, fast, autonomous agents that route every model call through a single gateway and give engineering leadership full control over how AI is actually used across the company. Two agents are live today — Project Tree Generator and Progress Summarizer — and 80+ more are in active development, backed by a research pipeline built with ex-Google Brain and DeepMind expertise.
Two agents are live today. The Project Tree Generator maps every project across your team and shows which AI activity belongs to which workstream. The Progress Summarizer turns the day-to-day model-call activity into shipping summaries — who's moving, who's stuck, where AI is actually being applied. Coming soon: 80+ more agents covering policy (PII detection, hate-speech blocks), tracking (employee burnout, high-value deals), briefs (token-usage rollups, weekly trends), velocity (top-performing projects, idle-project warnings), and knowledge (tree organizer, document updater) — each one small, single-purpose, and continuously analyzing the gateway feed.
Every team member installs the Gateway once with a one-line installer and points it at their Gateway API key. From that moment on, every AI call from their machine — Cursor, Claude Code, Codex, Gemini CLI, Antigravity, Copilot — routes through the Gateway before reaching the provider. No code changes, no SDKs to wire in, no workflow disruption. The Gateway captures every prompt, response, and piece of metadata, and that captured stream is the substrate the agent fleet analyzes in real time.
The SDK is the developer-side complement to the Gateway. Drop it into your application code in place of the OpenAI / Anthropic / Google client, and you get the same unified routing, observability, and policy layer — but for programmatic LLM calls (background workers, batch jobs, production agents) instead of interactive coding-tool use. The same fleet of agents analyzes both surfaces, so your dashboard shows one unified view whether the call came from Cursor or from a production cron job.
Account deletion permanently removes every prompt, response, agent output, organization setting, and team membership from ProxAI within 30 days. We do not retain anonymized copies for "research", we do not sell anything to third parties, and the deletion cascades across gateway logs, agent-generated outputs, and our backup systems. You receive an email confirmation when the deletion is complete.
The SDK installs without an account, but to get any of the actual value — agent insights, dashboard observability, policy enforcement, the Command Room fleet — you need a free account and a Gateway API key. Setup takes about three minutes from signup to first call. Without an account, the SDK can still proxy your calls locally, but you're not yet plugged into the fleet that turns those calls into insight.
Two reasons. First, each agent is small, fast, and built for one narrow job — Velocity for speed-tuned answers, Lean for high-volume cost optimization, Dependency Graph for multi-step workflows, Trigger for event-driven response, Adapt for self-tuning from every run, Collaboration for multi-owner coordination. Single-purpose agents stay sharp where giant generalists go diffuse. Second, the team building them: former Google Brain and DeepMind researchers, math and computer olympiad medalists, competitive WUDC debaters. Cutting-edge research goes straight into production agents, not into white papers.
Swarm intelligence is the principle that many small, specialized agents acting in coordination can produce smarter, more robust behavior than a single large system — it's how ant colonies build complex structures and how flocks of birds navigate without a central leader. ProxAI applies this to AI governance: instead of one monolithic model trying to do everything, the Command Room runs a fleet of 80+ small agents — each specialized for a narrow task (policy, briefing, knowledge, tracking, velocity) — that coordinate through the Gateway. The result is faster decisions, more robust behavior under failure, and the ability to cover broad organizational needs without any single point of failure.
Ready to Start?
ProxAI is simple to use. Onboard your team members and start to get value within 24 hours.