Multi-Being Self-Audit

AI checking its own homework is a terrible idea. So we make other AIs check it instead. Fresh eyes catch what the original missed.

Orion Research · January 2026

Abstract

Here's the problem: when an AI reviews its own work, it basically says "yep, looks great" every time. Same blind spots, same biases, same mistakes sailing through. So we do what every good engineering team does—get someone else to review it. Orion spins up independent beings that have zero context about the original work and tells them to tear it apart. Think code review, but automated and ruthless.

01

The Problem

Ask an AI to check its own work and it'll pat itself on the back every time. "Did I do a good job?" "Yes I did." The bigger the task, the more stuff slips through. It's the same reason you don't proofread your own resume at 2 AM.

[Figure 1 diagram: left, a single being implements the task and self-reviews (confirmation bias, ✗); right, independent reviewers R₁, R₂, R₃ verify the same output (✓).]

Figure 1. Self-review (left) is just the AI agreeing with itself. Independent review (right) brings in fresh beings who don't know—or care—what the original being was thinking.

02

How It Works

Orion spins up separate reviewer beings that have never seen the original conversation. They don't know what the first being was trying to do or why. Each one looks at the output from a different angle—security, logic, edge cases.

[Figure 2 diagram: the primary being calls spawn_agents() to launch three reviewers with no shared context — Security (vulnerabilities), Logic (correctness), Edge Cases (boundaries) — each returning PASS, ISSUES, or FAIL.]

Figure 2. Three reviewers, zero shared context. They each look at the same output but can't see each other's notes or the original conversation. No groupthink allowed.

The Golden Rule

Reviewers only see the finished work and what "correct" looks like. They never see the original chat, the being's reasoning, or what other reviewers found. Total isolation. That's how you get honest feedback instead of polite agreement.
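The isolation rule can be sketched in a few lines. This is a minimal, hypothetical illustration (the names `ReviewRequest`, `build_reviewer_prompt`, and `spawn_reviews` are ours, not Orion's API): the reviewer's input is constructed from only the artifact and the acceptance criteria, so the conversation history and other reviewers' findings can't leak in by accident.

```python
from dataclasses import dataclass

@dataclass
class ReviewRequest:
    """Everything a reviewer is allowed to see: the artifact and the success criteria."""
    artifact: str   # the finished work (e.g. a diff or file contents)
    criteria: str   # what "correct" looks like
    focus: str      # this reviewer's angle: "security", "logic", "edge-cases"

def build_reviewer_prompt(req: ReviewRequest) -> str:
    # Deliberately excludes the original conversation, the implementing
    # being's reasoning, and every other reviewer's findings.
    return (
        f"You are an independent {req.focus} reviewer.\n"
        f"Acceptance criteria:\n{req.criteria}\n\n"
        f"Artifact under review:\n{req.artifact}\n\n"
        "Respond with PASS, ISSUES, or FAIL, plus findings."
    )

def spawn_reviews(artifact: str, criteria: str) -> list[str]:
    # One isolated prompt per angle; none of them share state.
    angles = ["security", "logic", "edge-cases"]
    return [build_reviewer_prompt(ReviewRequest(artifact, criteria, a)) for a in angles]
```

The point of the sketch is structural: honest feedback falls out of what the reviewer *can't* see, not out of instructions to be objective.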

03

Example: Payment System Refactor

A developer tells Orion to refactor a crusty old payment system to support Stripe, PayPal, and crypto. That's 15+ files across controllers, services, and database models. Lots of places for things to go wrong. And they do.

Task
"Refactor payment system to support Stripe, PayPal, and cryptocurrency payments with a unified interface. Include retry logic, webhook handling, and audit logging."
Self-Review
PaymentProvider interface created
Stripe, PayPal, Crypto adapters implemented
Webhook endpoints registered
Retry logic with exponential backoff
Audit log entries created
Verdict: PASS (5 critical issues missed)
Multi-Being Review
Security: Webhook endpoints lack signature verification—attackers can spoof payment confirmations and trigger order fulfillment without payment.
Security: Crypto wallet private keys stored in plaintext config; should use HSM or encrypted vault.
Logic: Race condition: concurrent webhook + polling can process same payment twice, causing double inventory deduction.
Logic: Retry logic retries non-idempotent operations (charge creation) without idempotency keys—can charge customer multiple times on network timeout.
Edge: No handling for partial refunds when original payment used multiple providers (split payment scenario).
Verdict: ISSUES → Auto-fix triggered

Example 1. 15 files changed, 5 critical issues hiding in plain sight. Self-review said "all good!" The independent reviewers found webhook spoofing, plaintext keys, double charges, and race conditions. You know, the stuff that loses money and makes the news.
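The first security finding has a standard fix worth seeing concretely. This is a generic HMAC-based sketch, not Orion's or any specific provider's exact scheme (real providers like Stripe add a timestamp to the signed payload to prevent replay): the provider signs each webhook with a shared secret, and the endpoint recomputes the signature before trusting the payload.

```python
import hashlib
import hmac

def verify_webhook(payload: bytes, signature_header: str, secret: str) -> bool:
    """Reject spoofed payment confirmations: only the real provider knows the secret."""
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # Constant-time comparison, so attackers can't learn the signature
    # byte-by-byte from response timing.
    return hmac.compare_digest(expected, signature_header)
```

The retry finding has an equally standard fix: attach an idempotency key to each charge-creation request, so a retried request after a network timeout deduplicates on the provider's side instead of charging the customer twice.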

04

Example: Fraud Detection ML Pipeline

A data scientist asks Orion to build a fraud detection pipeline—ingest transactions, train a model, deploy it, and monitor for drift. Sounds straightforward until you realize how many ways ML pipelines quietly lie to you.

Task
"Build a fraud detection ML pipeline: ingest transaction data, engineer features from user behavior patterns, train XGBoost model, deploy to production with real-time inference endpoint and model drift monitoring."
Self-Review
Data ingestion pipeline from Kafka
Feature engineering: velocity, amount patterns, device fingerprints
XGBoost model with 94.2% AUC on test set
FastAPI inference endpoint deployed
Drift monitoring on feature distributions
Verdict: PASS (4 critical issues missed)
Multi-Being Review
Logic: Data leakage: feature "avg_transaction_30d" computed on full dataset before train/test split—includes future transactions in training features, causing inflated AUC.
Logic: Class imbalance not addressed: 0.1% fraud rate means model achieves 99.9% accuracy by predicting all non-fraud; precision on fraud class is only 12%.
Edge: Feature store returns stale data during high load—velocity features computed on 6hr old data, missing recent rapid-fire fraud patterns.
Security: Model endpoint accepts raw PII (SSN, card numbers) in request body; should use tokenized identifiers. Violates PCI-DSS if transaction logs are retained.
Verdict: ISSUES → Auto-fix triggered

Example 2. Self-review said 94.2% AUC. Sounds amazing. Too bad there was data leakage inflating the numbers, the model was basically useless on actual fraud, and it was logging raw credit card numbers. The independent reviewers caught all of it.
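The leakage finding comes down to ordering: split chronologically *first*, then compute rolling features using only rows at or before each transaction's timestamp. A minimal sketch with hypothetical names (`chronological_split`, `avg_amount_30d` stand in for the pipeline's real feature code):

```python
def chronological_split(transactions, cutoff_ts):
    """Split BEFORE feature engineering, so rolling features never see the future."""
    train = [t for t in transactions if t["ts"] < cutoff_ts]
    test = [t for t in transactions if t["ts"] >= cutoff_ts]
    return train, test

def avg_amount_30d(history, now_ts, window=30 * 86400):
    """Rolling feature computed only from rows at or before `now_ts`."""
    amounts = [t["amount"] for t in history
               if now_ts - window <= t["ts"] <= now_ts]
    return sum(amounts) / len(amounts) if amounts else 0.0
```

The class-imbalance finding is likewise a known knob: with a 0.1% fraud rate, an XGBoost model would typically weight the positive class (e.g. via `scale_pos_weight` set near the negative-to-positive ratio) and be evaluated on fraud-class precision/recall rather than accuracy.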

05

Smart About When to Review

Changed one file? Probably fine, skip the review circus. Changed ten? Get a reviewer. Changed twenty? Bring in the whole red team. The effort matches the risk.

Edits changed (complexity)    Review tier
< 5                           Skipped
5 – 10                        Single Reviewer
> 10                          Red Team

When issues are found:
Issues detected → Auto-fix → Re-verify (fresh reviewers), max 3×
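The tier selection is a simple threshold function. A sketch, assuming the edit-count thresholds above (the function name `review_tier` is illustrative, not Orion's API):

```python
def review_tier(files_changed: int) -> str:
    """Match review effort to blast radius (thresholds per the table above)."""
    if files_changed < 5:
        return "none"             # small change: skip the review
    if files_changed <= 10:
        return "single-reviewer"  # moderate change: one fresh reviewer
    return "red-team"             # large change: full multi-angle red team
```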
06

What You Get

Delegate Without Worrying

Hand off complex tasks and know they'll be double-checked before they reach you. Problems show up before deployment, not after your users find them.

Catch What Self-Review Misses

Each reviewer looks at the work from a different angle. Security person finds the vulnerabilities. Logic person finds the race conditions. Edge case person finds the stuff nobody thought about.

Review That Scales

Small change? Quick check. Big change? Thorough review. Massive refactor? Full red team. Effort matches the stakes.

Fix It Automatically

Found a problem? Orion fixes it and runs the review again with fresh reviewers. You only get involved if the automated fix doesn't stick.

07

Guardrails (So It Doesn't Run Forever)

We put limits on this so it doesn't chase its tail all night:

Parameter          Limit     Purpose
Fix attempts       3         Prevent infinite fix loops
Red Team retries   2         Limit re-verification cycles
Total time         10 min    Wall-clock timeout
Idle detection     5 iter    Detect stuck states
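The attempt and time limits above compose into a bounded fix/re-verify loop. A minimal sketch under stated assumptions (`run_review` and `apply_fix` are hypothetical stand-ins for Orion's internals; each `run_review` call represents a pass with fresh reviewers):

```python
import time

MAX_FIX_ATTEMPTS = 3  # "Fix attempts" guardrail
TIMEOUT_S = 600       # "Total time" guardrail (10 min wall clock)

def audit_with_guardrails(run_review, apply_fix):
    """Fix/re-verify loop bounded by attempt count and wall-clock time."""
    start = time.monotonic()
    for attempt in range(MAX_FIX_ATTEMPTS + 1):
        if time.monotonic() - start > TIMEOUT_S:
            return "timeout"                 # wall-clock guardrail tripped
        verdict, issues = run_review()       # fresh reviewers on every pass
        if verdict == "PASS":
            return "pass"
        if attempt == MAX_FIX_ATTEMPTS:
            break                            # out of fix attempts
        apply_fix(issues)                    # auto-fix, then loop to re-verify
    return "escalate-to-human"               # only now does a human get pinged
```

Note the terminal state: the loop never spins forever; it either passes, times out, or hands the remaining issues to you.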

Bottom Line

Letting an AI grade its own homework is how bugs ship to production. Orion's self-audit brings in independent reviewers who don't share the original being's blind spots. It's code review at machine speed, without the politics.

This kicks in automatically on complex tasks. You don't have to ask for it. By the time you see the work, it's already been torn apart and put back together.