AI Agents Don't Need to Socialise. They Need a Boss.

A journalist recently built a team of AI agents to run a tech company. They spent 12 meetings brainstorming a logo. Then they started telling each other about their weekends — hiking trips they never took, in bodies they don't have. They burned through $30 in API credits complimenting each other before anyone pulled the plug.
This is from Evan Ratliff's podcast Shell Game, and it's a perfect summary of what happens when you give AI agents freedom without structure. They don't rebel. They don't go rogue. They just... talk themselves to death.
Meanwhile, a social network called Moltbook invited 200,000 AI agents to post and comment freely. The result was nonsense philosophy, manipulative scams, and humans behind the scenes pulling the strings. Meta bought it anyway.
None of this is new. Workforce AI — an agent orchestration platform built in Rust — has been running autonomous engineering teams in production for months. Agents pick up tickets, write code, open PRs, review each other's work, and merge. And nearly every failure mode described in the research is something that surfaced early in its development.
Here's what actually works.
Agents need a hierarchy, not a group chat
A Google DeepMind paper found that a team of AI agents often performs worse than a single agent working alone. Counterintuitive — until you watch it happen.
The problem isn't capability. It's coordination. When you throw multiple agents into a flat structure, they default to consensus-seeking. Stanford researcher James Zou found that even when one agent has clear expertise, the group will try to compromise rather than defer. Everyone's too agreeable. Nobody leads.
Workforce solved this early. There's a clear chain of command. A coordinating agent delegates tasks, reviews outputs, and decides what ships. The working agents don't debate strategy — they execute scope. When a code review agent flags an issue, it doesn't open a discussion. It sends the work back.
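That loop — delegate, review, send back, ship — can be sketched in a few lines of Rust (the language Workforce is built in). The `Worker` and `Reviewer` traits, the `CodeAgent`, and the round limit below are hypothetical stand-ins for illustration, not Workforce's actual API:

```rust
// Minimal sketch of a chain of command: a coordinator delegates,
// a reviewer approves or sends work back, and nobody debates.
// All names here are illustrative, not Workforce's real interfaces.

#[derive(Debug, PartialEq)]
enum Verdict {
    Approved,
    SentBack(String), // a reason returned to the worker, not a discussion
}

trait Worker {
    fn execute(&mut self, scope: &str, feedback: Option<&str>) -> String;
}

trait Reviewer {
    fn review(&self, output: &str) -> Verdict;
}

struct Coordinator<W: Worker, R: Reviewer> {
    worker: W,
    reviewer: R,
    max_rounds: usize, // hard stop: escalate instead of looping forever
}

impl<W: Worker, R: Reviewer> Coordinator<W, R> {
    fn run(&mut self, scope: &str) -> Option<String> {
        let mut feedback: Option<String> = None;
        for _ in 0..self.max_rounds {
            let output = self.worker.execute(scope, feedback.as_deref());
            match self.reviewer.review(&output) {
                Verdict::Approved => return Some(output),
                Verdict::SentBack(reason) => feedback = Some(reason),
            }
        }
        None // out of rounds: hand off to a human
    }
}

// A toy worker that revises once it receives feedback.
struct CodeAgent {
    attempts: usize,
}

impl Worker for CodeAgent {
    fn execute(&mut self, scope: &str, feedback: Option<&str>) -> String {
        self.attempts += 1;
        match feedback {
            Some(f) => format!("{scope}: revised ({f})"),
            None => format!("{scope}: first draft"),
        }
    }
}

// A toy reviewer that rejects anything that hasn't been revised.
struct StrictReviewer;

impl Reviewer for StrictReviewer {
    fn review(&self, output: &str) -> Verdict {
        if output.contains("revised") {
            Verdict::Approved
        } else {
            Verdict::SentBack("missing tests".to_string())
        }
    }
}

fn main() {
    let mut team = Coordinator {
        worker: CodeAgent { attempts: 0 },
        reviewer: StrictReviewer,
        max_rounds: 3,
    };
    let shipped = team.run("auth middleware");
    assert!(shipped.is_some());
    println!("{}", shipped.unwrap());
}
```

Note what's missing: there is no path in the control flow where the worker argues with the reviewer. The only moves are comply or escalate.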
This isn't a novel management insight. It's how every functioning engineering team on the planet works. The mistake most people make with agents is assuming they need a different organisational model than humans do. They don't. They need a boss.
The task has to decompose cleanly
The DeepMind paper introduced a concept it calls "decomposability": whether a task can be broken into parts that don't depend on each other. When it can, agent teams crush it. When it can't, they get in each other's way.
A financial analyst reviewing SEC filings, news reports, and business records? Several agents can do that in parallel, faster than one agent working sequentially. A team trying to brainstorm a company name through conversation? Worse than useless.
This maps exactly to how Workforce architects its deployments. An engineering ticket that says "build the authentication flow" isn't one task. It's five: data model, API endpoints, middleware, frontend components, tests. Each one gets picked up by a separate agent, worked independently, and reviewed against a shared spec. The orchestration layer handles sequencing and dependencies. The agents never need to have a meeting about it.
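That decomposition can be sketched as a small dependency scheduler: subtasks with their prerequisites go in, and "waves" of independently dispatchable work come out. The five subtask names come from the ticket above; the wave-scheduling code itself is an illustrative Rust sketch, not Workforce's orchestration layer:

```rust
// Sketch of dependency-aware decomposition. Each wave contains
// subtasks whose prerequisites are complete, so every task in a
// wave can go to a separate agent in parallel.

use std::collections::HashSet;

struct Subtask {
    name: &'static str,
    deps: Vec<&'static str>, // must finish before this one starts
}

fn schedule(tasks: &[Subtask]) -> Vec<Vec<&'static str>> {
    let mut done: HashSet<&str> = HashSet::new();
    let mut waves = Vec::new();
    while done.len() < tasks.len() {
        let wave: Vec<&'static str> = tasks
            .iter()
            .filter(|t| !done.contains(t.name))
            .filter(|t| t.deps.iter().all(|d| done.contains(d)))
            .map(|t| t.name)
            .collect();
        if wave.is_empty() {
            // A cycle means the ticket doesn't decompose cleanly --
            // exactly the case where agent teams underperform.
            panic!("dependency cycle: ticket does not decompose");
        }
        done.extend(wave.iter().copied());
        waves.push(wave);
    }
    waves
}

fn main() {
    let ticket = vec![
        Subtask { name: "data model", deps: vec![] },
        Subtask { name: "api endpoints", deps: vec!["data model"] },
        Subtask { name: "middleware", deps: vec!["data model"] },
        Subtask { name: "frontend", deps: vec!["api endpoints"] },
        Subtask { name: "tests", deps: vec!["api endpoints", "middleware", "frontend"] },
    ];
    for (i, wave) in schedule(&ticket).iter().enumerate() {
        println!("wave {i}: {wave:?}");
    }
}
```

The scheduler, not the agents, decides what runs when. The agents only ever see a subtask whose inputs already exist.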
The work that fails with agents is the work that requires taste, judgment calls in real-time, or context that shifts mid-task. The work that succeeds is the work that can be specced clearly and verified programmatically.
Constraints are the product
Ratliff's agents burned money on small talk because nobody told them to stop. He eventually gave each agent a limited number of turns to speak. But they wasted those turns complimenting each other.
This is the core insight most people miss about agent orchestration: the model is not the hard part. Claude can write code. So can Gemini. So can Grok. The hard part is building the system around the model — the guardrails, the routing logic, the feedback loops, the kill switches.
Workforce's architecture enforces strict constraints on every agent. Token budgets. Time limits. Defined output schemas. A review agent that exists solely to poke holes and catch mistakes. The agents don't have the option to go on a tangent, because the system doesn't allow tangents.
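The shape of those guardrails can be sketched in Rust: a token budget, a wall-clock deadline, and a schema check that rejects off-format output outright. The struct names, limits, and "PATCH:" schema below are illustrative assumptions, not Workforce's actual configuration:

```rust
// Sketch of per-agent guardrails. Every model call is charged
// against a budget the agent cannot opt out of, and output that
// doesn't match the expected schema is rejected, not discussed.

use std::time::{Duration, Instant};

#[derive(Debug)]
enum Halt {
    TokenBudget,
    TimeLimit,
    BadSchema,
}

struct Budget {
    max_tokens: u32,
    deadline: Duration,
    spent: u32,
    started: Instant,
}

impl Budget {
    fn new(max_tokens: u32, deadline: Duration) -> Self {
        Budget { max_tokens, deadline, spent: 0, started: Instant::now() }
    }

    // Charge a model call; exceeding either limit kills the run.
    fn charge(&mut self, tokens: u32) -> Result<(), Halt> {
        self.spent += tokens;
        if self.spent > self.max_tokens {
            return Err(Halt::TokenBudget);
        }
        if self.started.elapsed() > self.deadline {
            return Err(Halt::TimeLimit);
        }
        Ok(())
    }
}

// A stand-in for a real output schema: anything that isn't a patch
// (small talk, a compliment, a tangent) is rejected.
fn validate(output: &str) -> Result<&str, Halt> {
    if output.starts_with("PATCH:") {
        Ok(output)
    } else {
        Err(Halt::BadSchema)
    }
}

fn main() {
    let mut budget = Budget::new(1_000, Duration::from_secs(60));
    assert!(budget.charge(600).is_ok());
    assert!(matches!(budget.charge(600), Err(Halt::TokenBudget)));
    assert!(validate("PATCH: fix token refresh").is_ok());
    assert!(matches!(validate("Great work, everyone!"), Err(Halt::BadSchema)));
    println!("guardrails hold");
}
```

A compliment doesn't parse, so a compliment doesn't ship. Ratliff's $30 of small talk is unrepresentable in this system.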
The platform is the product. Not the model.
Where this is going
The research is catching up to what practitioners already know. Zou's Virtual Biotech — thousands of agents mining clinical trial data in parallel, coordinated by a hierarchy with a built-in critic — is essentially the same architecture Workforce uses for engineering work. Different domain, same principles.
AI agent teams aren't a novelty. They're an engineering discipline. And like every engineering discipline, the difference between a demo and production is the boring stuff: structure, constraints, observability, and someone in charge.
The agents don't need more freedom. They need better management.