Bureaucracy is all you need.
A manifesto for paying more attention to how AI agents communicate.
Lately, there’s been a lot of talk about AI agents. Most of it uses the idioms of traditional software – graphs, pipelines, prompts, and so forth – terminology borrowed from computer science and mathematics. But we are not building with traditional software. We are building with Large Language Models. If traditional software is an assembly line, a large language model is more akin to a random person off the street: it is explicitly meant to simulate human behaviour.
A traditional program operates with uncompromising determinism. Given identical inputs, it produces identical outputs. Its possible states remain discrete and limited. Users interact through prescribed channels – forms, buttons, APIs. This rigid structure enables mathematical description and precise quantification.
LLMs operate in probability spaces. A modern model might achieve, say, 90% accuracy on a given task, but it lacks consistency. LLMs inhabit vast state spaces that dwarf those of traditional programs. Trained on human-generated data, they mirror human behavioral patterns – including human limitations, such as struggling with mathematics and fabricating facts that support a narrative pleasing to their counterparty.
This probabilistic nature might suggest unsuitability for practical applications. Yet humans, who share these characteristics, form the foundation of civilization. People demonstrate unreliability and creativity while operating across expansive state spaces. Society functions despite – or perhaps because of – these traits.
Therefore, to improve AI system design, we are best served by lessons from fields like history, political science, and organizational design rather than graph theory. How does society turn unreliable components into reliable systems? Which human organizations successfully accomplish objectives similar to those of AI ensembles?
The answer, of course, is bureaucracy. Bureaucracies and hierarchies implement checks and balances. Insurance policies protect against human error. Change requests and status meetings create accountability. These mechanisms convert humans’ inconsistent (probabilistic) individual outputs into reliable systems.
And what is bureaucracy? Bureaucracy is merely the formalization of communication; it is the underlying communication infrastructure of an organization. The capacity to transmit information and await responses is what enables coordination. Most knowledge workers derive their value from coordinating their actions with others: writing an article or a piece of code with no target audience or goal is meaningless, and the only way to determine the proper goal for a piece of work is through communication.
This points toward a new paradigm for AI systems, particularly agentic ones. Instead of focusing solely on tools, we should implement bureaucracy, or in other words, communication protocols and specific responsibilities.
Picture an AI agent named Dave, tasked with identifying medical research articles, working with Pat, who specializes in summarization. Their interaction protocol mandates regular communication about search criteria and progress updates. Although Dave and Pat are each correct only, say, 90% of the time, they are unlikely to be wrong at the same time; through mutual communication and checking, the system as a whole can operate well above 90% accuracy.
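As a rough sanity check on this claim, here is a toy Monte Carlo sketch under assumptions of my own (not stated in the text): Dave's and Pat's errors are independent, Pat never flags a correct draft, and a caught error earns Dave exactly one retry.

```python
import random

def run_trial(p_dave=0.9, p_pat=0.9):
    """One task: Dave drafts an answer, Pat reviews it.

    Toy assumptions (mine, not the article's): errors are independent,
    Pat never flags a correct draft, and when Pat catches an error,
    Dave gets exactly one retry.
    """
    if random.random() < p_dave:      # Dave's draft is correct
        return True
    if random.random() >= p_pat:      # Dave erred and Pat missed it
        return False
    return random.random() < p_dave   # Pat caught the error; Dave retries once

random.seed(0)
trials = 100_000
accuracy = sum(run_trial() for _ in range(trials)) / trials
print(f"system accuracy ≈ {accuracy:.3f}")
# analytically: 0.9 + 0.1 * 0.9 * 0.9 = 0.981, comfortably above 0.9
```

Under these assumptions the pair lands near 98%; if Pat and Dave tend to fail on the same inputs, the gain shrinks accordingly.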
Through this framework of agents with distinct responsibilities and communication requirements, we can construct hierarchies and organizations. These structures could transform unreliable LLM components into systems that produce consistent, high-quality outputs.
This organizational approach acknowledges the probabilistic nature of LLMs and provides mechanisms to enhance reliability. It mirrors how human societies handle individual variability through structural solutions. By accepting that LLMs share characteristics with human cognition – including limitations – we can adapt proven organizational patterns to AI systems.
Therefore, I think that if we are to move forward with building ever larger AI systems, we should pay increasing attention to how people have historically gone about building larger and larger organizations – specifically, to the bureaucracies and communication protocols they used to scale them up. By recognizing LLMs as entities more akin to humans than to traditional programs, we can build systems that harness their capabilities while mitigating their limitations. The future of AI might not lie in perfecting algorithms but in perfecting organizations. MBAs might soon become more important than software engineers (and I say this as an engineering major and software developer).
In a future blog post, I aim to make this concrete by building a system that lets you construct your own bureaucracy: specifying a brief for each agent, defining each agent’s responsibilities, and letting the agents communicate among themselves.



I am wondering whether it is actually true that combining the input of two agents, each of whom is only 90% accurate, raises the combined accuracy. Taking a simple example: if I am 100% accurate, and someone who is 80% accurate checks my work and can override me, then the overall accuracy would self-evidently be lower than 100%.
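Whether checking helps does seem to depend on the review protocol and on how errors correlate. A toy calculation (entirely my own assumptions, not from the text above): suppose the checker's verdict is itself correct with some probability, the verdict must be obeyed, and a forced revision of an already-correct answer comes out wrong.

```python
# Toy model (my assumptions): the checker's verdict is correct with
# probability p_check, the verdict must be obeyed, and a forced
# revision of an already-correct answer comes out wrong.
def system_accuracy(p_author: float, p_check: float) -> float:
    # Correct outcome: a correct draft is approved, or a wrong draft
    # is caught and then revised correctly.
    return p_author * p_check + (1 - p_author) * p_check * p_author

print(system_accuracy(1.0, 0.8))            # → 0.8: the checker drags a perfect author down
print(round(system_accuracy(0.9, 0.9), 3))  # below 0.9 under this protocol
```

Under this protocol the pair beats its members only when the checker's false-positive rate is low or a wrongly rejected answer can be restored – which suggests that the communication protocol, not just the agents' individual accuracy, determines system reliability.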