Multi-agent AI systems are increasingly popular for complex tasks. But what happens when LLMs debate among themselves? Our research in Belief-Dynamics reveals some concerning patterns.
The Experiment Setup
We created a simple debate framework (a minimal code sketch follows this list):
- 3-5 LLM agents discuss a factual question
- Each agent can see others' responses
- Agents vote on the final answer
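For concreteness, here is a minimal sketch of the naive debate loop under these assumptions. The agent callables stand in for whatever LLM client is actually used, and names like naive_debate are illustrative rather than taken from our codebase.

```python
# Minimal sketch of a naive debate loop: each agent answers in view of the
# transcript so far, then all agents cast a final vote. The agents are plain
# prompt -> text callables (placeholders for a real LLM client).
from collections import Counter
from typing import Callable, List


def naive_debate(question: str,
                 agents: List[Callable[[str], str]],
                 rounds: int = 1) -> str:
    """Run a discussion, then return the majority-vote answer."""
    transcript: List[str] = []
    for _ in range(rounds):
        new_turns = []
        for i, agent in enumerate(agents):
            context = "\n".join(transcript)
            prompt = (f"Question: {question}\n"
                      f"Previous responses:\n{context or '(none)'}\n"
                      f"Give your answer and a brief justification.")
            new_turns.append(f"Agent {i}: {agent(prompt)}")
        transcript.extend(new_turns)

    # Final vote: each agent names a single answer; the majority wins.
    votes = []
    for agent in agents:
        vote_prompt = (f"Question: {question}\n"
                       "Debate so far:\n" + "\n".join(transcript) +
                       "\nReply with only your final answer.")
        votes.append(agent(vote_prompt).strip().lower())
    return Counter(votes).most_common(1)[0][0]
```

Because the agents are passed in as callables, the same loop can later be reused with wrapped or differently prompted agents when testing interventions.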
Surprising Result: 49% Hallucination Rate
In naive debate (agents simply discussing), we observed false consensus in 49% of debates: the group agreed on an incorrect answer. (A sketch of how this rate can be computed follows the list of patterns below.) Key patterns:
1. Confidence Cascade
When one agent states something confidently (even if it is wrong), the others tend to agree rather than challenge it.
2. Authority Bias
The first agent to respond often sets the direction for the entire debate, regardless of accuracy.
3. Echo Chamber Effect
Agents reinforce each other's mistakes, leading to collective hallucination.
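To make a number like the 49% above concrete, here is a sketch that scores logged debates against ground truth; the record layout and field names are assumptions for illustration, not our actual logging format.

```python
# Sketch: false-consensus rate = fraction of debates whose agreed answer
# does not match the ground truth. Field names are illustrative assumptions.
from typing import Dict, List


def false_consensus_rate(debates: List[Dict[str, str]]) -> float:
    """Fraction of debates whose consensus answer disagrees with ground truth."""
    if not debates:
        return 0.0
    wrong = sum(1 for d in debates
                if d["consensus_answer"].strip().lower()
                != d["ground_truth"].strip().lower())
    return wrong / len(debates)


# Example: 2 of 4 logged debates reached a wrong consensus -> 0.5
logs = [
    {"consensus_answer": "Paris",   "ground_truth": "Paris"},
    {"consensus_answer": "1912",    "ground_truth": "1912"},
    {"consensus_answer": "Jupiter", "ground_truth": "Saturn"},
    {"consensus_answer": "True",    "ground_truth": "False"},
]
print(false_consensus_rate(logs))  # 0.5
```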
Mitigation Strategies
Our ongoing experiments test several interventions (a sketch of two of them follows this list):
- Devil's Advocate: One agent is always assigned to argue the contrary position
- Confidence Weighting: Require probability estimates
- Evidence Requirements: Agents must cite sources
- Multi-round Refinement: Allow agents to revise positions
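As a rough illustration of how two of these interventions could be wired into the debate loop, a hedged sketch follows; make_devils_advocate and weighted_vote are hypothetical helper names, not the exact mechanisms used in our experiments.

```python
# Sketch of two interventions: a devil's-advocate wrapper around an agent,
# and confidence-weighted voting over (answer, self-reported probability) pairs.
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


def make_devils_advocate(agent: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap an agent so it is always asked to argue against the emerging view."""
    def contrarian(prompt: str) -> str:
        return agent(prompt + "\nRegardless of any consensus above, argue for "
                              "the strongest opposing answer you can defend.")
    return contrarian


def weighted_vote(ballots: List[Tuple[str, float]]) -> str:
    """Pick the answer with the highest summed self-reported probability.

    ballots: (answer, confidence in [0, 1]) pairs, one per agent.
    """
    totals: Dict[str, float] = defaultdict(float)
    for answer, confidence in ballots:
        totals[answer.strip().lower()] += max(0.0, min(1.0, confidence))
    return max(totals, key=totals.get)


# Example: a single very confident agent can still lose to a moderately
# confident majority.
print(weighted_vote([("saturn", 0.6), ("saturn", 0.55), ("jupiter", 0.9)]))  # saturn
```

Note that confidence weighting only helps to the extent that the agents' self-reported probabilities are reasonably calibrated, which is part of what these experiments are meant to test.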
Implications
These findings have important implications for multi-agent AI systems:
- Simple majority voting is dangerous: it can lock in a confidently stated early error
- Diversity of perspectives matters
- External verification is essential
- Debate protocols need careful design
As we build more sophisticated multi-agent systems, understanding these dynamics becomes critical.