A compilation of the best multi-agent papers
🐦 Twitter • 📢 Discord • Swarms Website • 📙 Framework
This is an awesome list of the best multi-agent research papers compiled by the Swarms Team. Our mission at Swarms is to research multi-agent systems, democratize them at scale, and enable their adoption in the world economy. Join our Discord Now!
- [Paper Name] [Description:] [Link]
- Thought Communication in Multiagent Collaboration
- MAS-Zero: Designing Multi-Agent Systems with Zero Supervision
- K-Level Reasoning with Large Language Models
- More Agents is All You Need
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
- AgentScope: A Flexible yet Robust Multi-Agent Platform
- Learning to Decode Collaboratively with Multiple Language Models
- AIOS: LLM Agent Operating System
- AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
- Chain of Agents: Large Language Models Collaborating on Long-Context Tasks
- Mixture-of-Agents Enhances Large Language Model Capabilities
- EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms
- Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
- Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning
- Optimizing Collaboration of LLM based Agents
- LLM-Agent-UMF: LLM-based Agent Unified Modeling Framework
- Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System
- SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence
- AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs
- LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra
- SOTOPIA-RL: Reward Design for Social Intelligence
- Talk Isn't Always Cheap: Understanding Failure Modes in Multi-Agent Debate
- Virtual Agent Economies
- Federation of Agents: A Semantics-Aware Communication Fabric for Large-Scale Agentic AI
- In-the-Flow Agentic System Optimization for Effective Planning and Tool Use
- Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial?
- TUMIX: Multi-Agent Test-Time Scaling with Tool-Use Mixture
- Achilles Heel of Distributed Multi-Agent Systems
- CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
- Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents
- Rethinking the Bounds of LLM Reasoning: Are Multi-Agent Discussions the Key?
- MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration
- LLM Harmony: Multi-Agent Communication for Problem Solving
- Multi-Agent Consensus Seeking via Large Language Models
- Shall We Team Up: Exploring Spontaneous Cooperation of Competing LLM Agents
- Cooperative Strategic Planning Enhances Reasoning Capabilities in Large Language Models
- SMoA: Improving Multi-agent Large Language Models with Sparse Mixture-of-Agents
- Self-Adaptive Large Language Model (LLM)-Based Multiagent Systems
- A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis
- Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination
- Unlocking the Power of Multi-Agent LLM for Reasoning: From Lazy Agents to Deliberation
- The Station: An Open-World Environment for AI-Driven Discovery
- MALLM: Multi-Agent Large Language Models Framework
- AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
- Very Large-Scale Multi-Agent Simulation in AgentScope
- AgentClinic: A Multimodal Agent Benchmark for AI in Clinical Environments
- MultiAgentBench: Evaluating the Collaboration and Competition of LLM Agents
- TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
- BoxingGym: Benchmarking Progress in Automated Experimental Design
- Symphony: A Decentralized Multi-Agent Framework for Scalable Collective Intelligence
- CoMAS: Co-Evolving Multi-Agent Systems via Interaction Rewards
- The Collaboration Gap
- API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs
- AutoAgents: A Framework for Automatic Agent Generation
- MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems
- Multi-Agent Collaboration via Evolving Orchestration
- Automated Unit Test Improvement using Large Language Models
- Experiential Co-Learning of Software-Developing Agents
- ChatDev: Communicative Agents for Software Development
- MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution
- CodeR: Issue Resolving with Multi-Agent and Task Graphs
- From LLMs to LLM-based Agents for Software Engineering: A Survey
- CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases
- Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
- Large Language Model-Based Agents for Software Engineering: A Survey
- AutoSafeCoder: A Multi-Agent Framework for Securing LLM Code Generation
- RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance
- CoAct-1: Computer-using Agents with Coding as Actions
- Self-Organized Agents: A LLM Multi-Agent Framework toward Ultra Large-Scale Code Generation and Optimization
- Human-In-the-Loop Software Development Agents
- LLM-Based Multi-Agent Systems for Software Engineering: Vision and the Road Ahead
- LLM-Based Multi-Agent Systems for Software Engineering: Literature Review, Vision and the Road Ahead
- Tailoring with Targeted Precision: Edit-Based Agents for Open-Domain Procedure Customization
- Agents4PLC: Automating Closed-loop PLC Code Generation and Verification in Industrial Control Systems using LLM-based Agents
- Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents
- MEDCO: Medical Education Copilots Based on A Multi-Agent Framework
- Multi Agent based Medical Assistant for Edge Devices
- Can AI Agents Design and Implement Drug Discovery Pipelines?
- Sequential Diagnosis with Language Models | Open-Source Implementation Code Link
- Towards an AI co-scientist | Implementation
- The Virtual Lab of AI agents designs new SARS-CoV-2 nanobodies
- Evolving Diagnostic Agents in a Virtual Clinical Environment
- MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making
- Zodiac: A Cardiologist-Level LLM Framework for Multi-Agent Diagnostics
- MADD: Multi-Agent Drug Discovery Orchestra
- SciAgent: A Unified Multi-Agent System for Generalistic Scientific Reasoning
- LAMBDA: A Large Model Based Data Agent
- Agentic Retrieval-Augmented Generation for Time Series Analysis
- Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML
- AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions
- DataLab: A Unified Platform for LLM-Powered Business Intelligence
- Mora: Enabling Generalist Video Generation via A Multi-Agent Framework
- Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation
- Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks
- MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception
- PC-Agent: A Hierarchical Multi-Agent Framework for Complex Task Automation on PC
- Improving Large Vision and Language Models by Learning from a Panel of Peers
- PixelCraft: A Multi-Agent System for High-Fidelity Visual Reasoning on Structured Images
- UFO: A UI-Focused Agent for Windows OS Interaction
- Multi-Agents Based on Large Language Models for Knowledge-based Visual Question Answering
- Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld
- Two Heads Are Better Than One: Collaborative LLM Embodied Agents for Human-Robot Interaction
- Human-level play in Diplomacy by combining language models with strategic reasoning
- CulturePark: Boosting Cross-cultural Understanding in Large Language Models
- Beyond Human Translation: Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
- FanCric: Multi-Agentic Framework for Crafting Fantasy 11 Cricket Teams
- Can Large Language Models Grasp Legal Theories? Enhance Legal Reasoning with Multi-Agent Collaboration
- SQL-of-Thought: Multi-agentic Text-to-SQL with Guided Error Correction
- CoMet: Metaphor-Driven Covert Communication for Multi-Agent Language Games
- Richelieu: Self-Evolving LLM-Based Agents for AI Diplomacy
- Large Language Model for Participatory Urban Planning
- Large language model empowered participatory urban planning
- Agents on the Bench: Large Language Model Based Multi Agent Framework for Trustworthy Digital Justice
- AgentCourt: Simulating Court with Adversarial Evolvable Lawyer Agents
- MiniFed: Integrating LLM-based Agentic-Workflow for Simulating FOMC Meeting
- TRIZ Agents: A Multi-Agent LLM Approach for TRIZ-Based Innovation
- Multi-Agent Autonomous Driving Systems with Large Language Models: A Survey of Recent Advances
- KoMA: Knowledge-driven Multi-agent Framework for Autonomous Driving with Large Language Models
- A Multi-Agent Conversational Recommender System
- Enhancing Supermarket Robot Interaction: A Multi-Level LLM Conversational Interface for Handling Diverse Customer Intents
- Challenges Faced by Large Language Models in Solving Multi-Agent Flocking
- Scalable Multi-Robot Collaboration with Large Language Models: Centralized or Decentralized Systems?
- MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation
- ControlAgent: Automating Control System Design via Novel Integration of LLM Agents and Domain Expertise
- Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Match Human Crowd Accuracy
- Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
- Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems
- Evolutionary Optimization of Model Merging Recipes
- Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
- Constitutional AI: Harmlessness from AI Feedback
- On scalable oversight with weak LLMs judging strong LLMs
- ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
- RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing
- Agent-as-a-Judge: Evaluate Agents with Agents
- Adversarial Multi-Agent Evaluation of Large Language Models through Iterative Debates
- MALT: Improving Reasoning with Multi-Agent LLM Training
- Why Do Multi-Agent LLM Systems Fail?
- Improving LLM Reasoning with Multi-Agent Tree-of-Thought Validator Agent
- The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL
- Probe by Gaming: A Game-based Benchmark for Assessing Conceptual Knowledge in LLMs
- The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
- Stance Detection with Collaborative Role-Infused LLM-Based Agents
- Content Knowledge Identification with Multi-Agent Large Language Models (LLMs)
- Why Solving Multi-agent Path Finding with Large Language Model has not Succeeded Yet
- Reliable Decision-Making for Multi-Agent LLM Systems
- Collab-Overcooked: Benchmarking and Evaluating Large Language Models as Collaborative Agents
- AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks
- TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets
- Generative Agents: Interactive Simulacra of Human Behavior
- SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents
- Scaling Instructable Agents Across Many Simulated Worlds
- Scaling Synthetic Data Creation with 1,000,000,000 Personas
- Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents
- From Text to Life: On the Reciprocal Relationship between Artificial Life and Large Language Models
- Mindstorms in Natural Language-Based Societies of Mind
- Agents' Room: Narrative Generation through Multi-step Collaboration
- GenSim: A General Social Simulation Platform with Large Language Model based Agents
- Large Language Models can Achieve Social Balance
- Cultural Evolution of Cooperation among LLM Agents
- SDPO: Segment-Level Direct Preference Optimization for Social Agents
- AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents
- OASIS: Open Agent Social Interaction Simulations with One Million Agents
- LLM-Based Social Simulations Require a Boundary
- Social World Models
- Simulating Human-like Daily Activities with Desire-driven Autonomy
- Implicit Behavioral Alignment of Language Agents in High-Stakes Crowd Simulations
- AllHands: Ask Me Anything on Large-scale Verbatim Feedback via Large Language Models
- AgentInstruct: Toward Generative Teaching with Agentic Flows
- SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning
- Minstrel: Structural Prompt Generation with Multi-Agents Coordination for Non-AI Experts
- AFlow: Automating Agentic Workflow Generation
- Agents Thinking Fast and Slow: A Talker-Reasoner Architecture
- DynaSaur: Large Language Agents Beyond Predefined Actions
- LLMs as Method Actors: A Model for Prompt Engineering and Architecture
- Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation Model Internet Agents
- Automated Design of Agentic Systems
- The Fellowship of the LLMs: Multi-Agent Workflows for Synthetic Preference Optimization
- Multi-agent Architecture Search via Agentic Supernet
- Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems
- When One LLM Drools, Multi-LLM Collaboration Rules
- Enhancing Reasoning with Collaboration and Memory
- Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
- Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence
- MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
- Recon-Act: A Self-Evolving Multi-Agent Browser-Use System via Web Reconnaissance, Tool Generation, and Task Execution
- Build Your Personalized Research Group: A Multiagent Framework for Continual and Interactive Science Automation
- Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1
- Search Self-play: Pushing the Frontier of Agent Capability without Supervision
- Communication to Completion: Modeling Collaborative Workflows with Intelligent Multi-Agent Communication
- ComProScanner: A multi-agent based framework for composition-property structured data extraction from scientific literature
- LLM Voting: Human Choices and AI Collective Decision Making
- Planning with Multi-Constraints via Collaborative Language Agents
- An Electoral Approach to Diversify LLM-based Multi-Agent Collective Decision-Making
- Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface
- Prompt Engineering Through the Lens of Optimal Control
- A Cooperative Multi-Agent Framework for Zero-Shot Named Entity Recognition
- Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System
- PlotGen: Multi-Agent LLM-based Scientific Data Visualization via Multimodal Feedback
- Enhancing Anomaly Detection in Financial Markets with an LLM-based Multi-Agent Framework
- GenoTEX: An LLM Agent Benchmark for Automated Gene Expression Data Analysis
- GenoMAS: A Multi-Agent Framework for Scientific Discovery via Code-Driven Gene Expression Analysis
- aiXiv: A Next-Generation Open Access Ecosystem for Scientific Discovery Generated by AI Scientists
- The AI Scientist: The world's first AI system for automating scientific research
- The Alignment Waltz: Jointly Training Agents to Collaborate for Safety
- MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization
- MAPoRL: Multi-Agent Post-Co-Training for Collaborative Large Language Models with Reinforcement Learning
- YOLO-MARL: You Only LLM Once for Multi-agent Reinforcement Learning
- LLM-based Multi-Agent Reinforcement Learning: Current and Future Directions
- Large Language Model based Multi-Agents: A Survey of Progress and Challenges
- A survey on LLM-based multi-agent systems: workflow, infrastructure, and challenges
- Multi-Agent Collaboration Mechanisms: A Survey of LLMs
- A Survey on LLM-based Multi-Agent System: Recent Advances and New Frontiers in Application
- A Survey on Large Language Model based Autonomous Agents
- Large Language Models: A Survey
- LLM Multi-Agent Systems: Challenges and Open Problems
- Multi-Agent Coordination across Diverse Applications: A Survey
- LLM-based Multi-Agent Systems: Techniques and Business Perspectives
- Small LLMs Are Weak Tool Learners: A Multi-LLM Agent
- PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent LLMs
- MetaGPT: Meta Programming for Multi-Agent Collaborative Framework
- Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO
The arxiv_bibtex.bib file contains BibTeX citations for all the papers in this repository.
Have a multi-agent paper that isn't on the list? We welcome your contributions! Please open a Pull Request (PR) to add new papers and help us keep this resource comprehensive and up to date for the multi-agent research community. By contributing, you enable others, especially newcomers, to access the latest research in a single, centralized repository. Thank you for helping the community grow!
Join the Swarms community, the largest community of multi-agent researchers, engineers, and builders in the world. We're committed to researching and advancing multi-agent systems.
| Platform | Description | Link |
|---|---|---|
| 💬 Discord | Live chat and community support | Join Discord |
| 🐦 Twitter | Latest news and announcements | @kyegomez |