🧠 The AI Agent Index: Mapping the New Frontier of AI Agents

As AI agents evolve from lab experiments into real-world tools, the need to document and assess them grows more pressing. The AI Agent Index, introduced in a recent preprint by researchers at MIT and collaborators from institutions including Stanford and Harvard, tackles this head-on. It is the first public database cataloguing 67 deployed agentic AI systems, from Microsoft’s Magentic-One to open-source tools like AutoGen.

📌 What’s in the Index?

Each system is assessed on:

  • Technical makeup (model, reasoning, planning, tools)
  • Domains of use (software engineering, research, robotics, etc.)
  • Safety measures (policies, evaluations, guardrails)
  • Developer background (company vs. academic, country)

🔍 Explore the live index

🔧 Key Takeaways

  • 70.1% of systems have public documentation.
  • Only 19.4% report formal safety policies.
  • The U.S. dominates the landscape, with 45 of the 67 systems.
  • Most agents are focused on software development and computer use.

⚠️ Transparency Gap

While the capabilities of these agents are reasonably well documented, transparency around safety and governance is lacking. Fewer than 10% share results from external safety testing—a serious concern as agents gain autonomy and influence.

🏛 Why This Matters

The AI Agent Index isn’t just a database. It’s a call for accountability—urging developers, policymakers, and users alike to demand more transparency as AI agents take on increasingly high-stakes roles.

📚 Read the full paper for methodology and insights on what was included, excluded, and how the team ensured data accuracy.