A new AI system called Gorilla outperforms GPT-4 at writing calls to APIs (application programming interfaces). Developed by researchers at UC Berkeley and Microsoft Research, Gorilla is a fine-tuned LLaMA-based model and an important step toward AI systems that can reliably use tools and programming interfaces.
The key innovation behind Gorilla is fine-tuning a large language model together with a document retriever that can draw on up-to-date API documentation. Because the model learns to read the retrieved documentation rather than rely only on what it memorized, Gorilla can keep up with frequently changing real-world APIs while retaining strong language understanding.
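To make the retrieval idea concrete, here is a minimal sketch of how a request might be paired with the best-matching documentation entry before it reaches the model. It is not Gorilla's implementation: the API names and descriptions in API_DOCS are hypothetical, the token-overlap scoring is a stand-in for a real retriever, and the prompt wording is only illustrative.

```python
import re

# Hypothetical documentation index: names and descriptions are made up for
# illustration and stand in for a store of real, regularly refreshed API docs.
API_DOCS = {
    "vision.classify_image": "Classify an image into one of 1000 object categories.",
    "text.translate": "Translate text from one language to another.",
    "audio.transcribe": "Transcribe speech audio to text.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, docs):
    """Return the (name, description) pair sharing the most tokens with the query."""
    query_tokens = tokenize(query)
    return max(docs.items(), key=lambda item: len(query_tokens & tokenize(item[1])))


def build_prompt(query, docs):
    """Append the best-matching documentation entry to the user's request."""
    name, description = retrieve(query, docs)
    return f"{query}\nUse this API documentation for reference: {name}: {description}"


prompt = build_prompt("Transcribe this audio recording to text", API_DOCS)
print(prompt)
```

Because the documentation lives outside the model, editing an entry in API_DOCS changes the prompt on the next request, with no retraining involved. That is the property the retrieval step is meant to provide.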
In evaluations on APIBench, a new benchmark the authors built from HuggingFace, TorchHub and TensorHub APIs, Gorilla outperformed GPT-4 and other leading models at selecting the appropriate API for a given task. It also showed much lower rates of “hallucination”, meaning calls to APIs that don’t actually exist. This reliability is crucial for practical applications.
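A hallucinated call can be detected mechanically. The paper does this by matching the syntax tree of each generated call against its API database. The sketch below is a much simpler stand-in: it parses a generated call with Python's ast module, extracts the dotted function name, and checks it against a set of known names. The KNOWN_APIS set is an illustrative assumption, and the second example call is deliberately made up.

```python
import ast

# Illustrative stand-in for a database of documented API entry points.
KNOWN_APIS = {"torch.hub.load"}


def called_name(code):
    """Return the dotted name of the first function call in `code`, or None."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return None
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            parts = []
            target = node.func
            # Walk down attribute accesses: torch.hub.load -> load, hub, torch.
            while isinstance(target, ast.Attribute):
                parts.append(target.attr)
                target = target.value
            if isinstance(target, ast.Name):
                parts.append(target.id)
                return ".".join(reversed(parts))
    return None


def is_hallucinated(code, known_apis):
    """Flag code whose called function is unknown (unparseable code is flagged too)."""
    return called_name(code) not in known_apis


real_call = "torch.hub.load('pytorch/vision', 'resnet50', pretrained=True)"
fake_call = "torch.hub.fetch_best_model('vision')"  # made-up function name

print(is_hallucinated(real_call, KNOWN_APIS))
print(is_hallucinated(fake_call, KNOWN_APIS))
```

This check looks only at the function name. Comparing the arguments as well, for example the repository and model name, would bring it closer to the tree-matching approach described in the paper.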
The researchers highlight that accessing millions of APIs unlocks new capabilities for AI systems that go far beyond their innate knowledge. By mastering API usage, Gorilla and future models could accomplish complex goals like planning an entire conference just through natural language conversations.
The integration of retrieval systems with language models is a promising path to reduce hallucination and improve factual accuracy. As models become more powerful, mitigating these pitfalls will be critical.
Gorilla also demonstrated an ability to understand constraints and tradeoffs when selecting APIs, choosing ones that met stated accuracy and efficiency criteria, such as a limit on parameter count combined with a minimum accuracy. This kind of reasoning will be important as AI takes on more advanced challenges.
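To show what satisfying such constraints means, the sketch below filters candidates by a parameter budget and an accuracy floor, then picks the most accurate one that remains. Gorilla does this implicitly from the wording of the request rather than with an explicit filter like this one. The Candidate type, the model names and all of the numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    params_millions: float
    top1_accuracy: float


# Hypothetical candidates: names and figures are invented for this example.
CANDIDATES = [
    Candidate("small_net", params_millions=5.0, top1_accuracy=0.68),
    Candidate("mid_net", params_millions=9.0, top1_accuracy=0.74),
    Candidate("large_net", params_millions=25.0, top1_accuracy=0.80),
]


def select(candidates, max_params_millions, min_accuracy):
    """Return the most accurate candidate meeting both constraints, or None."""
    feasible = [
        c
        for c in candidates
        if c.params_millions <= max_params_millions
        and c.top1_accuracy >= min_accuracy
    ]
    return max(feasible, key=lambda c: c.top1_accuracy, default=None)


choice = select(CANDIDATES, max_params_millions=10.0, min_accuracy=0.70)
print(choice.name if choice else "no candidate satisfies the constraints")
```

The most accurate model overall exceeds the parameter budget and the smallest misses the accuracy floor, so picking correctly requires weighing both criteria together, which is the tradeoff the evaluation probes.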
By surpassing GPT-4 on API calls, Gorilla underscores that specialized training is still vital to reach expert-level performance on particular tasks, even against a general-purpose model widely regarded as one of the most capable available. The result is specific to API usage and does not mean Gorilla is the stronger model overall. Combined with ever-larger models, such targeted fine-tuning will likely produce further gains.
The release of Gorilla’s training approach, model code and datasets should accelerate research into advanced API usage. As this capability spreads across domains like science, finance and medicine, AI may shift from a conversational tool to an agent that can carry out tasks on a user’s behalf.
GitHub: https://github.com/ShishirPatil/gorilla
Paper: https://arxiv.org/pdf/2305.15334.pdf