The Quest to Overcome Key Challenges in Large Language Models

Large language models (LLMs) have rapidly risen to prominence, demonstrating impressive capabilities on a range of…

New Soft Mixture-of-Experts Model Sets New Benchmarks for Image Classification

A new paper from researchers at Google DeepMind proposes Soft Mixture-of-Experts (Soft MoE), a novel sparse…

Microsoft Unveils DeepSpeed-Chat to Democratize Training of Large Conversational AI Models

DeepSpeed-Chat is a new system introduced by Microsoft researchers to make training large conversational AI models…

Uncovering How AI Masters New Senses

A new study from MIT CSAIL reveals how large language models like GPT-3 learn to integrate…

New Research Improves Reliability of AI Watermarking Techniques

A new highly technical paper from researchers at Inria, Imatag and Meta AI proposes methods to…

Virtual Prompt Injection: A Novel Threat to Language Models

A new paper from researchers at the University of Southern California, Samsung Research America, and the University of…

The Rise of Gorilla: A New AI System Surpassing GPT-4 for API Usage

A new AI system called Gorilla has emerged that demonstrates superior performance to even the mighty…

Large Language Models Can Do “Parallel Decoding”

A new technique called “Skeleton-of-Thought” (SoT) shows promise for significantly speeding up text generation from large…

Leveraging Language Models to Enhance Personalized Recommendations

A new study published on arXiv explores prompting strategies to improve personalized recommendations using large language…

New Study Finds Biases Limit Benefits of Human-AI Collaboration in Radiology

A new experimental study published in a top economics journal has found that biases in how…

AI-Generated Product Ideas Outperform Humans in Quality and Quantity

A new study from researchers suggests that large language models (LLMs) like ChatGPT can generate higher…

Code Generation Gets a Boost from PanGu-Coder2

A new AI system called PanGu-Coder2 is poised to advance the state of the art in…

LLaMA-2-7B-32K Pushes the Limits of Context Length

Together AI, an AI research company, published a post detailing their work on extending the context…

Adversarial Attacks Reveal Cracks in LLM Alignment

A new paper from researchers at CMU and others reveals systemic vulnerabilities in current techniques aimed…

Scaling TransNormer to 175 Billion Parameters

The field of natural language processing has seen monumental advances with the rise of large language…

Reasoning or Rambling? New Study Questions Logic Behind AI Reasoning

A new paper from Stanford researchers calls into question whether prompting large language models like GPT-3…

Calibration Techniques Improve Probability Estimates from Machine Learning Models

A recent study published in the Proceedings of Machine Learning Research investigated methods for calibrating probabilistic…

New Benchmark Tests the Limits of AI Reasoning Abilities

A new benchmark dataset called the Advanced Reasoning Benchmark (ARB) aims to push artificial intelligence systems…

New Web Agent to Navigate Real Websites

A team of researchers from Google DeepMind and the University of Tokyo has developed a new web…

BTLM-3B-8K: Performance in a 3 Billion Parameter Model

A new language model called BTLM-3B-8K has achieved state-of-the-art accuracy among 3-billion-parameter models, rivaling…