A new language model called BTLM-3B-8K has achieved state-of-the-art accuracy among 3 billion parameter models, rivaling the performance of models 2-3x its size. Developed by Cerebras, Opentensor, and partners, and trained on the newly unveiled Condor Galaxy 1 (CG-1) AI supercomputer, the model demonstrates that advanced natural language capabilities can fit into a small package.
BTLM-3B-8K scored higher than all other 3B models on almost every benchmark. It even matched or exceeded the scores of several 7B models despite using 71% less training compute and having a 58% smaller memory footprint. When quantized to 4-bit precision, BTLM can run comfortably on devices with only 3GB of RAM, such as a base-model iPhone.
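To make the 4-bit claim concrete, here is a minimal sketch of loading the model with quantized weights using the Hugging Face transformers and bitsandbytes libraries. The checkpoint name cerebras/btlm-3b-8k-base, the prompt, and the generation settings are assumptions for illustration, and the actual memory footprint will vary with the runtime and device.

```python
# Minimal sketch: load BTLM-3B-8K with 4-bit weight quantization.
# Assumes the Hugging Face checkpoint "cerebras/btlm-3b-8k-base" and the
# bitsandbytes 4-bit loading path; adapt to the model card's instructions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "cerebras/btlm-3b-8k-base"  # assumed checkpoint name

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit precision
    bnb_4bit_compute_dtype=torch.bfloat16,  # run matmuls in bf16 for quality
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
    trust_remote_code=True,  # BTLM ships custom model code in its repo
)

prompt = "Edge devices can run language models locally because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```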
This compact yet powerful model is well suited to deployment on mobile and edge devices. Rather than relying on cloud APIs, BTLM makes it possible to run AI locally on billions of smartphones and IoT devices. The model was commissioned by Opentensor for use on their decentralized Bittensor network, providing an alternative to centralized providers such as OpenAI.
The key to BTLM’s efficiency lies in its training process. The model was trained on SlimPajama, a deduplicated version of the RedPajama dataset containing 627 billion tokens. Deduplication reduced noise in the data, enabling high accuracy with less compute. The training scheme also used variable-length sequences of up to 8,192 tokens to improve the model’s understanding of long-range context.
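As a simplified illustration of why deduplication shrinks a corpus, the sketch below drops exact duplicates after light text normalization. This is not the SlimPajama pipeline itself, which performs large-scale near-duplicate detection (e.g. MinHash-style fuzzy matching); it only shows the basic idea of keeping one copy of each repeated document.

```python
# Simplified document-level deduplication: keep the first occurrence of each
# distinct (normalized) document. Illustrative only; production pipelines use
# fuzzy near-duplicate detection rather than exact hashing.
import hashlib

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants hash identically."""
    return " ".join(text.lower().split())

def deduplicate(documents):
    """Return documents with exact (normalized) duplicates removed."""
    seen = set()
    unique_docs = []
    for doc in documents:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique_docs.append(doc)
    return unique_docs

corpus = [
    "The quick brown fox jumps over the lazy dog.",
    "The quick brown   fox jumps over the lazy dog.",  # whitespace variant
    "An entirely different training document.",
]
print(len(deduplicate(corpus)))  # -> 2
```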
BTLM demonstrates that strong natural language performance does not require hundreds of billions of parameters or massive datasets. Compact, efficient models like BTLM-3B-8K point the way toward democratizing access to AI capabilities. As more powerful models become able to run on everyday devices, we may see new applications and use cases we can barely imagine today.