New versions of StarCoder: 1B, 3B and 7B models announced
1T tokens, 80+ programming languages, an 8K context window, Multi-Query Attention (MQA) & Fill-in-the-Middle (FIM)!
StarCoderBase-1B is a 1B-parameter model trained on 1T tokens of code spanning 80+ programming languages. It uses Multi-Query Attention and a Fill-in-the-Middle training objective, and it can generate code snippets or act as a technical assistant, though its output may contain inefficiencies or bugs. The training data consists of permissively licensed GitHub code, and a search index is provided so generated code can be attributed to its sources. Training took 11 days on 128 A100 GPUs. The model is released under the BigCode OpenRAIL-M v1 license agreement.
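For context, here is a minimal sketch of how a checkpoint like this is typically used with the Hugging Face transformers library. It assumes the model is published on the Hub as `bigcode/starcoderbase-1b` and follows the StarCoder family's `<fim_prefix>`/`<fim_suffix>`/`<fim_middle>` sentinel format for Fill-in-the-Middle; the prompt strings are illustrative only.

```python
# A sketch, not an official snippet: assumes the checkpoint is available on the
# Hugging Face Hub as "bigcode/starcoderbase-1b" and `transformers` is installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoderbase-1b"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Ordinary left-to-right code completion.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(inputs.input_ids, max_new_tokens=48)
print(tokenizer.decode(outputs[0]))

# Fill-in-the-Middle: wrap the known prefix and suffix in the FIM sentinel
# tokens; the model then generates the missing middle span.
fim_prompt = (
    "<fim_prefix>def print_hello():\n"
    "    <fim_suffix>\n"
    "    return None<fim_middle>"
)
inputs = tokenizer(fim_prompt, return_tensors="pt")
outputs = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```

The 8K context window and MQA matter mostly for longer prompts like the FIM case above: MQA shares key/value projections across attention heads, shrinking the KV cache and speeding up generation.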