Yarn-Mistral-7b-128k

Nous-Yarn-Mistral-7b-128k is a state-of-the-art language model for long context. It extends Mistral-7B-v0.1 with the YaRN context-extension method to support a 128k-token context window. Trained on the JUWELS supercomputer, the model shows minimal quality degradation on short-context benchmarks. Using it requires a recent version of transformers and loading the model with trust_remote_code=True.
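
A minimal loading sketch with the Hugging Face transformers library, assuming the model is published on the Hub under the repo id NousResearch/Yarn-Mistral-7b-128k and that a recent transformers release is installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Yarn-Mistral-7b-128k"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # required: the repo ships custom modeling code
    torch_dtype="auto",      # load weights in their native precision
    device_map="auto",       # spread layers across available devices (needs accelerate)
)

prompt = "Long-context language models are useful because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that trust_remote_code=True executes Python code from the model repository, which is why a recent transformers version and a trusted source are both prerequisites here.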