Yarn-Mistral-7b-128k

Nous-Yarn-Mistral-7b-128k is a state-of-the-art language model for long context. It extends Mistral-7B-v0.1 with the YaRN context-extension method to support a 128k-token context window. Trained on the JUWELS supercomputer, the model shows minimal quality degradation on short-context benchmarks. Using it requires a recent version of transformers and loading the model with trust_remote_code=True.
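
A minimal loading sketch with the Hugging Face transformers library, assuming the model is published on the Hub under the repo id NousResearch/Yarn-Mistral-7b-128k and that a recent transformers release is installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Yarn-Mistral-7b-128k"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # required: the repo ships custom modeling code
    torch_dtype="auto",      # load weights in their native precision
    device_map="auto",       # spread layers across available devices (needs accelerate)
)

prompt = "Long-context language models are useful because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that trust_remote_code=True executes Python code from the model repository, which is why a recent transformers version and a trusted source are both prerequisites here.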