Mixtral of experts

Mistral AI has released Mixtral 8x7B, a high-quality sparse mixture-of-experts (SMoE) model that outperforms Llama 2 70B on most benchmarks and matches or exceeds GPT-3.5. The model handles English, French, Italian, German, and Spanish, shows strong performance in code generation, and is the strongest open-weight model with a permissive license, offering the best cost/performance trade-offs. It can also be fine-tuned into an instruction-following model, which scores 8.3 on MT-Bench.
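To make the "sparse mixture of experts" idea concrete, here is a toy sketch of the routing pattern the release describes: each token is sent to only the top 2 of 8 expert feed-forward blocks, and their outputs are mixed with softmax-normalized gate weights. This is not Mixtral's actual code; the dimensions, the random "expert" matrices, and the function names are illustrative placeholders, and only the 8-expert / top-2 routing choice reflects the released model.

```python
# Toy sparse MoE routing sketch (illustrative only, not Mixtral's implementation).
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 16        # toy hidden size; the real model is far larger
NUM_EXPERTS = 8    # Mixtral 8x7B uses 8 experts per MoE layer
TOP_K = 2          # only 2 experts are active per token

# Toy "experts": each is a small feed-forward weight matrix.
experts = [rng.standard_normal((HIDDEN, HIDDEN)) * 0.1 for _ in range(NUM_EXPERTS)]
# Router: a linear map from a token's hidden state to one logit per expert.
router = rng.standard_normal((HIDDEN, NUM_EXPERTS)) * 0.1


def moe_layer(x: np.ndarray) -> np.ndarray:
    """Apply sparse top-2 routing to a batch of token vectors, shape (tokens, HIDDEN)."""
    logits = x @ router                                 # (tokens, NUM_EXPERTS)
    top_idx = np.argsort(logits, axis=-1)[:, -TOP_K:]   # indices of the top-2 experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        chosen = top_idx[t]
        # Softmax over the selected experts' logits only.
        weights = np.exp(logits[t, chosen] - logits[t, chosen].max())
        weights /= weights.sum()
        for w, e in zip(weights, chosen):
            out[t] += w * (x[t] @ experts[e])           # weighted sum of expert outputs
    return out


tokens = rng.standard_normal((4, HIDDEN))               # 4 toy tokens
print(moe_layer(tokens).shape)                          # (4, 16)
```

Because only 2 of the 8 experts run per token, the layer's compute per token stays close to that of a much smaller dense model while the total parameter count is much larger, which is the cost/performance trade-off highlighted above.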
Read more…