The latest release of Ollama, v0.1.16, adds support for Mixtral and other models built on the Mixture of Experts (MoE) architecture. The update introduces new models including Mixtral, a high-quality mixture-of-experts model, and Dolphin Mixtral, an uncensored model optimized for coding tasks. Note that these models require at least 48GB of memory. The release also fixes an issue with the `load_duration` field in the `/api/generate` response. For full details, visit the GitHub release page.
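
Since the release touches the `load_duration` field, here is a minimal sketch of calling the local `/api/generate` endpoint and reading that timing metadata. It assumes the Ollama server is running on its default port (11434), that the `mixtral` model has already been pulled, and uses an illustrative prompt; it is not taken from the release notes themselves.

```python
import requests

# Call the local Ollama /api/generate endpoint with streaming disabled so a
# single JSON object (including timing fields) is returned.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "mixtral",  # assumes `ollama pull mixtral` has been run
        "prompt": "Explain mixture-of-experts models in one sentence.",
        "stream": False,
    },
)
resp.raise_for_status()
data = resp.json()

print(data["response"])
# Durations are reported in nanoseconds; load_duration is the field whose
# reporting this release fixes.
print("load_duration (ns):", data.get("load_duration"))
print("total_duration (ns):", data.get("total_duration"))
```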