Mamba, a new sequence modeling architecture, matches or outperforms Transformers of similar and larger size on language, audio, and genomics tasks. Its selective structured state space models (SSMs) propagate or forget information depending on the current input, filtering out irrelevant content, while a hardware-aware algorithm computes the recurrence efficiently on modern GPUs. Mamba's simplified design integrates the selective SSM into a single homogeneous block, dispensing with attention and MLP blocks, which improves scalability and throughput. The code and pre-trained models are available on GitHub.
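For readers who want to experiment, the snippet below is a minimal usage sketch. It assumes the `mamba_ssm` package from the GitHub repository and its `Mamba` block interface with the `d_model`, `d_state`, `d_conv`, and `expand` parameters; consult the repository README for the current API.

```python
# Minimal sketch: running a single Mamba block over a batch of sequences.
# Assumes the `mamba_ssm` package (from the GitHub repo) and a CUDA GPU,
# since the hardware-aware selective-scan kernel targets modern GPUs.
import torch
from mamba_ssm import Mamba

batch, seq_len, dim = 2, 64, 256
x = torch.randn(batch, seq_len, dim, device="cuda")

block = Mamba(
    d_model=dim,   # model (channel) dimension
    d_state=16,    # SSM state expansion factor
    d_conv=4,      # width of the local convolution
    expand=2,      # block expansion factor
).to("cuda")

y = block(x)                # sequence-to-sequence map
assert y.shape == x.shape   # output keeps the (batch, length, dim) shape
```

Because this block replaces both the attention and MLP sublayers, a full model is essentially a stack of such blocks with token embeddings, normalization, and an output head.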