Mixtral 8x7B
Open Weights
Mistral AI
Mistral AI's Sparse Mixture of Experts (SMoE) model, released in December 2023, with 46.7B total parameters. Only about 13B parameters are active per token, so per-token compute is close to that of a 13B dense model. Outperforms Llama 2 70B on most benchmarks with 6x faster inference, and matches or exceeds GPT-3.5 on most standard benchmarks. Features a 32K-token context window with fully dense attention (the sliding window attention and its roughly 128K theoretical span belong to Mistral 7B, not Mixtral). The instruction-tuned variant was the best open-weights chatbot model as of December 2023 per MT-Bench. Supports 5 languages (English, French, German, Spanish, Italian). The Apache 2.0 license permits commercial use.
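The gap between total and active parameters comes from the routing: each token passes through only 2 of the 8 expert feed-forward blocks in every layer, while attention, router, and embedding weights are shared. The sketch below reproduces the arithmetic. The dimensions (hidden size 4096, feed-forward size 14336, 32 layers, 32 attention heads with 8 key/value heads, 32000-token vocabulary) are not stated on this card; they are assumed from the published model configuration, and the function name is hypothetical.

```python
def mixtral_param_counts(
    hidden=4096,        # assumed model width
    ffn=14336,          # assumed feed-forward width of each expert
    layers=32,          # assumed number of transformer layers
    experts=8,          # experts per layer
    active_experts=2,   # experts the router selects per token
    vocab=32000,        # assumed vocabulary size
    n_heads=32,         # assumed query heads
    n_kv_heads=8,       # assumed key/value heads (grouped-query attention)
):
    """Return (total, active-per-token) parameter counts for a Mixtral-style model."""
    head_dim = hidden // n_heads
    # Query and output projections are hidden x hidden; key and value
    # projections are smaller because there are fewer key/value heads.
    attention = 2 * hidden * hidden + 2 * hidden * (n_kv_heads * head_dim)
    # Each expert is a gated feed-forward block with three weight matrices.
    expert = 3 * hidden * ffn
    router = hidden * experts
    norms = 2 * hidden
    shared_per_layer = attention + router + norms
    # Input embeddings, output head, and the final normalization layer.
    embeddings = 2 * vocab * hidden + hidden

    total = layers * (shared_per_layer + experts * expert) + embeddings
    active = layers * (shared_per_layer + active_experts * expert) + embeddings
    return total, active


total, active = mixtral_param_counts()
print(f"total: {total / 1e9:.1f}B, active: {active / 1e9:.1f}B")
# prints "total: 46.7B, active: 12.9B"
```

Under these assumptions the expert blocks account for almost all of the weights, which is why memory use tracks the 46.7B total while per-token compute tracks the roughly 13B active figure.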
Strengths
Caveats
Capabilities
Vision
Audio
Video
Tool Use