Mistral AI drops its new model with mixture of experts.
Mistral AI has released Mixtral 8x7B, an open-source sparse mixture-of-experts model with 45B total parameters that achieves state-of-the-art performance among open models. Mistral's models are also now available in beta via its developer platform and API.
What's going on here?
French AI startup Mistral releases a new open-source model.
What does this mean?
Mixtral matches or exceeds LLaMA 2 70B and GPT-3.5 on most NLP benchmarks, with 6x faster inference than LLaMA 2 70B. Mixtral uses a “mixture of experts” architecture that activates only 12B of its 45B total parameters for each token, keeping inference costs similar to those of a 12B model (see the sketch below). It’s widely believed that GPT-4 is also a mixture-of-experts model.
Relevant links: magnet for the model [1], Mixtral on Replicate [2], Mixtral fine-tuned for chat [3].
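To make the “only 12B of 45B parameters per token” idea concrete, here’s a toy sketch of sparse top-2 routing: a small router scores a set of expert feed-forward networks and each token only runs through its two highest-scoring experts. The layer sizes, expert count, and class names are illustrative assumptions, not Mixtral’s actual implementation.

```python
# Toy sketch of sparse mixture-of-experts routing with top-2 gating.
# Dimensions and expert count are illustrative, not Mixtral's real values.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, dim=128, hidden=512, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, num_experts, bias=False)  # router: scores experts per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, dim)
        scores = self.gate(x)                           # (num_tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # each token keeps its top-2 experts
        weights = F.softmax(weights, dim=-1)            # normalize the two gate weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = SparseMoE()
tokens = torch.randn(4, 128)
print(moe(tokens).shape)  # torch.Size([4, 128])
```

Only the two selected experts do any work per token, which is why compute cost tracks the active parameters rather than the total parameter count.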
Mistral now also offers API endpoints and a platform for these models: three generative endpoints (Mistral-tiny, Mistral-small, and Mistral-medium) and one embeddings endpoint (Mistral-embed). Mistral-medium runs on a more powerful, not-yet-released model.
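If you want to poke at the beta API, a minimal call looks roughly like the sketch below. The endpoint URL, payload shape, and response structure are assumptions based on Mistral’s chat-completions-style docs, so check the official documentation before relying on them.

```python
# Minimal sketch of calling Mistral's chat endpoint with the Mistral-tiny model.
# Endpoint URL and payload/response shape are assumptions; verify against Mistral's docs.
import os
import requests

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-tiny",
        "messages": [{"role": "user", "content": "Say hello in French."}],
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```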
Why should I care?
Just like its first model, Mistral released this one by tweeting a magnet link to download the weights. Another factor that boosted Mistral’s launch was its timing against Google’s edited Gemini demo, which offered no hands-on access to the models.
Mistral also raised at a $2B valuation just a few days ago. Reminder: the startup was founded only this June, and it’s already releasing best-in-class open-source models for everyone to use. Along with Llama, these models are gearing up to take OpenAI’s market share.