Unleashing Mistral Large 2: The Next Generation Open Source Model for AI Research and Commercial Applications

French AI startup Mistral has announced the release of its latest open source model, Mistral Large 2, which boasts 123 billion parameters. While the model is only licensed as “open” for non-commercial research purposes, it can be fine-tuned by third parties for their specific needs. To use it for commercial or enterprise-grade applications, users will need to obtain a separate license from Mistral. Mistral Large 2 offers advanced multilingual capabilities and improved performance in reasoning, code generation, and mathematics. It is considered a GPT-4 class model, closely matching the performance of other leading models like GPT-4o and Llama 3.1-405. Mistral has been aggressive in the AI domain, raising funds, launching task-specific models, and partnering with industry giants.

Mistral Large 2 builds on the success of its predecessor, the original Large model. The new version features a larger context window of 128,000 tokens, matching the capabilities of models like GPT-4o and Llama 3.1. It also supports additional languages, including Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean. Mistral Large 2 excels in tasks that require large reasoning capabilities or are highly specialized, such as synthetic text generation, code generation, or RAG. On the Multilingual MMLU benchmark, Mistral Large 2 performs on par with Meta’s Llama 3.1-405B while offering significant cost benefits due to its smaller size.

One notable improvement in Mistral Large 2 is its coding capability. The model has been trained on large chunks of code, allowing it to generate code accurately in over 80 programming languages. It outperforms other models like Claude 3.5 Sonnet and Claude 3 Opus on coding benchmarks, closely trailing behind GPT-4o. In terms of mathematics-focused benchmarks, Mistral Large 2 secures the second spot.

Mistral has also addressed the issue of hallucinations in AI models by fine-tuning Mistral Large 2 to be more cautious and selective in its responses. If the model lacks sufficient information to provide an accurate answer, it will transparently inform the user. The model has also improved its instruction-following capabilities, making it better at handling long multi-turn conversations and providing concise answers. Mistral Large 2 is accessible through the company’s API endpoint and various cloud platforms.

Overall, Mistral Large 2 offers a powerful AI model with advanced multilingual capabilities, improved coding performance, and a focus on transparency and instruction-following. It demonstrates Mistral’s commitment to pushing the boundaries of AI and catering to the needs of both research and commercial users.