The Superior Performance of Microsoft’s New Orca-Math AI Surpasses Models 10 Times Bigger

Microsoft’s new Orca-Math AI has made waves by outperforming models ten times its size at solving math word problems. The breakthrough is welcome news for students and STEM researchers looking to AI for help with math.

Arindam Mitra, a senior researcher at Microsoft Research and leader of the Orca AI team, recently announced the development of Orca-Math in a thread on X (formerly Twitter). Orca-Math is a fine-tuned variant of Mistral AI’s Mistral 7B model. The primary focus of Orca-Math is to excel at solving math word problems while remaining small enough to train and run inference efficiently.

What sets Orca-Math apart is its performance relative to models with ten times more parameters. Parameters are the numerical settings an AI model learns during training to form connections between words, concepts, and numbers. Mitra shared a chart showing Orca-Math outperforming most other large language models (LLMs) in the 7-billion to 70-billion parameter range on the GSM8K benchmark, a set of 8,500 grade-school math word problems that a bright middle-school student should be able to solve.
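For readers who want to see what the benchmark looks like, GSM8K is published as a dataset on Hugging Face. The snippet below is a minimal sketch, assuming the datasets library and the openai/gsm8k dataset ID; it simply loads the test split and prints one problem.

```python
# Minimal sketch: inspect a GSM8K problem with the Hugging Face `datasets` library.
# Assumes `pip install datasets` and the dataset ID "openai/gsm8k" with the "main" config.
from datasets import load_dataset

gsm8k = load_dataset("openai/gsm8k", "main", split="test")

sample = gsm8k[0]
print("Question:", sample["question"])
print("Reference answer:", sample["answer"])  # step-by-step solution ending in "#### <number>"
```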

Despite having only 7 billion parameters, Orca-Math comes close to matching the performance of much larger models from OpenAI and Google, and it surpasses models such as MetaMath (70B) and Llemma (34B).

So how did the Orca team achieve these results? They built a new dataset of 200,000 math word problems with the help of specialized AI agents, including student agents that attempted answers and teacher agents that corrected them. The team started from a seed set of 36,217 math word problems collected from existing open-source datasets, used OpenAI’s GPT-4 to obtain the answers, and expanded that seed into the larger synthetic set. Fine-tuning the Mistral 7B model on this data produced Orca-Math.
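The article does not include the team’s actual pipeline, but a minimal sketch of the answer-generation step might look like the following, assuming the official openai Python client and an illustrative list of seed problems; the real Orca-Math flow used multi-agent setups and is considerably more involved.

```python
# Minimal sketch of generating reference answers for seed word problems with GPT-4.
# Assumes the official `openai` Python client (pip install openai) and an OPENAI_API_KEY
# in the environment. The problem list and prompt wording here are illustrative only.
from openai import OpenAI

client = OpenAI()

seed_problems = [
    "A bakery sold 24 cupcakes in the morning and twice as many in the afternoon. "
    "How many cupcakes did it sell in total?",
]

def solve_with_gpt4(problem: str) -> str:
    """Ask GPT-4 to solve one word problem step by step and return its answer text."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Solve the math word problem step by step."},
            {"role": "user", "content": problem},
        ],
    )
    return response.choices[0].message.content

answers = {p: solve_with_gpt4(p) for p in seed_problems}
```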

Additionally, the Orca team utilized a “Suggester and Editor” agent to create more complex questions for training the AI. The Suggester proposed methods to enhance the complexity of a problem, while the Editor generated updated and more challenging problems based on these suggestions. This iterative process increased the complexity of the previously generated problems.
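As a rough illustration of that iterative loop (not the team’s actual code), the sketch below reuses the idea of a generic chat-completion helper like the GPT-4 function above; the prompts and the number of rounds are assumptions.

```python
# Rough sketch of a Suggester/Editor loop for escalating problem difficulty.
# `ask_llm` stands in for any chat-completion call (e.g. the GPT-4 helper above);
# the prompts and the two-round loop are illustrative assumptions, not the Orca recipe.
from typing import Callable

def harden_problem(problem: str, ask_llm: Callable[[str], str], rounds: int = 2) -> str:
    current = problem
    for _ in range(rounds):
        # Suggester: propose ways to make the problem harder without changing its topic.
        suggestions = ask_llm(
            "List concrete ways to make this math word problem harder:\n" + current
        )
        # Editor: rewrite the problem so it incorporates those suggestions.
        current = ask_llm(
            "Rewrite the following problem so it applies these suggestions and stays "
            f"solvable:\nProblem: {current}\nSuggestions: {suggestions}"
        )
    return current
```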

To improve the model’s accuracy on math questions, the team employed Kahneman-Tversky Optimization (KTO), a method developed by the startup Contextual AI. KTO aligns a model’s outputs using only labels of whether each response is desirable or undesirable, without requiring paired preference data. Mitra said KTO was used alongside supervised fine-tuning to improve the accuracy of Orca-Math’s answers.
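KTO has an open implementation in Hugging Face’s trl library. The snippet below is a minimal sketch of that implementation applied to a toy dataset, not the Orca team’s training setup; the model choice and hyperparameters are assumptions, and argument names can differ slightly between trl versions.

```python
# Minimal sketch of KTO fine-tuning with Hugging Face's `trl` library (pip install trl).
# Not the Orca team's code: model, hyperparameters, and the toy dataset are illustrative.
# Exact argument names can vary by trl version (older releases use `tokenizer=` instead
# of `processing_class=`).
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import KTOConfig, KTOTrainer

model_name = "mistralai/Mistral-7B-v0.1"  # substitute a smaller causal LM to run quickly
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# KTO needs unpaired examples: a prompt, a completion, and a boolean "desirable" label.
train_dataset = Dataset.from_dict({
    "prompt": ["What is 12 * 8?", "What is 12 * 8?"],
    "completion": ["12 * 8 = 96.", "12 * 8 = 108."],
    "label": [True, False],  # True = desirable answer, False = undesirable
})

training_args = KTOConfig(output_dir="orca-math-kto-sketch", per_device_train_batch_size=1)

trainer = KTOTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
```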

Excitingly, the Orca team has made its synthetic dataset of 200,000 math word problems openly available on Hugging Face under a permissive MIT license, allowing anyone, from individual researchers to startups and companies, to explore, build, and innovate with it.
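Loading the released dataset takes only a couple of lines. The sketch below assumes the Hugging Face dataset ID microsoft/orca-math-word-problems-200k and its question/answer columns.

```python
# Minimal sketch: load the released Orca-Math synthetic dataset from Hugging Face.
# Assumes `pip install datasets` and the dataset ID "microsoft/orca-math-word-problems-200k"
# with "question" and "answer" columns.
from datasets import load_dataset

orca_math = load_dataset("microsoft/orca-math-word-problems-200k", split="train")

print(len(orca_math))  # roughly 200,000 synthetic word problems
example = orca_math[0]
print(example["question"])
print(example["answer"])
```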

Microsoft initially released the Orca 13B model in June 2023, which used GPT-4 as its AI teacher. It followed up with Orca 2 in 13B and 7B sizes in November 2023, based on Meta’s Llama 2 model. With each new addition to the Orca family, Microsoft continues to grow and improve its AI models.

In conclusion, Microsoft’s Orca-Math AI has proven to be a game-changer for solving math word problems. Its small size and strong performance relative to larger models make it a valuable tool for students and researchers alike. By leveraging techniques such as synthetic data generation and preference-free optimization, the Orca team has shown that carefully curated AI-generated data can improve a model rather than degrade it, helping to dispel concerns about “model collapse.” With the release of their synthetic dataset, Microsoft is encouraging further innovation and collaboration in the field of AI.
