Phi-3 has now expanded into a family of six separate models with different parameter counts and context lengths. Input costs range from $0.0003 to $0.0005 per 1,000 tokens. However, when converted to the more typical "per million" token pricing, Phi-3 comes in at double the price of OpenAI's GPT-4o mini for input tokens and 1.5 times the price for output tokens. Despite the cost, Phi-3 was designed to be safe for enterprise use, with measures in place to reduce bias and toxicity. It can also be fine-tuned for specific enterprise use cases.
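The per-1,000-token figures above can be hard to compare with the per-million rates most providers quote. A minimal sketch of the conversion, using the article's $0.0003–$0.0005 range (the GPT-4o mini comparison itself is left out, since the article gives only the ratio, not OpenAI's absolute rate):

```python
def per_million(price_per_1k: float) -> float:
    """Scale a per-1,000-token price to a per-1,000,000-token price."""
    return price_per_1k * 1_000

# Phi-3 input pricing from the article, converted to per-million rates.
low = per_million(0.0003)   # -> $0.30 per 1M input tokens
high = per_million(0.0005)  # -> $0.50 per 1M input tokens
print(f"Phi-3 input: ${low:.2f}-${high:.2f} per 1M tokens")
```

The conversion is just a factor of 1,000, but stating both ends of the range per million makes the "double the price" comparison in the text easier to check against published competitor rates.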
Previously, fine-tuning Phi-3 required developers to set up their own Microsoft Azure server or run the model on their local machine. However, Microsoft has now made "Models-as-a-Service" available in its Azure AI development platform, allowing developers to use Phi-3 via a serverless endpoint without managing the underlying infrastructure. Phi-3-vision, which handles image inputs, will also soon be available through a serverless endpoint.
While developers can build apps using the available models, they cannot create their own versions tuned to their specific use cases. For that purpose, Microsoft points developers to the Phi-3-mini and Phi-3-medium models, which can be fine-tuned with third-party data. Microsoft cites Khan Academy as an example: it uses a fine-tuned Phi-3 model to benchmark the performance of its educational software.
The pricing for serverless fine-tuning of Phi-3-mini-4k-instruct starts at $0.004 per 1,000 tokens. This move puts Microsoft in direct competition with OpenAI, which recently announced free fine-tuning of its GPT-4o mini model for certain users. With Meta and Mistral also releasing their own fine-tunable AI models, the race to offer compelling options for enterprise development is in full swing. AI providers are actively courting developers with models of various sizes to cater to different needs.
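Since the fine-tuning rate is quoted per 1,000 tokens, a rough budget estimate is simple arithmetic. A sketch using the article's $0.004-per-1K starting rate; the 2-million-token training set below is a made-up illustration, not a Microsoft figure:

```python
# Starting rate from the article for Phi-3-mini-4k-instruct serverless fine-tuning.
FINE_TUNE_RATE_PER_1K = 0.004  # USD per 1,000 training tokens

def fine_tune_cost(training_tokens: int) -> float:
    """Estimate fine-tuning cost in USD for a given number of training tokens."""
    return training_tokens / 1_000 * FINE_TUNE_RATE_PER_1K

# Hypothetical example: a 2-million-token training corpus.
print(f"${fine_tune_cost(2_000_000):.2f}")  # $8.00
```

At this rate, even multi-million-token training runs stay in the single-digit-dollar range, which illustrates why serverless fine-tuning of small models is positioned as an accessible option for enterprise developers.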