    Microsoft has developed a new series of open language models called Phi-3.

    The Phi-3 models were trained on a dataset of textbook-style content and synthetically generated data, rather than the raw web data typically used to train large language models.

    The first model being released is Phi-3-mini, which has 3.8 billion parameters. Phi-3-mini is instruction-tuned and available in two context-length variants — 4K and 128K tokens. It is the first model in its class to support a context window of up to 128K tokens, with little impact on quality.

    The Phi-3 models are positioned for tasks such as summarization, content generation, and answering straightforward queries.

    Other Phi-3 models include Phi-3-small (7 billion parameters) and Phi-3-medium (14 billion parameters).

    The family also includes a multimodal model, Phi-3 Vision Instruct. It has a 128K-token context length and 4.2 billion parameters, combining an image encoder, connector, projector, and the Phi-3-mini language model, and was trained on 500 billion vision and text tokens.