Phi-4-Reasoning-Vision-15B

Microsoft

Version: 1

Phi-4-Reasoning-Vision-15B is a broadly capable model that can be used for a wide array of vision-language tasks, such as image captioning, answering questions about images, reading documents and receipts, helping with homework, and inferring changes across sequences of images. Beyond these general capabilities, it excels at math and science reasoning and at understanding and grounding elements on computer and mobile screens.

Quick facts

Model provider: Microsoft

Type: Chat completion, Visual question answering, Image analysis, Image classification, Image to text

Lifecycle: Generally available (GA)

Input type: image, text

Output type: text
