AI Model Catalog | Microsoft Foundry Models

computer-use-preview

Version: 2025-03-11

OpenAI • Last updated March 2025

computer-use-preview is the model behind the Computer Use Agent, for use in the Responses API. You can use the computer-use-preview model to get instructions for controlling a browser on your computer screen and taking action on a user's behalf.

Multipurpose

Multilingual

Multimodal

computer-use-preview, or Computer-Using Agent (CUA), is a model that combines GPT‑4o's vision capabilities with advanced reasoning through reinforcement learning. CUA is trained to interact with graphical user interfaces (GUIs), meaning the buttons, menus, and text fields people see on a screen, just as humans do. This gives it the flexibility to perform digital tasks without using OS- or web-specific APIs. By combining advanced GUI perception with structured problem-solving, it can break tasks into multi-step plans and adaptively self-correct when challenges arise. This capability marks the next step in AI development, allowing models to use the same tools humans rely on daily and opening the door to a vast range of new applications.

CUA in the API does not itself operate computers or browsers. Applications send CUA screenshots of a computer along with instructions, and the model responds with the actions for the application to take, such as navigating, clicking the pointer, and entering text.

Core capabilities

Model Qualities: While CUA is still early and has limitations, it sets new state-of-the-art benchmark results, achieving a 38.1% success rate on OSWorld for full computer use tasks, and 58.1% on WebArena and 87% on WebVoyager for web-based tasks. These results highlight CUA’s ability to navigate and operate across diverse environments using a single general action space.
Safety: CUA has been extensively tested for safety, and implements safeguards across several dimensions. CUA refuses many harmful tasks and illegal or regulated activities, is trained to ask users for confirmation before finalizing tasks with external side effects, and is designed to identify and ignore prompt injections on websites.
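
The screenshot-in, action-out exchange described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the Responses API itself: `Action`, `FakeScreen`, `run_task`, and `scripted_model` are hypothetical names invented for this example, and the "model" is a scripted stand-in rather than a call to computer-use-preview. The point it shows is the division of labor: the model only proposes actions, and the application executes them.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One GUI action suggested by the model (hypothetical shape)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


# Hypothetical signature: (instructions, screenshot bytes) -> next action.
Model = Callable[[str, bytes], Action]


class FakeScreen:
    """In-memory stand-in for a real browser or desktop."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def screenshot(self) -> bytes:
        # A real application would capture pixels; here the log is the "image".
        return "\n".join(self.log).encode()

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.log.append(f"click({action.x},{action.y})")
        elif action.kind == "type":
            self.log.append(f"type({action.text!r})")
        else:
            raise ValueError(f"unsupported action: {action.kind}")


def run_task(model: Model, screen: FakeScreen, instructions: str,
             max_steps: int = 10) -> List[str]:
    """Send a screenshot, execute the returned action, repeat until 'done'."""
    for _ in range(max_steps):
        action = model(instructions, screen.screenshot())
        if action.kind == "done":
            break
        screen.apply(action)  # the application, not the model, performs the action
    return screen.log


def scripted_model(instructions: str, screenshot: bytes) -> Action:
    """Toy policy: click a search box, type the query, then stop."""
    seen = screenshot.decode()
    if "click" not in seen:
        return Action("click", x=200, y=40)
    if "type" not in seen:
        return Action("type", text=instructions)
    return Action("done")


if __name__ == "__main__":
    print(run_task(scripted_model, FakeScreen(), "weather in Oslo"))
```

In a real integration the scripted function would be replaced by a request to the deployed model and `FakeScreen` by a browser or desktop automation layer. The `max_steps` cap is a design choice of this sketch to keep a misbehaving loop bounded.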

Model Provider

This model is provided through the Azure OpenAI service.

Relevant documents

The following documents are applicable:

Responsible AI Considerations

Built-in safety measures - Safety is built into our models from the beginning, and reinforced at every step of our development process. In pre-training, we filter out information that we do not want our models to learn from or output, such as hate speech, adult content, sites that primarily aggregate personal information, and spam. In post-training, we align the model's behavior to our policies using techniques such as reinforcement learning with human feedback (RLHF) to improve the accuracy and reliability of the models' responses. computer-use-preview, or CUA, implements additional safeguards across several dimensions. CUA refuses many harmful tasks and illegal or regulated activities, is trained to ask users for confirmation before finalizing tasks with external side effects, and is designed to identify and ignore prompt injections on websites. Details on the CUA model are covered in the Operator System Card. We'll continue to monitor how computer-use-preview is being used and improve the model's safety as we identify new risks.

Content Filtering

Prompts and completions are passed through a default configuration of Azure AI Content Safety classification models to detect and prevent the output of harmful content. Additional classification models and configuration options are available when you deploy an Azure OpenAI model in production.

Model Specifications

Context Length: 131072
License: Custom
Training Data: October 2023
Last Updated: March 2025
Input Type: Text, Image
Output Type: Text
Publisher: OpenAI
Languages: 27

Quick Start