computer-use-preview

computer-use-preview is the model behind the Computer Use Agent (CUA), available through the Responses API. You can use the computer-use-preview model to get instructions for controlling a browser or computer screen, so that your application can take actions on a user's behalf.
Azure OpenAI
Version: 2025-03-11

Key capabilities

About this model

CUA in the API does not operate computers or browsers itself. The application sends CUA screenshots of a computer along with instructions, and the model responds with the actions for the application to carry out, such as navigating, clicking the pointer, and entering text.
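
The division of labor described above can be sketched as a small loop in which the application, not the model, performs every action. This is a minimal illustration under stated assumptions, not the service's client code: `RecordingComputer`, `execute_action`, `run_loop`, and the `get_next_action` callable are hypothetical names, and the action dictionaries (`{"type": "click", ...}`, `{"type": "type", ...}`) are an assumed shape. In a real application, `get_next_action` would wrap a call to the Responses API and the computer object would drive an actual browser.

```python
from typing import Any, Callable, Dict, List, Optional

Action = Dict[str, Any]


class RecordingComputer:
    """Hypothetical stand-in for a browser or desktop driver; it only logs calls."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def screenshot(self) -> bytes:
        # A real driver would return PNG bytes of the current screen.
        return b""

    def click(self, x: int, y: int, button: str = "left") -> None:
        self.log.append(f"click({x}, {y}, {button})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")


def execute_action(computer: RecordingComputer, action: Action) -> None:
    """Map one model-suggested action (assumed shape) onto the driver."""
    kind = action.get("type")
    if kind == "click":
        computer.click(action["x"], action["y"], action.get("button", "left"))
    elif kind == "type":
        computer.type_text(action["text"])
    else:
        # Refuse anything the application does not explicitly support.
        raise ValueError(f"unsupported action type: {kind!r}")


def run_loop(
    computer: RecordingComputer,
    get_next_action: Callable[[bytes], Optional[Action]],
    max_steps: int = 10,
) -> int:
    """Send a screenshot, execute the returned action, and repeat.

    Stops when the model returns no further action or the step cap is hit.
    Returns the number of actions executed.
    """
    steps = 0
    while steps < max_steps:
        action = get_next_action(computer.screenshot())
        if action is None:
            break
        execute_action(computer, action)
        steps += 1
    return steps
```

The step cap and the explicit `ValueError` for unknown action types are design choices for this sketch: they keep the application in control of what is executed and for how long.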

Key model capabilities

Core capabilities
  • Model Qualities: While CUA is still early and has limitations, it sets new state-of-the-art benchmark results, achieving a 38.1% success rate on OSWorld for full computer use tasks, and 58.1% on WebArena and 87% on WebVoyager for web-based tasks. These results highlight CUA's ability to navigate and operate across diverse environments using a single general action space.
  • Safety: CUA has been extensively tested for safety, and implements safeguards across several dimensions. CUA refuses many harmful tasks and illegal or regulated activities, is trained to ask users for confirmation before finalizing tasks with external side effects, and is designed to identify and ignore prompt injections on websites.
See Responsible AI for additional considerations for responsible use.

Key use cases

The provider has not supplied this information.

Out of scope use cases

CUA cannot reliably ensure human-in-the-loop intervention. Developers will need to be systematically aware of, and defend against, situations where the model can be fooled into executing commands that are harmful to the user or the system, such as downloading malware, leaking credentials, or issuing fraudulent financial transactions. Particular attention should be paid to the fact that screenshot inputs are untrusted by nature and may include malicious instructions aimed at the model.
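
One way an application might defend against the situations above is to review every suggested action before executing it. The sketch below is only an illustration under assumptions: `host_is_allowed`, `review_action`, the `CONFIRM_ACTION_TYPES` set, and the action dictionary shape are hypothetical and not part of the service. It blocks actions on pages outside a host allowlist and asks a human to confirm actions that enter text. A check like this reduces risk but does not by itself guarantee human-in-the-loop control.

```python
from typing import Any, Callable, Dict, Iterable
from urllib.parse import urlparse

Action = Dict[str, Any]

# Hypothetical policy: action types that need explicit user confirmation.
CONFIRM_ACTION_TYPES = {"type", "keypress"}


def host_is_allowed(url: str, allowed_hosts: Iterable[str]) -> bool:
    """True if the URL's host is an allowed host or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)


def review_action(
    action: Action,
    current_url: str,
    allowed_hosts: Iterable[str],
    confirm: Callable[[Action], bool],
) -> bool:
    """Return True only if the application may execute this action."""
    # Treat the page as untrusted: never act outside the allowlist.
    if not host_is_allowed(current_url, allowed_hosts):
        return False
    # Text entry can leak credentials, so ask the user first.
    if action.get("type") in CONFIRM_ACTION_TYPES:
        return bool(confirm(action))
    return True
```

In use, `confirm` would be a callback that shows the pending action to the user and returns their decision; the application skips or aborts whenever `review_action` returns `False`.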

Pricing

Pricing is based on a number of factors, including deployment type and tokens used. See the pricing details page.

Quick facts

Model provider: Azure OpenAI
Type: Responses
Lifecycle: Generally available (GA)
Input type: text, image
Output type: text
Context window: 131,072 tokens
Token limits: 16,384 output