computer-use-preview
computer-use-preview is the model behind the Computer Use Agent (CUA), available through the Responses API. The model returns instructions for controlling a browser on your computer screen, so that your application can take action on a user's behalf.
Key capabilities
About this model
CUA in the API does not operate computers or browsers itself. Applications send CUA screenshots of a computer along with instructions, and the CUA model responds with the actions for the application to take, such as navigating, clicking the pointer, and entering text.
Key model capabilities
Core capabilities
- Model Qualities: While CUA is still early and has limitations, it sets new state-of-the-art benchmark results, achieving a 38.1% success rate on OSWorld for full computer use tasks, and 58.1% on WebArena and 87% on WebVoyager for web-based tasks. These results highlight CUA's ability to navigate and operate across diverse environments using a single general action space.
- Safety: CUA has been extensively tested for safety, and implements safeguards across several dimensions. CUA refuses many harmful tasks and illegal or regulated activities, is trained to ask users for confirmation before finalizing tasks with external side effects, and is designed to identify and ignore prompt injections on websites.
See Responsible AI for additional considerations for responsible use.
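The screenshot-in, action-out loop described under "About this model" can be sketched as follows. This is a minimal illustration, not the service's actual API: `RecordingComputer`, `execute`, `run_loop`, `scripted_model`, and the action dictionaries (`type`, `x`, `y`, `text`) are all hypothetical names chosen for the example, and the model is replaced by a scripted stand-in so the code runs without network access.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class RecordingComputer:
    """Hypothetical stand-in for a real browser or desktop driver."""
    log: list = field(default_factory=list)

    def screenshot(self) -> bytes:
        # A real driver would return image bytes of the current screen.
        return b"fake-screenshot"

    def click(self, x: int, y: int) -> None:
        self.log.append(("click", x, y))

    def type_text(self, text: str) -> None:
        self.log.append(("type", text))


def execute(computer: RecordingComputer, action: dict) -> None:
    """Carry out one suggested action; refuse anything not explicitly supported."""
    kind = action.get("type")
    if kind == "click":
        computer.click(int(action["x"]), int(action["y"]))
    elif kind == "type":
        computer.type_text(str(action["text"]))
    else:
        raise ValueError(f"unsupported action type: {kind!r}")


def run_loop(
    computer: RecordingComputer,
    next_action: Callable[[bytes], Optional[dict]],
    max_steps: int = 10,
) -> int:
    """Send a screenshot, execute the returned action, repeat until done.

    Returns the number of actions executed.
    """
    for step in range(max_steps):
        action = next_action(computer.screenshot())
        if action is None:  # assumed convention: None means the task is finished
            return step
        execute(computer, action)
    return max_steps


def scripted_model(actions: list) -> Callable[[bytes], Optional[dict]]:
    """Fake model that replays a fixed list of actions, then signals completion."""
    remaining = iter(actions)
    return lambda screenshot: next(remaining, None)


if __name__ == "__main__":
    pc = RecordingComputer()
    plan = [{"type": "click", "x": 10, "y": 20}, {"type": "type", "text": "hello"}]
    print(run_loop(pc, scripted_model(plan)), pc.log)
```

In a real application, `scripted_model` would be replaced by a call to the model through the Responses API, and `RecordingComputer` by an actual browser or desktop automation layer. Rejecting unknown action types and capping the number of steps are design choices of this sketch: the application, not the model, decides what actually gets executed.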
Key use cases
The provider has not supplied this information.
Out of scope use cases
CUA cannot reliably ensure human-in-the-loop intervention. Developers need to systematically anticipate, and defend against, situations where the model can be fooled into executing commands that are harmful to the user or the system, such as downloading malware, leaking credentials, or issuing fraudulent financial transactions. Pay particular attention to the fact that screenshot inputs are untrusted by nature and may include malicious instructions aimed at the model.
Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.
Quick facts
Model provider: Azure OpenAI
Type: Responses
Lifecycle: Generally available (GA)
Input type: text, image
Output type: text
Context window: 131,072 tokens
Token limits: 16,384 output tokens
Pricing: View pricing