Muse
Muse is a World and Human Action Model (WHAM), a generative model of gameplay (visuals and/or controller actions).
Key capabilities
About this model
Muse is an autoregressive model that has been trained to predict (tokenized) game visuals and controller actions given a prompt. The resulting model can generate consistent game sequences, and shows evidence of capturing the 3D structure of the game environment, the effects of controller actions, and the temporal structure of the game (up to the model's context length).
Key model capabilities
This allows the user to run the model in (a) world modelling mode (generate visuals given controller actions), (b) behavior policy mode (generate controller actions given past visuals), or (c) full generation mode (generate both visuals and behavior). Muse can be used in multiple scenarios. The following list illustrates the types of tasks that Muse can be used for:
- World Model: Visuals are predicted, given a real starting state and action sequence.
- Behaviour Policy: Given visuals, the model predicts the next controller action.
- Full Generation: The model generates both the visuals and the controller actions a human player might take in the game.
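The three modes above can be read as one sampling loop in which the caller decides which stream is supplied and which is generated. The sketch below is illustrative only. `ToySequenceModel`, `rollout`, the integer token ids, and the strict frame/action interleaving are all assumptions made for the example; none of them is the Muse API, and the toy predictor is a deterministic hash rather than a trained network.

```python
from typing import List, Optional, Sequence, Tuple

# A token is ("frame", id) or ("action", id). Real frames would be many
# image tokens; a single integer per frame is a simplifying assumption.
Token = Tuple[str, int]


class ToySequenceModel:
    """Hypothetical stand-in for an autoregressive model (not the Muse API)."""

    def _mix(self, context: Sequence[Token], modulus: int) -> int:
        # Deterministic function of the whole context, so that every
        # prediction depends on all earlier frames and actions.
        h = 17
        for kind, value in context:
            h = (h * 31 + value + (1 if kind == "action" else 0)) % 1_000_003
        return h % modulus

    def next_frame(self, context: Sequence[Token]) -> int:
        return self._mix(context, 1000)  # toy visual vocabulary of 1000 ids

    def next_action(self, context: Sequence[Token]) -> int:
        return self._mix(context, 16)  # toy controller vocabulary of 16 ids


def rollout(
    model: ToySequenceModel,
    prompt: Sequence[Token],
    steps: int,
    actions: Optional[Sequence[int]] = None,
    frames: Optional[Sequence[int]] = None,
) -> List[Token]:
    """Extend `prompt` by `steps` (frame, action) pairs.

    actions given, frames None -> world model mode
    frames given, actions None -> behavior policy mode
    neither given              -> full generation mode
    """
    for name, given in (("actions", actions), ("frames", frames)):
        if given is not None and len(given) < steps:
            raise ValueError(f"need at least {steps} {name}")

    context: List[Token] = list(prompt)
    for t in range(steps):
        # Use the supplied frame if there is one, otherwise predict it.
        frame = frames[t] if frames is not None else model.next_frame(context)
        context.append(("frame", frame))
        # Same choice for the controller action.
        action = actions[t] if actions is not None else model.next_action(context)
        context.append(("action", action))
    return context[len(prompt):]


model = ToySequenceModel()
prompt = [("frame", 1), ("action", 2)]  # the "real starting state"

world_model = rollout(model, prompt, steps=3, actions=[4, 5, 6])
behavior_policy = rollout(model, prompt, steps=3, frames=[7, 8, 9])
full_generation = rollout(model, prompt, steps=3)
```

In this sketch the only difference between the modes is which arguments are passed to `rollout`, which mirrors the idea that a single model trained on both visuals and actions can serve all three roles.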
See Responsible AI for additional considerations for responsible use.
Key use cases
This model and accompanying code are intended for academic research purposes only. Muse has been trained on gameplay data from a single game, Bleeding Edge, and is intended to be used to generate plausible gameplay sequences resembling this game.
Out of scope use cases
The model is not intended to be used to generate imagery outside of the game Bleeding Edge. Generated images include a watermark and provenance metadata. Do not remove the watermark or provenance metadata.
Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.
Quick facts
Model provider: Microsoft
Type: Image to image
Lifecycle: Generally available (GA)
Input type: image
Output type: image
Pricing: View pricing