Fine-tuning

Run a grader

POST /openai/v1/fine_tuning/alpha/graders/run

Runs the specified grader and returns its evaluation results.

Request Body
Content-Type: application/json
grader: OpenAI.GraderStringCheck | OpenAI.GraderTextSimilarity | OpenAI.GraderPython | OpenAI.GraderScoreModel | OpenAI.GraderMulti, required
The grader used for the fine-tuning job.
One of the following:
OpenAI.GraderStringCheck
A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
OpenAI.GraderTextSimilarity
A TextSimilarityGrader object which grades text based on similarity metrics.
OpenAI.GraderPython
A PythonGrader object that runs a Python script on the input.
OpenAI.GraderScoreModel
A ScoreModelGrader object that uses a model to assign a score to the input.
OpenAI.GraderMulti
A MultiGrader object that combines the output of multiple graders to produce a single score.
item: OpenAI.RunGraderRequestItem
The dataset item provided to the grader. This will be used to populate the item namespace. See the guide for more details.
model_sample: string, required
The model sample to be evaluated. This value will be used to populate the sample namespace. See the guide for more details. The output_json variable will be populated if the model sample is a valid JSON string.
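As a rough local illustration of the two namespaces described above, the sketch below builds an `item` namespace from the dataset item and a `sample` namespace from the model sample, setting `output_json` only when the sample parses as JSON. The helper names `build_namespaces` and `render` are hypothetical, and the placeholder substitution is an assumption about how `{{item.*}}` and `{{sample.*}}` references resolve; the service's actual templating may differ.

```python
import json
import re


def build_namespaces(item, model_sample):
    """Hypothetical helper: mimic the `item` and `sample` namespaces locally."""
    sample = {"output_text": model_sample}
    try:
        # output_json is populated only if the model sample is valid JSON.
        sample["output_json"] = json.loads(model_sample)
    except json.JSONDecodeError:
        pass  # not valid JSON, so output_json stays unset
    return {"item": item, "sample": sample}


# Matches placeholders such as {{item.reference_answer}}.
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template, namespaces):
    """Hypothetical helper: replace dotted placeholders with namespace values."""
    def lookup(match):
        value = namespaces
        for key in match.group(1).split("."):
            value = value[key]
        return str(value)

    return _PLACEHOLDER.sub(lookup, template)


namespaces = build_namespaces(
    {"reference_answer": "fuzzy wuzzy was a bear"},
    "fuzzy wuzzy was a bear",
)
prompt = render(
    "Reference answer: {{item.reference_answer}}\n\n"
    "Model answer: {{sample.output_text}}",
    namespaces,
)
```

Here the plain-text sample leaves `output_json` unset, while a sample such as `{"answer": 4}` would make it available as a parsed object.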
Responses
reward: number, required
metadata: OpenAI.RunGraderResponseMetadata, required
  name: string, required
  type: string, required
  errors: OpenAI.RunGraderResponseMetadataErrors, required
  execution_time: number, required
  scores: object, required
  token_usage: integer | null, required
  sampled_model_name: string | null, required
sub_rewards: object, required
model_grader_token_usage_per_model: object, required
Request
curl -X POST https://api.openai.com/v1/fine_tuning/alpha/graders/run \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "grader": {
      "type": "score_model",
      "name": "Example score model grader",
      "input": [
        {
          "role": "user",
          "content": [
            {
              "type": "input_text",
              "text": "Score how close the reference answer is to the model answer on a 0-1 scale. Return only the score.\n\nReference answer: {{item.reference_answer}}\n\nModel answer: {{sample.output_text}}"
            }
          ]
        }
      ],
      "model": "gpt-5-mini",
      "sampling_params": {
        "temperature": 1,
        "top_p": 1,
        "seed": 42
      }
    },
    "item": {
      "reference_answer": "fuzzy wuzzy was a bear"
    },
    "model_sample": "fuzzy wuzzy was a bear"
  }'
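The same request can be assembled in Python. The following is a minimal sketch using only the standard library; the function name `build_run_grader_request` is hypothetical, and the URL is copied from the curl example, so adjust it for your own endpoint. Serializing with `json.dumps` escapes the newlines in the prompt automatically. The sketch only builds the request object and does not send it.

```python
import json
import os
import urllib.request

# Assumption: same URL as the curl example; replace it for your deployment.
RUN_GRADER_URL = "https://api.openai.com/v1/fine_tuning/alpha/graders/run"


def build_run_grader_request(grader, item, model_sample, api_key, url=RUN_GRADER_URL):
    """Hypothetical helper: build (but do not send) the POST request."""
    body = json.dumps({"grader": grader, "item": item, "model_sample": model_sample})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


grader = {
    "type": "score_model",
    "name": "Example score model grader",
    "input": [
        {
            "role": "user",
            "content": [
                {
                    "type": "input_text",
                    "text": (
                        "Score how close the reference answer is to the model "
                        "answer on a 0-1 scale. Return only the score.\n\n"
                        "Reference answer: {{item.reference_answer}}\n\n"
                        "Model answer: {{sample.output_text}}"
                    ),
                }
            ],
        }
    ],
    "model": "gpt-5-mini",
    "sampling_params": {"temperature": 1, "top_p": 1, "seed": 42},
}

request = build_run_grader_request(
    grader,
    {"reference_answer": "fuzzy wuzzy was a bear"},
    "fuzzy wuzzy was a bear",
    os.environ.get("OPENAI_API_KEY", ""),
)
```

To send it, pass `request` to `urllib.request.urlopen` and decode the body with `json.load`.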
Response
{
  "reward": 1.0,
  "metadata": {
    "name": "Example score model grader",
    "type": "score_model",
    "errors": {
      "formula_parse_error": false,
      "sample_parse_error": false,
      "truncated_observation_error": false,
      "unresponsive_reward_error": false,
      "invalid_variable_error": false,
      "other_error": false,
      "python_grader_server_error": false,
      "python_grader_server_error_type": null,
      "python_grader_runtime_error": false,
      "python_grader_runtime_error_details": null,
      "model_grader_server_error": false,
      "model_grader_refusal_error": false,
      "model_grader_parse_error": false,
      "model_grader_server_error_details": null
    },
    "execution_time": 4.365238428115845,
    "scores": {},
    "token_usage": {
      "prompt_tokens": 190,
      "total_tokens": 324,
      "completion_tokens": 134,
      "cached_tokens": 0
    },
    "sampled_model_name": "gpt-4o-2024-08-06"
  },
  "sub_rewards": {},
  "model_grader_token_usage_per_model": {
    "gpt-4o-2024-08-06": {
      "prompt_tokens": 190,
      "total_tokens": 324,
      "completion_tokens": 134,
      "cached_tokens": 0
    }
  }
}
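One way to inspect a response like the one above is to collect any error flags that are set and to total the grader's token usage across models. This is a minimal sketch: `raised_errors` and `total_grader_tokens` are hypothetical helper names, and the `response` dictionary is an abbreviated copy of the example (only two of the error flags are shown).

```python
def raised_errors(response):
    """Return the names of boolean error flags that are set to true."""
    errors = response["metadata"]["errors"]
    # Detail fields such as *_error_details may be null or strings; skip them.
    return sorted(name for name, value in errors.items() if value is True)


def total_grader_tokens(response):
    """Sum total_tokens over every model in model_grader_token_usage_per_model."""
    usage = response.get("model_grader_token_usage_per_model", {})
    return sum(per_model.get("total_tokens", 0) for per_model in usage.values())


# Abbreviated copy of the example response.
response = {
    "reward": 1.0,
    "metadata": {
        "name": "Example score model grader",
        "type": "score_model",
        "errors": {
            "sample_parse_error": False,
            "model_grader_server_error_details": None,
        },
    },
    "sub_rewards": {},
    "model_grader_token_usage_per_model": {
        "gpt-4o-2024-08-06": {
            "prompt_tokens": 190,
            "total_tokens": 324,
            "completion_tokens": 134,
            "cached_tokens": 0,
        }
    },
}

failed = raised_errors(response)
tokens = total_grader_tokens(response)
```

Checking the error flags before trusting `reward` is a design choice of this sketch rather than a documented requirement.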