{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "/schemas/3.1.0-rc.7/core/evaluator-spec.json",
  "title": "Evaluator Spec",
  "description": "Advisory buyer-attached pointer (#5280) declaring HOW produced build_creative variants should be evaluated and ranked. The rank-side of the get_creative_features feature oracle: the chosen form is the SOURCE of feature evaluation that yields creative-feature values for each leaf, `feature_requirement` is an optional hard GATE over those values, and `rank_by` is the RANK ordering over the survivors. Informs the agent's best_of_n recommended/rank and, when the agent advertises creative.supports_evaluator, populates a per-leaf `eval` block (creative-feature-result[]) on the response.\n\nGate-then-rank pipeline (per produced leaf in a best_of_n run): the producing agent (1) evaluates the leaf via the chosen form, obtaining creative-feature values; (2) when `feature_requirement` is present, applies those predicates and DROPS any leaf that fails (internal best_of_n pruning) — survivors only; (3) orders survivors by `rank_by`. The gate operates BEFORE return: it prunes which leaves the agent recommends/returns from its own best_of_n exploration, never an AdCP-layer block of an already-produced billable leaf — what is produced and billed is governed by max_variants/max_creatives/max_spend, not by the evaluator. With no `feature_requirement`, evaluation stays advisory: every produced leaf is returned and billed and ranking is a soft preference, so `rank_by` failures rank lower but are never dropped.\n\nWhen the chosen form is `feature_agent` (or `agent_url`), the seller calls an external get_creative_features-capable agent, so the pointer is subject to the seller's `creative_policy.accepted_verifiers[]` allowlist exactly as #5280 established for provenance verify_agent: the buyer REPRESENTS which on-list agent it used, the seller (verifier-of-record) CALLS it. An off-list agent is rejected with `EVALUATOR_AGENT_NOT_ACCEPTED` (mirrors PROVENANCE_VERIFIER_NOT_ACCEPTED) before any outbound call; an on-list agent that is unreachable or unknown degrades to seller-default ranking with an advisory errors[] note (non-terminal), NOT a failure. Provide exactly one of three forms; an optional `feature_requirement` gate, an optional `rank_by` ordering, an optional `feature_agent` allowlisted pointer, and a shared optional eval_budget apply to whichever form is chosen.\n\nExperimental (x-status: experimental): this whole evaluator surface — the `evaluator` input, the per-leaf `eval` response block, and `creative.supports_evaluator` — is new and not yet field-tested across parties. Sellers that implement it MUST list `creative.evaluator` in `experimental_features`. Reserved 3.x follow-ons that may reshape these fields: `list_evaluators` discovery (resolves `evaluator_id`), a separate `supports_evaluator_gate` capability, and a hard MUST-enforce-gate semantic. Per docs/reference/experimental-status, it MAY change between 3.x releases with notice.",
  "x-status": "experimental",
  "type": "object",
  "properties": {
    "feature_requirement": {
      "type": "array",
      "description": "Optional hard GATE over creative-feature values — the predicates a leaf MUST satisfy for the producing agent to recommend/return it. Reuses the feature-requirement shape (min_value/max_value for quantitative features like creative_quality_score, allowed_values for binary/categorical) — the same predicate vocabulary that gates property/audience filters, which its own schema names as an intended creative-gate reuse. A leaf that fails any predicate is DROPPED from the agent's best_of_n survivors before ranking — this is internal pruning of which leaves the agent recommends, not an AdCP-layer block of an already-produced billable leaf (what is produced/billed is governed by max_variants/max_creatives/max_spend). Distinct from `rank_by`: the gate is a pass/fail predicate (drop on fail), `rank_by` is an ordering over survivors. Each predicate's `if_not_covered` (exclude|include, default exclude) is the fail-open knob when the source cannot measure that feature. A pass/warn/fail verdict is expressed as a categorical string feature value gated via `allowed_values` (e.g. [\"pass\"] or [\"pass\",\"warn\"]) — the buyer's predicate decides whether warn passes; the verdict is derived, never stored on creative-feature-result. Omit to leave evaluation advisory (no leaf is dropped).",
      "items": {
        "$ref": "/schemas/3.1.0-rc.7/core/feature-requirement.json"
      },
      "minItems": 1
    },
    "rank_by": {
      "type": "array",
      "description": "Optional RANK ordering over creative-feature values — an ordered list (most significant first) the agent uses to order the gate survivors into recommended/rank. An explicit {feature_id, direction} ordering rather than the feature-requirement predicate (which has no sort direction): the gate decides pass/fail, rank_by decides better/worse. Soft preference, never a gate: leaves are not dropped by rank_by, they are only ordered. Omit to let the evaluator/seller choose the ordering.",
      "items": {
        "type": "object",
        "properties": {
          "feature_id": {
            "type": "string",
            "description": "Creative feature to order by (discovered via get_adcp_capabilities; the same feature_id space the chosen evaluator form returns in eval.features[])."
          },
          "direction": {
            "type": "string",
            "enum": ["maximize", "minimize"],
            "default": "maximize",
            "description": "Sort direction for this feature: `maximize` ranks higher feature values first (e.g. creative_quality_score), `minimize` ranks lower values first (e.g. predicted_cpa)."
          }
        },
        "required": ["feature_id"],
        "additionalProperties": false
      },
      "minItems": 1
    },
    "feature_agent": {
      "type": "object",
      "description": "Optional buyer-attached pointer to a get_creative_features-capable creative-feature / governance agent the producing agent calls to evaluate each leaf (the gate's SOURCE of feature values). This is the buyer-represents → seller-calls pattern #5280 established for provenance, generalized to the evaluator gate: the buyer REPRESENTS which agent it used, but the seller is the verifier-of-record and decides which agent it actually calls. `agent_url` MUST appear (canonicalized per /docs/reference/url-canonicalization) in the seller's `creative_policy.accepted_verifiers[].agent_url`; an off-list agent is rejected with `EVALUATOR_AGENT_NOT_ACCEPTED` (mirrors PROVENANCE_VERIFIER_NOT_ACCEPTED) before any outbound call. Reuses the same allowlist mechanism — no new allowlist is introduced. Distinct from the `agent_url` oneOf form, which names the evaluator's source directly; `feature_agent` attaches the gate's measurement source alongside any of the three forms.",
      "properties": {
        "agent_url": {
          "type": "string",
          "format": "uri",
          "pattern": "^https://",
          "description": "URL of the get_creative_features-capable agent the producing agent calls to obtain creative-feature values for the gate. MUST use https:// and MUST match an entry in the seller's `creative_policy.accepted_verifiers[].agent_url`; off-list → `EVALUATOR_AGENT_NOT_ACCEPTED`."
        },
        "feature_id": {
          "type": "string",
          "description": "Optional canonical feature_id the producing agent SHOULD request against this agent. When present it SHOULD match the agent's `accepted_verifiers[].feature_id` or be omitted; when absent the seller selects a feature at evaluation time. Resolves selector ambiguity exactly as the provenance verify_agent.feature_id does."
        }
      },
      "required": ["agent_url"],
      "additionalProperties": false
    },
    "eval_budget": {
      "type": "object",
      "description": "Optional soft ceiling on evaluation effort. Advisory in v1 with no billing coupling. Well-known soft fields max_calls / max_seconds; open for evaluator-specific knobs.",
      "properties": {
        "max_calls": {
          "type": "integer",
          "minimum": 1,
          "description": "Soft cap on the number of judge calls the evaluator should make."
        },
        "max_seconds": {
          "type": "number",
          "minimum": 0,
          "description": "Soft cap on wall-clock seconds the evaluation should consume."
        }
      },
      "additionalProperties": true
    },
    "ext": {
      "$ref": "/schemas/3.1.0-rc.7/core/ext.json"
    }
  },
  "oneOf": [
    {
      "type": "object",
      "description": "Exemplar form: inline pass/fail examples that CALIBRATE a single agent-defined prediction feature — conventionally `predicted_performance`, a number in [0,1] — that the evaluator computes per leaf and returns as one creative-feature-result entry in eval.features[]. Exemplars are calibration, not a separate ranking path (mirroring content-standards.json calibration_exemplars): the pass/fail artifacts anchor where the prediction sits between 0 (like the fail set) and 1 (like the pass set). `rank_by` then orders survivors on that one feature_id (e.g. {feature_id: predicted_performance, direction: maximize}). Artifact-based — a creative variant is content to be measured. Like all three forms, this resolves to \"produce a feature value, rank by it.\"",
      "required": [
        "exemplars"
      ],
      "properties": {
        "exemplars": {
          "type": "object",
          "description": "Pass/fail examples that calibrate the single prediction feature (e.g. `predicted_performance` in [0,1]) the evaluator returns in eval.features[]. Artifact-based, mirroring content-standards.json calibration_exemplars.",
          "properties": {
            "pass": {
              "type": "array",
              "items": {
                "$ref": "/schemas/3.1.0-rc.7/content-standards/artifact.json"
              },
              "description": "Artifacts exemplifying variants the buyer considers good — the high end (≈1) of the calibrated prediction feature."
            },
            "fail": {
              "type": "array",
              "items": {
                "$ref": "/schemas/3.1.0-rc.7/content-standards/artifact.json"
              },
              "description": "Artifacts exemplifying variants the buyer considers bad — the low end (≈0) of the calibrated prediction feature."
            }
          }
        }
      },
      "additionalProperties": true
    },
    {
      "type": "object",
      "description": "Identifier form: an account-scoped house evaluator that produces creative-feature values for each leaf. Its discovery surface, `list_evaluators`, is a committed 3.x follow-on; until it lands, `evaluator_id` resolves out-of-band by prior buyer/seller arrangement — which is why this whole evaluator surface is x-status: experimental (see the schema-root note).",
      "required": [
        "evaluator_id"
      ],
      "properties": {
        "evaluator_id": {
          "type": "string",
          "x-entity": "evaluator",
          "description": "Account-scoped house evaluator selected by the buyer. Discovery via `list_evaluators` is a committed 3.x follow-on; until it lands, this resolves out-of-band by prior buyer/seller arrangement (the reason the surface is experimental). An unknown id degrades to seller-default ranking (advisory errors[] note), not a failure."
        }
      },
      "additionalProperties": true
    },
    {
      "type": "object",
      "description": "Agent form: an external judge agent the seller calls via the get_creative_features contract (/docs/governance/creative/get_creative_features) to produce creative-feature values for each leaf — the SAME contract and the same results[] (creative-feature-result[]) shape get_creative_features returns. One creative-feature agent contract spans gate, rank, and provenance. Subject to the same `creative_policy.accepted_verifiers[]` allowlist as `feature_agent`: an off-list agent_url is rejected with `EVALUATOR_AGENT_NOT_ACCEPTED` before any outbound call. eval_budget bounds the calls; calls_used/seconds_used on the leaf eval block report what was spent (reuse the oracle's vendor_cost/consumption for cost).",
      "required": [
        "agent_url"
      ],
      "properties": {
        "agent_url": {
          "type": "string",
          "format": "uri",
          "pattern": "^https://",
          "description": "URL of an external get_creative_features-capable judge agent the seller calls to score the produced leaves. MUST match an entry in the seller's `creative_policy.accepted_verifiers[].agent_url` (off-list → `EVALUATOR_AGENT_NOT_ACCEPTED`); an on-list agent that is unreachable degrades to seller-default ranking (advisory errors[] note), not a failure."
        }
      },
      "additionalProperties": true
    }
  ],
  "additionalProperties": true
}