Anthropic Claude Platform Release Notes

Claude Platform Enhancements: Output Capping and Refusal Billing


Claude Platform now supports a `max_tokens` parameter on the advisor tool, which caps the advisor model's output per call to reduce latency and token costs. In addition, requests that end in a refusal without any output are no longer billed. Both changes are available on the Claude API.

Enhancements (2)
  • Advisor tool supports `max_tokens` parameter for capped output

    The advisor tool now includes a `max_tokens` parameter. This allows developers to cap the advisor model's output per call, which can reduce latency and token costs for specific workloads.

  • No billing for refusal without output on Claude API

    Requests that end with `stop_reason: "refusal"` and contain no generated output from Claude no longer incur charges. The change applies to billing on the Claude API.
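
The two enhancements above can be sketched in client code as follows. This is a minimal illustration under stated assumptions, not the official schema: only `max_tokens` and `stop_reason: "refusal"` come from the release note. The helper names `build_advisor_tool` and `is_refusal_without_output`, the `type` and `model` keys on the tool definition, and the rule that "no output" means no non-empty text block in `content` are all hypothetical. Check the advisor tool documentation for the actual field names before relying on this.

```python
from typing import Any


def build_advisor_tool(tool_type: str, model: str, max_tokens: int) -> dict[str, Any]:
    """Return an advisor tool definition that caps the advisor's output.

    Only `max_tokens` is named in the release note. The `type` and `model`
    keys are assumptions about the tool schema, so the caller supplies them.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    return {"type": tool_type, "model": model, "max_tokens": max_tokens}


def is_refusal_without_output(response: dict[str, Any]) -> bool:
    """Return True when a response stopped with a refusal and has no output."""
    if response.get("stop_reason") != "refusal":
        return False
    # Assumption: "no output" means no content block carries non-empty text.
    blocks = response.get("content") or []
    return not any(block.get("text") for block in blocks)


# Placeholder values: substitute the tool type and model from the official docs.
advisor_tool = build_advisor_tool("<advisor-tool-type>", "<advisor-model>", max_tokens=512)

# A response parsed into a plain dict, as it might arrive from the API.
refused = {"stop_reason": "refusal", "content": []}
if is_refusal_without_output(refused):
    print("Refusal with no output; per the release note this request is not billed.")
```

A client might use a check like `is_refusal_without_output` to tag such requests in its own cost tracking, while the tool definition would be passed in the request's tool list alongside any other tools.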

Read the original announcement →

https://platform.claude.com/docs/en/release-notes/overview