Anthropic Claude Platform Release Notes · Jun 2, 2026

Claude Platform Enhancements: Output Capping and Refusal Billing

The advisor tool on the Claude Platform now accepts a `max_tokens` parameter that caps the advisor model's output per call, which can reduce latency and token costs. Separately, requests that end in a refusal without any generated output are no longer billed. Both changes are available on the Claude API.

→ Advisor tool supports `max_tokens` parameter for capped output
→ No billing for refusal without output on Claude API

Enhancements (2)

Advisor tool supports `max_tokens` parameter for capped output

The advisor tool now includes a `max_tokens` parameter. This allows developers to cap the advisor model's output per call, which can reduce latency and token costs for specific workloads.
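
As a rough illustration of where such a cap might sit, the sketch below builds a tool definition as a plain dictionary. Only the `max_tokens` parameter comes from the release note. The `type`, `name`, and `model` fields, the helper name `build_advisor_tool`, and the model string are hypothetical placeholders, so check the official advisor tool documentation for the actual request shape before relying on it.

```python
from typing import Any


def build_advisor_tool(advisor_model: str, max_tokens: int) -> dict[str, Any]:
    """Return a tool definition dict carrying a per-call output cap.

    Only `max_tokens` is named in the release note; every other field
    here is an assumed placeholder, not a confirmed API field.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    return {
        "type": "advisor",         # assumed placeholder, not a confirmed type string
        "name": "advisor",         # assumed placeholder
        "model": advisor_model,    # assumed placeholder field
        "max_tokens": max_tokens,  # caps the advisor model's output per call
    }


# Hypothetical usage: cap each advisor call at 1024 output tokens.
tool = build_advisor_tool("example-advisor-model", 1024)
```

A lower cap trades completeness of the advisor's output for speed and cost, so a sensible value depends on the workload.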

No billing for refusal without output on Claude API

Requests that end with `stop_reason: "refusal"` and no generated output from Claude no longer incur charges.
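
For client-side cost tracking, a check along the following lines could flag responses that appear to fall under this rule. It is a minimal sketch that works on a response already parsed into a dictionary. The helper name `is_empty_refusal` is hypothetical, and treating "no output" as zero `usage.output_tokens` plus no content blocks is an assumption, not the documented billing criterion.

```python
from typing import Any


def is_empty_refusal(response: dict[str, Any]) -> bool:
    """Return True if the response is a refusal with no generated output.

    Assumption: "no output" means zero reported output tokens and no
    content blocks. The actual billing rule may be defined differently.
    """
    if response.get("stop_reason") != "refusal":
        return False
    usage = response.get("usage") or {}
    if usage.get("output_tokens", 0) > 0:
        return False
    # Any content block at all is treated as generated output.
    return not (response.get("content") or [])


# Hypothetical parsed responses for illustration.
empty_refusal = {
    "stop_reason": "refusal",
    "content": [],
    "usage": {"input_tokens": 42, "output_tokens": 0},
}
partial_refusal = {
    "stop_reason": "refusal",
    "content": [{"type": "text", "text": "I can help with part of"}],
    "usage": {"input_tokens": 42, "output_tokens": 7},
}
normal_reply = {
    "stop_reason": "end_turn",
    "content": [{"type": "text", "text": "Hello"}],
    "usage": {"input_tokens": 5, "output_tokens": 1},
}

empty_flags = [is_empty_refusal(r) for r in (empty_refusal, partial_refusal, normal_reply)]
```

A check like this is only a local estimate. The usage and billing data reported by the platform remains the source of truth.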

Original announcement: https://platform.claude.com/docs/en/release-notes/overview