AWS What's New · Jun 3, 2026

Amazon SageMaker AI adds multi-turn reinforcement learning

feature

Amazon SageMaker AI now supports multi-turn reinforcement learning for customizing foundation models on multi-step, agentic tasks. This serverless capability manages the training loop and rewards full decision sequences, so smaller models can be specialized for specific workloads. It is available today through SageMaker Studio and the SageMaker Python SDK; supported models vary by Region, and agents can run on AWS services or on custom infrastructure.

→New multi-turn reinforcement learning for AI agent customization
→Simplified agent training and management
→Integrated tracking and evaluation tools
→Serverless operation and cost efficiency
→Availability and supported models

Features (1)

New multi-turn reinforcement learning for AI agent customization

SageMaker AI introduces multi-turn reinforcement learning (RL), a serverless technique for fine-tuning models on multi-step agent tasks. It trains models against custom agent environments and rewards the complete sequence of decisions an agent makes, so that smaller models can be specialized to match larger ones on target workloads.
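
To make "rewarding the complete sequence of decisions" concrete, here is a minimal sketch in plain Python. It is not the SageMaker API: the `Turn` and `Trajectory` types, the toy `trajectory_reward` function, the step penalty, and the discounting scheme are all hypothetical, chosen only to illustrate how a single episode-level reward can be credited back to each turn.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Turn:
    """One agent step: what it observed and what it did (hypothetical type)."""
    observation: str
    action: str


@dataclass
class Trajectory:
    """A full multi-turn episode (hypothetical type)."""
    turns: List[Turn] = field(default_factory=list)
    task_solved: bool = False


def trajectory_reward(traj: Trajectory, step_penalty: float = 0.05) -> float:
    """Score the whole episode: 1.0 for success, minus a small cost per turn."""
    base = 1.0 if traj.task_solved else 0.0
    return base - step_penalty * len(traj.turns)


def per_turn_returns(traj: Trajectory, gamma: float = 0.9) -> List[float]:
    """Credit the episode-level reward to each turn, discounted by its distance from the end."""
    final = trajectory_reward(traj)
    n = len(traj.turns)
    return [final * gamma ** (n - 1 - i) for i in range(n)]


# Illustrative two-turn episode; the task and tool names are made up.
episode = Trajectory(
    turns=[
        Turn("user asks for order status", "call lookup_order"),
        Turn("tool returns order record", "reply with status"),
    ],
    task_solved=True,
)
print(trajectory_reward(episode))
print(per_turn_returns(episode))
```

The point of the sketch is that no turn is scored in isolation: each turn's value is derived from how the whole episode ended, which is what distinguishes trajectory-level rewards from single-response feedback.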

Enhancements (3)

Simplified agent training and management

Multi-turn RL handles the full training loop, including rollout orchestration, trajectory collection, and checkpoint management, so no custom training infrastructure is needed. Users can connect agents that run on AWS services or on their own infrastructure.

Integrated tracking and evaluation tools

Built-in MLflow tracking lets users inspect agent trajectories, rewards, and traces. Evaluation jobs report reward, pass@k, and trajectory metrics for benchmarking before deployment.
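
For readers unfamiliar with pass@k: it estimates the probability that at least one of k sampled attempts at a task succeeds. The sketch below uses the commonly cited unbiased estimator, 1 - C(n-c, k) / C(n, k), for n attempts of which c passed. Whether SageMaker evaluation jobs compute the metric exactly this way is an assumption, and the function name `pass_at_k` and the sample numbers are illustrative only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n attempts with c successes.

    Assumed formula (not confirmed by the announcement):
    1 - C(n - c, k) / C(n, k), the chance that a random size-k
    subset of the n attempts contains at least one success.
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # Fewer than k failures means every size-k subset includes a success.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative numbers: 10 rollouts of one task, 3 of which succeeded.
print(pass_at_k(n=10, c=3, k=1))
print(pass_at_k(n=10, c=3, k=5))
```

With k equal to 1 the estimate reduces to the plain success rate c / n; larger k shows how much repeated sampling would help on the same task.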

Serverless operation and cost efficiency

Multi-turn RL is fully serverless: users pay only for the tokens processed and do not provision or manage infrastructure, which ties cost to actual usage.

Notes (1)

Availability and supported models

Multi-turn RL is available now through SageMaker Studio and the SageMaker Python SDK. Model support varies by Region; Qwen 3.6 27B, Nova Lite 2.0, GPT-OSS-20B, and Gemma 31B are among the models listed for us-west-2 and us-east-1.

Read the original announcement →

https://aws.amazon.com/about-aws/whats-new/2026/06/multi-turn-reinforcement-learning-on-sagemaker-ai/
