Amazon engineers distill Anthropic models to lower costs before token pricing

By AI Daily Post • Edited by Brian Petersen, Editor-in-Chief

June 29, 2026 • 2 min read

Why does this matter? Amazon’s engineers are reportedly distilling Anthropic’s Claude models into smaller, cheaper versions before Amazon’s payments for those models switch to token-based pricing next year. Distillation, the technique they’re using, lets a compact model learn from the outputs of a larger one, trimming compute needs without a full rebuild. A source familiar with the arrangement says Amazon has the right to run this process, similar to Apple’s arrangement with Google Gemini.
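To make the idea concrete, here is a minimal sketch of the classic soft-target distillation loss, in which a student model is penalized for diverging from a teacher's softened output distribution. The report does not describe the method Amazon actually uses, so this is a textbook formulation only; the function names, the temperature value, and the three-token logits are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over a three-token vocabulary (illustration only).
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different token

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
print(loss_close < loss_far)
```

Training the student means adjusting its weights to push this loss down across many examples, so a student that tracks the teacher closely scores lower than one that disagrees with it.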

While Bedrock already offers a distillation service, it currently supports only Amazon’s Nova and Meta’s Llama models, not Claude. The effort appears tied to a renegotiated partnership, according to The Information: starting next year, Amazon will pay for Anthropic’s models per token processed rather than per compute hour, which could push its costs up sharply. An Amazon spokesperson disputes that, saying the changes won’t raise costs, while Anthropic points to its lower prices relative to performance.

Amazon is also reportedly looking at OpenAI and its own Nova line as alternatives. The tech giant has committed up to $25 billion more to Anthropic and up to $50 billion to OpenAI this year, underscoring how high the stakes have become.

Amazon has certain rights to use Anthropic's models for this purpose, according to a person familiar with the matter, similar to Apple's arrangement with Google Gemini. Amazon does offer a distillation service on its Bedrock cloud platform, but Anthropic's Claude models aren't available there; only Amazon's own Nova models and Meta's Llama models are supported. The effort ties back to a renegotiation of the partnership, according to The Information.

Starting next year, Amazon will pay for Anthropic's models based on tokens processed rather than compute hours, which could push costs up sharply. An Amazon spokesperson pushed back, saying the changes from the expanded partnership won't raise costs. Anthropic points to lower prices relative to the performance its models deliver.
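The difference between the two billing schemes comes down to simple arithmetic, sketched below. Every rate and volume here is a hypothetical placeholder chosen for round numbers, not a figure from the report, and the function names are illustrative.

```python
def compute_hour_cost(hours, rate_per_hour):
    """Cost when billed for reserved compute time, regardless of throughput."""
    return hours * rate_per_hour


def token_cost(tokens, rate_per_million_tokens):
    """Cost when billed for every token processed."""
    return tokens / 1_000_000 * rate_per_million_tokens


def break_even_tokens(hours, rate_per_hour, rate_per_million_tokens):
    """Token volume at which both schemes cost the same."""
    hourly_total = compute_hour_cost(hours, rate_per_hour)
    return hourly_total / rate_per_million_tokens * 1_000_000


# Hypothetical placeholders: 100 compute hours at 10.0 per hour,
# versus 2.0 per million tokens.
threshold = break_even_tokens(100, 10.0, 2.0)
print(threshold)  # token volume above which per-token billing costs more
```

Under compute-hour billing, a buyer that pushes more tokens through the same hardware pays nothing extra. Under per-token billing the bill scales with volume, so heavy workloads above the break-even point become more expensive, and a smaller distilled model that handles part of that traffic reduces the number of billable tokens.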

Amazon is reportedly exploring alternatives like OpenAI and its own Nova models.

Amazon engineers are reportedly distilling Anthropic models to cut costs before new token-based pricing kicks in - THE DECODER

Why this matters: We see Amazon engineers already applying model distillation to Anthropic’s Claude family, a move that could trim the cost of internal workloads before the announced token-pricing regime takes effect. By training smaller networks on the outputs of larger, proprietary models, the company uses a right it reportedly holds, similar to Apple’s arrangement with Google Gemini. Yet the practice remains internal to Amazon; Bedrock’s distillation service does not list Claude among its supported models, so external developers cannot yet benefit.

This raises a question for founders: will the cost advantage stay behind the firewall, or could a downstream offering emerge once the internal experiments prove viable? For researchers, the effort illustrates a pragmatic response to pricing pressure, but it also underscores how access to high‑performing models is increasingly mediated by corporate agreements. It is unclear whether Amazon will open a distilled‑Claude product to the broader market, or if the initiative will simply improve its internal margins.

We will watch for signals that the approach moves beyond internal cost‑cutting toward a publicly usable service.