RightNow AI Unveils AutoKernel: Open-Source GPU Optimizer for PyTorch Models
RightNow AI’s latest release, AutoKernel, arrives at a time when developers are wrestling with ever‑larger PyTorch models that push GPU resources to their limits. The framework promises an end‑to‑end workflow that begins with a full‑model view rather than cherry‑picking individual kernels. By tapping into torch.profiler’s shape‑recording capabilities, AutoKernel gathers detailed timing data for every operation that lands on the GPU.
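The profiling step described above can be sketched with torch.profiler's public API. The model and input shapes below are placeholders, not anything from the release; this sketch profiles on CPU so it runs anywhere, while a GPU run would add `ProfilerActivity.CUDA` to the activities list to capture per-kernel GPU time.

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Placeholder model standing in for an arbitrary PyTorch model.
model = torch.nn.Sequential(
    torch.nn.Linear(64, 128), torch.nn.ReLU(), torch.nn.Linear(128, 10)
)
x = torch.randn(32, 64)

# record_shapes=True attaches input shapes to every profiled op, which is
# what lets a tool group timings per shape-specialized kernel.
# On a CUDA machine, add ProfilerActivity.CUDA to capture GPU kernel times.
with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    model(x)

# Aggregate per-op timings, split by input shape, heaviest first.
stats = prof.key_averages(group_by_input_shape=True)
for evt in sorted(stats, key=lambda e: e.self_cpu_time_total, reverse=True)[:5]:
    print(f"{evt.key:30s} {evt.input_shapes} {evt.self_cpu_time_total:.0f}us")
```

Grouping by input shape matters because the same operator (say, a matmul) can map to very different kernels at different shapes, so the ranking has to be shape-aware.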
That granular snapshot then feeds a ranking algorithm grounded in Amdahl’s law, spotlighting the portions of the workload that will yield the biggest speed gains when tuned. The approach contrasts with earlier tools that isolate kernel issues without context, often leading to marginal improvements. For teams looking to squeeze performance out of existing hardware without rewriting large swaths of code, the methodology could reshape how optimization budgets are allocated.
Below, the developers distill their philosophy into a concise statement that captures the essence of the process.
Profiling First, Optimizing Where It Matters

Unlike prior work that treats kernel problems in isolation, AutoKernel starts from a complete PyTorch model. It uses torch.profiler with shape recording to capture per-kernel GPU time, then ranks optimization targets using Amdahl's law: the mathematical principle that the overall speedup you can achieve by improving one component is bounded by how much of the total runtime that component represents. A 1.5× speedup on a kernel consuming 60% of total runtime yields a 1.25× end-to-end gain. The same speedup on a kernel consuming 5% of runtime yields only about 1.02×.
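The arithmetic behind that ranking is easy to reproduce. Here is a minimal sketch; the kernel names and runtime fractions are illustrative, not figures from the release:

```python
def amdahl_speedup(fraction: float, kernel_speedup: float) -> float:
    """End-to-end speedup when a kernel taking `fraction` of total
    runtime is accelerated by `kernel_speedup`x (Amdahl's law)."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)

# Hypothetical per-kernel runtime fractions, as a profile might report them.
kernels = {"attention_matmul": 0.60, "layernorm": 0.05}

# Rank targets by the end-to-end gain a uniform 1.5x kernel speedup would buy.
ranked = sorted(kernels, key=lambda k: amdahl_speedup(kernels[k], 1.5), reverse=True)

print(round(amdahl_speedup(0.60, 1.5), 2))   # -> 1.25
print(round(amdahl_speedup(0.05, 1.5), 3))   # -> 1.017
print(ranked[0])                             # -> attention_matmul
```

The formula makes the prioritization obvious: even a heroic speedup on a 5% kernel can never move the end-to-end number much, so the 60% kernel wins every time.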
AutoKernel arrives as a fully open‑source tool that promises to automate one of the most tedious steps in model deployment. Users can feed a PyTorch model into the system overnight and receive Triton kernels that run faster, without writing a single line of GPU code. No GPU expertise required.
The framework leans on torch.profiler to record shape‑specific execution times, then applies Amdahl’s law to prioritize the kernels that will yield the biggest speedups. Unlike earlier efforts that tackled kernels in isolation, the approach treats the model as a whole, letting an autonomous LLM agent generate and test kernel variants. Early demonstrations show measurable reductions in per‑kernel runtime, yet no overall training or inference gains across diverse architectures have been quantified so far.
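The internals of that agent loop are not spelled out here, but the generate-and-test pattern it describes can be sketched as follows. Everything in this sketch is hypothetical: a real system would substitute LLM-generated Triton kernels for the stub variant generator and add a numerical correctness check alongside the timing.

```python
import time
from typing import Callable

def optimize_kernel(
    baseline: Callable[[], object],
    propose_variant: Callable[[int], Callable[[], object]],
    n_rounds: int = 5,
) -> Callable[[], object]:
    """Hypothetical generate-and-test loop: keep a candidate only if it
    runs without error and beats the best timing seen so far."""
    def timed(fn: Callable[[], object]) -> float:
        start = time.perf_counter()
        fn()
        return time.perf_counter() - start

    best_fn, best_t = baseline, timed(baseline)
    for i in range(n_rounds):
        candidate = propose_variant(i)
        try:
            t = timed(candidate)   # a real loop would also verify outputs match
        except Exception:
            continue               # discard variants that fail to run
        if t < best_t:
            best_fn, best_t = candidate, t
    return best_fn

# Illustrative use: stub "kernels" standing in for generated Triton code.
best = optimize_kernel(
    baseline=lambda: sum(range(50_000)),
    propose_variant=lambda i: (lambda: sum(range(50_000 // (i + 2)))),
)
```

The key design point the article attributes to AutoKernel is that this search runs autonomously: failed or slower variants are simply discarded, so the user only ever sees the winning kernel.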
It is also unclear how the system handles models with dynamic control flow or uncommon operators. The open‑source release invites community scrutiny, which may reveal edge cases where the autonomous loop struggles. For now, AutoKernel represents a concrete step toward lowering the barrier to GPU optimization, though its broader impact remains to be validated.
Further Reading
- RightNow AI Releases AutoKernel: An Open-Source Framework that Applies an Autonomous Agent Loop to GPU Kernel Optimization for Arbitrary PyTorch Models - MarkTechPost
- GPU kernel optimization: AutoKernel AI Agent for PyTorch - Neurotechnus
- AutoKernel: Autonomous GPU Kernel Optimization via Iterative Agent-Driven Search - hgpu.org
- AutoKernel: Autonomous GPU Kernel Optimization via Iterative Agent-Driven Search - arXiv
Common Questions Answered
How does AutoKernel differ from previous GPU optimization approaches?
Unlike previous methods that focus on individual kernels, AutoKernel takes a full-model view using torch.profiler to capture comprehensive GPU timing data. The framework applies Amdahl's law to prioritize optimization targets, ensuring the most impactful kernels are addressed first for maximum performance gains.
What makes AutoKernel unique for PyTorch model optimization?
AutoKernel is a fully open-source tool that automates GPU kernel optimization without requiring specialized GPU programming expertise. By processing an entire PyTorch model overnight, it generates Triton kernels that can significantly improve performance, using shape-specific execution times to guide its optimization strategy.
How does AutoKernel use torch.profiler in its optimization process?
AutoKernel leverages torch.profiler with shape recording to capture detailed timing data for every GPU operation in a PyTorch model. This granular approach allows the framework to create a comprehensive snapshot of model performance, which is then used to rank and optimize the most critical kernels for maximum speedup.