GPT‑5.6 Sol outperforms GPT‑5.5 on GeneBench v1 genomics benchmarks

2 min read

Why does this matter now? OpenAI is rolling out a limited preview of its GPT‑5.6 family (Sol, Terra and Luna) while the U.S. government watches.

Sol, the flagship, launches with what the company calls its “most robust safety stack to date,” bolstered after weeks of pressure‑testing against real‑world attacks, with tighter guards around higher‑risk activity, sensitive cyber requests and repeated misuse. Terra, meanwhile, matches GPT‑5.5’s performance at half the cost, and Luna promises strong capability at the lowest price point yet. For now, access is limited to a small group of trusted partners, an arrangement coordinated with the administration and disclosed to the government.

OpenAI stresses that this isn’t meant to become the permanent approach to releases; it’s a short‑term bridge while a cyber Executive Order framework takes shape. The company says broader availability is slated for the coming weeks, once the preview phase wraps up and the safety mechanisms are fully vetted. It’s a cautious step that balances rapid rollout against the need for hardened safeguards.

GPT‑5.6 Sol also shows broad improvements in biology workflows. On GeneBench v1, which evaluates long-horizon genomics and quantitative-biology analyses, it achieves stronger results than GPT‑5.5 while using fewer tokens.

OpenAI also describes GPT‑5.6 Sol as its most capable model yet for cybersecurity, saying it shifts the performance-efficiency frontier for long-horizon security tasks, including vulnerability research and exploitation. On ExploitBench, GPT‑5.6 Sol is competitive with Mythos Preview while using only about one-third of the output tokens. On ExploitGym, a benchmark created by UC Berkeley researchers in collaboration with OpenAI and other frontier labs, the Sol, Terra, and Luna models all show strong improvements in cyber capabilities as reasoning is increased.

Why this matters

Did you notice the new GPT‑5.6 preview? We now have three models: Sol, the flagship; Terra, a balanced workhorse; and Luna, a low‑cost fast option. Terra matches GPT‑5.5 performance while costing half as much, a claim that could matter to budget‑conscious teams.

Luna promises strong capability at the lowest price point, but real‑world tests will reveal whether speed compromises depth. Sol arrives with the most extensive safety stack the company has built, adding protections for high‑risk activity, sensitive cyber queries and repeated misuse after weeks of probing. In biology, Sol beats GPT‑5.5 on GeneBench v1, delivering better results with fewer tokens, suggesting tighter integration of quantitative‑biology reasoning.

OpenAI also bills Sol as its most capable model for cybersecurity, one that reaches competitive results with far fewer output tokens, though the exact impact on existing security pipelines remains unclear. As developers, we should weigh the advertised cost savings against the need for independent validation, especially where safety and domain‑specific accuracy are non‑negotiable. Our next steps will involve sandbox trials to confirm whether these headline numbers translate into practical advantage.
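For readers planning their own sandbox trials, here is a minimal Python sketch of the arithmetic behind such a comparison: the score difference, the output-token ratio and the cost ratio between a baseline and a candidate model. Everything in it is an assumption made for illustration. The `TrialResult` type, the `compare` helper, the model labels, the scores, the token counts and the prices are all hypothetical placeholders, not published figures or any vendor's API.

```python
from dataclasses import dataclass


@dataclass
class TrialResult:
    """One sandbox run, as measured by your own harness (hypothetical type)."""
    model: str
    score: float          # task score on your own eval, 0.0 to 1.0
    output_tokens: int    # output tokens consumed during the run
    price_per_1k: float   # placeholder price per 1,000 output tokens

    @property
    def cost(self) -> float:
        # Output-token cost only; input tokens are ignored in this sketch.
        return self.output_tokens / 1000 * self.price_per_1k


def compare(baseline: TrialResult, candidate: TrialResult) -> dict:
    """Return score delta plus token and cost ratios (candidate / baseline)."""
    return {
        "score_delta": candidate.score - baseline.score,
        "token_ratio": candidate.output_tokens / baseline.output_tokens,
        "cost_ratio": candidate.cost / baseline.cost,
    }


# Placeholder numbers only; substitute measurements from your own trials.
baseline = TrialResult("baseline-model", score=0.70,
                       output_tokens=30_000, price_per_1k=0.02)
candidate = TrialResult("candidate-model", score=0.72,
                        output_tokens=10_000, price_per_1k=0.02)

report = compare(baseline, candidate)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

A token ratio below 1.0 alongside a non-negative score delta is the pattern the vendor's efficiency claims describe; whether it holds on your own workloads is exactly what the trial is meant to establish.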
