DeepReinforce unveils Ornith-1.0 open-source AI model showcasing advanced flight simulation results with precise, realistic b

Editorial illustration for DeepReinforce releases Ornith-1.0 open-source model with state‑of‑the‑art results

DeepReinforce releases Ornith-1.0 open-source model with...

DeepReinforce releases Ornith-1.0 open-source model with state‑of‑the‑art results

By AI Daily Post Edited by Brian Petersen, Editor-in-Chief

June 25, 2026 • 2 min read

DeepReinforce has just put Ornith‑1.0 on the table, an open‑source family of coding agents that claims to think differently about how a model interacts with its own prompt scaffold. The lineup runs from a 9 billion‑parameter dense model up to a 397 billion‑parameter mixture‑of‑experts flagship, all released under an MIT license on Hugging Face. Built on top of pretrained Gemma 4 and Qwen 3.5, each checkpoint comes with FP8 and GGUF builds for faster local serving.

What sets Ornith‑1.0 apart is that, instead of relying on a fixed, human‑crafted harness, the model learns to write its own during reinforcement learning, jointly optimizing the scaffold and the code solution. The research team notes that this approach yields state‑of‑the‑art results among open models of comparable size. To keep the system honest, three guardrails—a fixed trust boundary, a deterministic monitor, and a frozen LLM judge—stand between the model and its reward signal, aiming to curb reward‑hacking.

In short, Ornith‑1.0 is positioned as a reasoning‑first coding model, opening each reply with a `` block before delivering the final answer.

The DeepReinforce research team reports state-of-the-art results among open models of comparable size.
TL;DR

Ornith-1.0 ships in 9B, 31B, 35B-MoE, and 397B-MoE sizes under MIT, built on Gemma 4 and Qwen 3.5.

The model learns its own scaffold during RL, jointly optimizing the harness and the solution.

Ornith-1.0-397B tops Claude Opus 4.7 on both headline benchmarks, but not Opus 4.8 or the larger GLM-5.2-744B.

Three layers -- fixed trust boundary, deterministic monitor, frozen LLM judge -- guard against reward hacking.

What is Ornith-1.0?

Ornith-1.0 is a set of reasoning models tuned for coding agents.

DeepReinforce Releases Ornith-1.0: An Open-Source Coding Model Family That Learns Its Own RL Scaffolds - MarkTechPost

Why this matters

Ornith‑1.0 adds four new open‑source models to the coding‑assistant space, ranging from a 9 B dense checkpoint to a 397 B mixture‑of‑experts flagship, all under an MIT licence on Hugging Face. The family builds on Gemma 4 and Qwen 3.5, and DeepReinforce’s team claims state‑of‑the‑art results among open models of comparable size. For developers, the permissive licence means we can experiment without legal friction, and the variety of scales lets teams match compute budgets to project needs.

Founders may see a drop‑in option for internal tooling, yet the claim of “state‑of‑the‑art” performance is limited to the authors’ own benchmarks; it is unclear whether those results hold across diverse real‑world codebases. Researchers gain a publicly available MoE at 397 B, which could serve as a testbed for agentic coding experiments, but the post‑training on proprietary bases raises questions about reproducibility. In short, Ornith‑1.0 expands the open‑source coding model toolkit, but its practical impact will depend on broader validation and community uptake.

DeepReinforce releases Ornith-1.0 open-source model with...

TL;DR

What is Ornith-1.0?

Further Reading

Latest News

Calibration uses NVIDIA Triton Llama-3-8B A10 and vLLM Qwen2.5-7B RTX 4090 data

Meta says AI moderators make 13% fewer errors than humans, defends rollout speed

NVIDIA TensorRT Enables Context Parallelism for Multi‑GPU AI Inference