AI News Archive - Browse Page 36 of 160
Browse AI news articles covering LLMs, tools, research, and industry trends
The Vergecast: Tim Cook’s AirPods, Touch Bar legacy, Apple’s next, Xbox returns
The Vergecast is back with a packed agenda, and the episode’s title alone hints at the weight of the conversation.
Project Maven shifts AI from satellite to drone video imagery
When the Department of Defense first earmarked money for an artificial‑intelligence program, the focus was clear: crunch massive troves of satellite...
DeepSeek‑V4‑Pro‑Max tops open models, nears closed results at 1/6 Opus 4.7 cost
Why does a model that costs just a sixth of Claude Opus 4.7 matter? Because price has long been the gatekeeper separating open‑weight research from...
Update: Usage Limits Draining Faster Linked to Two Unrelated Experiments
The recent flurry of complaints about Claude’s usage caps disappearing quicker than users expect has been puzzling many in the open‑source community.
CoALA paper defines agent memory types: procedural rules and semantic facts
Building an agent that can act consistently isn’t just about cranking out a clever prompt.
85% of firms run AI agents; 5% trust them to ship, Cisco adds zero‑trust limits
Why are so many companies still hesitant to let AI agents go live? A fresh survey shows 85% of enterprises have already deployed agents, yet only 5...
OpenAI says Musk cannot prove promise from Altman, lacks standing in case
Why does this matter? The courtroom drama between Elon Musk and OpenAI has moved beyond a personal spat to a test of corporate governance.
Agent Improvement Loop Starts with Trace, Enabling Deterministic, Low‑Cost Validation
Why does an “agent improvement loop” start with a trace? In open‑source tooling, the first step often feels like a bookkeeping exercise—capturing...
OpenAI's 'Spud' Beats Claude; April 30 Webinar on Agentspan 4‑Layer Production
OpenAI’s new model, nicknamed “Spud,” has just outperformed Claude in the latest benchmark, a shift that’s already sparking talk among developers...
Why ChatGPT and Other Bots May Mislead You on Financial Advice
The headline flags a growing concern: chat‑driven assistants aren’t built to be financial counselors.
Google DeepMind's Decoupled DiLoCo hits 88% goodput despite hardware failures
Google DeepMind’s latest paper unveils Decoupled DiLoCo, an asynchronous training framework that keeps more than eight‑in‑ten chips busy even when a...
Claude adds direct connectors for Spotify, Uber Eats, TurboTax; mobile beta
Anthropic’s Claude just got a functional upgrade that goes beyond chat. The company announced a suite of “app connectors” that let the model reach...
Agent observability powers production evaluation through trace analysis
When you push an AI assistant from a sandbox into real‑world use, the interaction patterns suddenly explode.
OpenAI launches GPT-5.5, hits 82.7% on Terminal-Bench 2.0, 84.9% on GDPval
OpenAI just rolled out GPT‑5.5, a fully retrained agentic model that clocks 82.7% on Terminal‑Bench 2.0 and 84.9% on GDPval.
Microsoft adds 'vibe working' to Word and Excel; Copilot Agent Mode now default
Microsoft is nudging its productivity suite toward a more conversational rhythm. The company rolled out a feature dubbed “vibe working” across Word,...
Industry Shifts to Richer Context for AI Agents, Guided by Human Judgment
Why does the way we feed AI matter? In the first wave of autonomous assistants, developers handed models a lone system prompt and a handful of tool...
Anthropic's Mythos Leak Precedes Bland AI's Norm Voice Agent Builder
Anthropic’s recent Mythos leak has sparked a quiet buzz among developers who’ve long watched the company’s models stay under lock and key.
Trump 'saved' women from execution—AI‑fabricated; account hit Lee Jae‑myung
A thread circulating on X claims former president Donald Trump rescued eight Iranian women from execution—a story that, on closer look, mixes genuine...
Xiaomi launches MiMo‑V2.5‑Pro and V2.5, matching benchmarks at lower token cost
Xiaomi’s latest AI rollout—MiMo‑V2.5‑Pro and its lighter‑weight sibling MiMo‑V2.5—promises the same headline‑grabbing benchmark scores as leading...
Designing Production-Grade CAMEL Multi-Agent Systems: Start with Docs and GitHub
Designing a production‑grade CAMEL multi‑agent system isn’t just about swapping in the latest planning algorithm or tinkering with tool‑use hooks.