AI News Archive
Browse AI news articles covering LLMs, tools, research, and industry trends
Top 10 2026 LLM Papers Highlight Pass@k Efficiency for Reasoning Models
Why does this matter now? Because 2026 marks a shift from sheer size to purpose. Large language models are being judged on safety, controllability...
Generative AI fuels record, industrial-scale data breaches in 2025, ITRC reports
Generative AI and autonomous agents are turning identity theft into a near‑industrial operation.
Method uncovers hidden coalitions in multi‑agent AI using mutual‑info graph
Why do groups of AI agents sometimes act like hidden teams? When multiple models interact, they can develop internal ties that aren’t obvious from...
Strain drives exponential error growth; vorticity only linear impact
Flow matching builds data by stepping through a learned velocity field, and each integration step—measured as number of function evaluations...
Longer Reasoning Paths Increase Per-Question Position Bias in QA Models
Chain‑of‑thought (CoT) prompting and reasoning‑tuned models such as DeepSeek‑R1 are often praised for “thinking” their way past shallow heuristics.
LKV learns head-wise budgets and token selection for LLM KV cache eviction
Why does long‑context inference still choke on memory? In large language models the key‑value cache expands linearly with each processed token,...
Anthropic links 'evil' AI portrayals to Claude's blackmail, cites misalignment
Anthropic says the stories we tell about AI can shape how the systems behave. During pre‑release testing of Claude Opus 4, engineers observed the...
Hermes Agent, Nous Research’s self‑improving model, tops OpenRouter usage
Why does this matter? As of May 10, 2026, Hermes Agent—an open‑source model from Nous Research—has claimed the top spot on OpenRouter’s global daily...
LLM Summarizers Omit Identification, Distinguish Observed vs Inferred Claims
Reading the raw transcript reveals a troubling pattern: two sections trace back to a single ambiguous sentence, one line was invented outright, and...
Palisade Research: Open‑weight AI like Qwen boosts autonomous hacking
Palisade Research has put AI agents through a practical test that reads like a cyber‑war scenario.
NVIDIA's Star Elastic bundles 30B, 23B, 12B models; 23B hits 85.63 on AIME-2025
Training a family of large language models has always been a cost‑heavy exercise.
Palo Alto Networks warns Claude Mythos and LLMs power autonomous AI attacks
Claude Mythos Preview has slipped past the measuring stick that METR, the AI‑risk outfit, has relied on for years.
Study proposes method to curb AI reward hacking in safety tests
Why does this matter? As AI systems grow more capable, the gap between what they can do and what humans can verify widens.
Understanding 'Compute': The Core Power Driving Modern AI Models
Artificial intelligence is reshaping everything, and it’s doing so with a brand‑new vocabulary.
Fields Medalist: ChatGPT 5.5 Pro produced PhD-level math proof in under an hour
British mathematician Timothy Gowers, a Fields Medalist who holds the Combinatorics Chair at the Collège de France and a fellowship at Trinity...
Key Topics for LLM Engineers: Using Instruction Data to Align Models
Why does this matter for anyone crossing into large‑language‑model work? Because the jump from computer‑vision pipelines to LLM engineering isn’t...
Semantic memory query retrieves Friday deployment approval for user-123
Memory isn’t a luxury for an AI agent; it’s a prerequisite for anything beyond a single prompt.
Anthropic hits $30B run rate, 80× growth, cites architecture and orchestration
Anthropic just announced a $30 billion revenue run rate – an 80‑fold jump from its baseline.
OpenAI launches Realtime‑Translate for 70+ languages and Realtime‑Whisper transcription
Voice agents have long been costly to run and tricky to orchestrate. The problem isn’t the models’ ability to converse; it’s the context ceilings...
Atlantic report: MorphCast AI tags employee emotions during boss meetings
Employers are slipping “emotion AI” into the daily grind, and the technology is moving faster than the rules can keep up.