Research & Benchmarks
Academic AI research, performance benchmarks, scientific breakthroughs, and peer-reviewed studies advancing artificial intelligence frontiers.
OpenAI has rolled out GPT‑5.5‑Cyber, a version of its language model stripped of many safety filters so that vetted security researchers can run penetration tests, dissect malware and review patches.
Why does this matter? Inference efficiency has quietly become one of the most consequential bottlenecks in AI deployment.
Model quantization trims VRAM demand and speeds up inference on consumer‑grade hardware such as NVIDIA GeForce RTX cards.
Why does it matter that every chat you start with an LLM begins with a clean slate? While the tech can answer a question after you tell it who you are, the underlying context evaporates as soon as you close the tab. The result?
Google DeepMind is taking a minority stake in the studio behind the space MMO EVE Online, turning the game into a live‑lab for its next wave of AI research.
Evaluating AI models that read brain signals has been a tangled mess. Researchers juggle different preprocessing pipelines, train on disparate datasets, and report results on a handful of tasks, making it hard to tell which model truly works—or...
Normalizing Flows have been around for a while, but recent work has put them back in the spotlight.
Why does this matter? Large language models have shown they can reason about facts and navigate simulated environments, but their ability to repurpose objects inventively has been hard to measure. Researchers have now put a concrete yardstick on that ability.
Adaptive optimizers such as AdamW treat every parameter group the same, even though different layers often behave quite differently during training.
Why does tweaking a single line of code in a neural‑architecture design sometimes ripple through an entire model? The paper “Structured Progressive Knowledge Activation for LLM‑Driven Neural Architecture Search” tackles that puzzle head‑on.
In this tutorial we stitch together a Groq‑powered research assistant that runs straight from Groq’s free OpenAI‑compatible inference endpoint.
Why does 6G matter now? While 5G is still being rolled out, engineers are already mapping the hardware and algorithms that could push wireless links toward 1 Tbps.
At the Code with Claude developers’ gathering in San Francisco, Anthropic rolled out a feature it calls “dreaming” for its Claude Managed Agents.
Why does a $200 billion spend on cloud matter? Anthropic has signed a five‑year agreement with Google Cloud that will see the AI startup pour roughly that amount into the provider’s infrastructure, starting next year.
Why does this matter? Because training frontier‑scale models now leans on supercomputer networks that can shuffle petabytes of data between GPUs without missing a beat.
Why does KV‑cache compression matter now? Large language models cache the attention keys and values of earlier tokens, and that memory quickly becomes a bottleneck during inference.
Harvard’s latest foray into clinical AI pits cutting‑edge language models against seasoned physicians in a real‑world setting.
Why does a 2021 quantizer still matter when a 2026 version is on the books? The answer lies in how the two methods treat the numbers they compress.
The Center for AI Standards and Innovation (CAISI), housed within NIST, has just released a multi‑domain benchmark that pits China’s newest large language model against a suite of American offerings.
DeepMind’s latest “AI co‑clinician” has just outscored GPT‑5.4 in a series of blind assessments, yet it still falls short of seasoned doctors. Why does that gap matter?