Research & Benchmarks
Academic AI research, performance benchmarks, scientific breakthroughs, and peer-reviewed studies advancing artificial intelligence frontiers.
Anthropic has rolled out a fresh benchmark aimed at gauging Claude’s performance against seasoned bioinformatics specialists. The suite comprises 84 distinct tasks, spanning everything from protein‑fold prediction to pathway analysis.
Grok Voice Think Fast 1.0 promises a shortcut for anyone curious about voice‑driven AI, sidestepping the usual code‑heavy entry barrier.
When an AI system flags a single visual bias, engineers often scramble to patch that flaw, only to see new distortions surface elsewhere.
Privacy‑preserving AI has long been a research niche, but recent work pushes it out of the lab and onto the phones, tablets and smart speakers that sit on kitchen counters.
Why does this matter now? Musk has spent the week railing against Sam Altman, insisting he was “duped” by the OpenAI chief and warning that “AI could kill us all.” While the rhetoric dominates headlines, another, less flashy development is taking...
Why does the way BioNeMo handles a single computational layer matter for researchers modeling proteins or nucleic acids?
DeepSeek opened its doors to the public on Friday, rolling out a preview of the V4 family and making both versions downloadable through its platform and API.
Poolside AI’s latest rollout adds two agentic coding models—Laguna XS.2 and M.1—to its growing portfolio.
Oracle’s name has been tied to on‑premise databases and sprawling ERP suites for decades. Yet the company’s latest moves suggest a wholesale break with that past.
Tool‑calling agents have become a staple in recent AI research, yet their reliability often hinges on how they handle mistakes during execution.
Why do larger language models keep getting better, even when they’re trained on the same data? Researchers at MIT think they’ve found a clue hidden in the models’ own geometry.
A recent report revealed that Google has been discussing potential collaborations with the Pentagon, sparking unease across the company’s engineering floors.
The new ASI‑EVOLVE framework claims to automate three core pillars of machine‑learning pipelines—curating training data, selecting model architectures, and tweaking optimization algorithms—while beating manually crafted baselines.
Why does drug discovery still feel like assembling a jigsaw puzzle in the dark? Researchers must juggle dozens of niche programs—each handling a single step from binding affinity prediction to ADMET profiling—while trying to keep the whole pipeline...
Enterprises are wrestling with a familiar problem: data sits in silos, and the people who need it are spread across dozens of departments, sometimes numbering in the thousands.
Why does this matter? Companies deploying retrieval‑augmented generation (RAG) often chase tighter precision by tweaking the underlying embedding layers, assuming tighter vectors will feed cleaner results to downstream agents.
The latest research on AI pipelines spotlights a problem that’s been slipping under the radar for months.
The research community has long leaned on benchmarks that ask language models to solve problems without ever touching a keyboard or mouse.
Why does this matter? Traditional retrieval‑augmented generation (RAG) leans on dense vector stores to pull relevant passages, but PageIndex proposes a different path.
Anthropic rolled out a preview of its Mythos AI model earlier this year, promising a tool that can spot security flaws in software and networks faster than most scanners.