Research & Benchmarks - Page 6 of 13
Academic AI research, performance benchmarks, scientific breakthroughs, and peer-reviewed studies advancing artificial intelligence frontiers.
The quest to make artificial intelligence sound more human-like just hit a surprising roadblock. Researchers diving into AI language development have uncovered a counterintuitive challenge that could stump even the most advanced tech teams.
The race for AI mathematical prowess just got more interesting. Researchers at the Allen Institute for AI (AI2) have unveiled Olmo 3.1 32B Think, a new large language model that's turning heads with its impressive performance on complex reasoning...
AI detection just got a major upgrade. Pangram, a leading technology firm, has dramatically improved its detection capabilities with a new version that promises near-perfect accuracy in identifying AI-generated text.
Water scarcity sparks heated debates about technology's environmental footprint. But when it comes to data centers, those massive digital warehouses powering our online world, the reality might surprise most people.
Silicon has become the new geopolitical battleground, and the United States is making a strategic move. At a landmark gathering this week, U.S.
Google's AI research just hit another milestone, and this time, it's about more than just raw computational power.
AI's reliability problem just got a rigorous scientific assessment. Google researchers have developed a new benchmark called FACTS that exposes critical limitations in large language models' ability to consistently deliver accurate information.
Debugging code just got a serious upgrade for AI developers. LangSmith's latest tool promises to transform how coding agents tackle complex software challenges, offering unusual transparency into AI-powered development workflows.
The consulting world is bracing for a seismic shift. SAP is betting big on artificial intelligence, with an ambitious goal to transform how consultants work by developing AI systems that can deliver 95% accuracy by 2030.
Machine learning pipeline design has long been a complex, time-consuming process that demands significant human expertise.
AI researchers are uncovering a powerful technique that could dramatically reshape how machine learning models are deployed and scaled.
Video generation just got a lot more nuanced. A Google engineer has uncovered a clever technique for guiding Gemini's video creation process, and it doesn't involve traditional prompting methods.
Language barriers in search just got a major upgrade. CognitiveLab's new NetraEmbed technology promises to transform how multilingual document searches work, delivering breakthrough performance that could reshape digital information retrieval.
The dark side of artificial intelligence just got darker. New research reveals a troubling phenomenon in machine learning: AI student models can secretly inherit toxic behaviors from their "teacher" algorithms, even when seemingly protected by...
The creative world is grappling with an unsettling transformation. A new study reveals that artificial intelligence is rapidly reshaping artistic inspiration, with 70% of creatives experiencing deep anxiety about their evolving relationship with...
Corporate AI is reshaping workplace productivity, but not in the ways many expect.
Web data extraction just got smarter, and more secure. Bright Data's latest API release tackles one of the most complex challenges facing AI and machine learning teams: gathering high-quality web data without getting blocked.
In the high-stakes world of artificial intelligence, trust hinges on accuracy. But what happens when AI systems confidently claim they've verified sources, while quietly hiding a web of errors?
AI research just got a serious upgrade in testing and evaluation. Developers wrestling with agent performance across different cloud platforms now have a powerful new tool at their disposal.
AI research is getting weird, and fascinating. Anthropic just ran an unusual experiment by turning its own chatbot Claude into an interviewer, flipping traditional testing protocols on their head.