📂 Category
Research & Benchmarks News Archive - Page 4 of 5
475 articles in this category
- 301. X limits Grok image tool to paid users; 1 obscene request/min, 102 in 5 mins
- 302. Hanns Christoph Nägerl’s team finds quantum heating defies classical intuition
- 303. Replit CEO says using more tokens yields higher-quality inputs, then tests apps
- 304. Dell says AI-focused PCs confuse consumers, who show little interest
- 305. Vibe Coding Remains Early Stage, Real-World Reliability Still Distant
- 306. New Magnetic Nanoparticle Approach Merges Heating and Healing for Bone Cancer
- 307. Tredence hosts AI Foundry workshop in Chennai for AI system designers
- 308. Test-Time Training adds dual-memory to Transformers, keeping inference cheap
- 309. Analysis overhauls AI Index; GPT-5.2 beats professionals on 70.9% of tasks
- 310. MIT study probes memorization risk of clinical AI with de-identified data
- 311. AMD announces Ryzen AI 400 at CES, resembles AI 300 in laptops
- 312. Docker Trick: Deterministic OS Packages in One Layer to Prevent ML Failures
- 313. Notion’s simplified AI agent feature feels indispensable, says engineer
- 314. DeepSeek's architectural fix improves large-scale reasoning, follows GRPO work
- 315. Nested Learning's Continuum Memory System Redefines AI Memory for 2026
- 316. New framework lets agentic AI tools adapt to fill main agent knowledge gaps
- 317. Opera Neon: AI-native browser that researches, compares prices, codes
- 318. Fusion reactors could produce dark-sector particles via neutron emissions
- 319. CIOs drive AI experiments by embedding ready-to-use features into everyday tools
- 320. Dell and NVIDIA Host AI Developer Meetup in Hyderabad to Discuss Solutions
- 321. Google highlights AI-driven chip, infrastructure and robotics advances in 2025
- 322. LeCun and Hassabis dispute meaning of ‘general intelligence’
- 323. Qwen3-4B-Instruct-2507: 4B-parameter model boosts Raspberry Pi AI
- 324. Dell and NVIDIA host AI developer meetup in Bengaluru on deployment trade-offs
- 325. GPT-5.2 leads FrontierScience test, but falters on real research tasks
- 326. OpenUSD and NVIDIA Halos Enhance Robotaxi Safety with Synthetic Data, SimReady
- 327. Audio Dataset Valuable for Listening Models, Tackles Noise, Accents, Timing
- 328. Fastweb and Vodafone use LangGraph LLM Compiler to automate customer requests
- 329. GPT-5.2 Thinking emerges as collaborative AI for end-to-end web builds
- 330. YouTube channel serves AI concepts in under-minute clips for fast learning
- 331. AI Ends Build-vs-Buy Debate, Focus Shifts to Real Business Impact
- 332. Google, MIT study finds multi-agent AI often loses context in sequential tasks
- 333. Researchers find complex AI persona tactics hurt meaning in development
- 334. AI2 releases Olmo 3.1 32B Think, up 5+ points on AIME and 4+ on ZebraLogic
- 335. Pangram 3.0 AI detector reports 99.98% accuracy, adds four usage tiers
- 336. Experts say data centers' water use is less risky than public perceives
- 337. U.S. Leads Pax Silica Initiative Launched at Summit to Secure Silicon Supply
- 338. Gemini Deep Research agent posts top results on HLE, DeepSearchQA, leads BrowseComp
- 339. Google's FACTS benchmark shows 70% factuality ceiling across four tests
- 340. LangSmith Fetch lets Claude Code, Cursor agents debug from terminal
- 341. SAP deploys 95%-accurate AI to redefine consultant role by 2030
- 342. TPOT evolves ML pipelines via genetic algorithms in four steps
- 343. Model distillation cuts latency 2-3× and lowers costs by double-digit percentages
- 344. Googler details meta-prompt technique that guides Gemini to craft Veo videos
- 345. CognitiveLab unveils NetraEmbed, 150% accuracy gain, adds ColNetraEmbed
- 346. Student AI models can inherit bias and harmful traits from teacher models
- 347. 70% of Creatives Fear Stigma as AI Drives Majority of Their Ideas – Anthropic
- 348. Corporate AI agents favor simple workflows; 41.5% accept minute-range latency
- 349. Bright Data API Delivers Seamless AI/ML Integration and Anti-Bot Protection
- 350. AI agents claim sources verified despite dead links; 14 error types logged
- 351. Harbor Framework Enables Sandbox Agent Execution on Docker, Modal, Daytona
- 352. Anthropic puts Claude in the interviewer's chair for AI testing
- 353. Physicist Steve Hsu releases paper on AI-assisted physics using GPT-5 idea
- 354. OpenAI trials “Confessions” tool that makes models generate self-audit reports
- 355. NVIDIA offers up to USD 60,000 fellowships to PhD students for model collaboration
- 356. NVIDIA cuts prices on Jetson edge-AI developer kits for holiday shoppers
- 357. Anthropic faces pressure as CEO Dario Amodei backs AI regulation
- 358. Harvard Data Course Runs 66 Weeks, Costs USD 1,332.90 (~Rs 1.18 Lakh)
- 359. Gemini 3 Pro tops trust, ethics, safety at 69% vs 16% for Gemini 2.5
- 360. Counter-Strike Sets New Benchmark for Vibe Coding, Says Ex-Mixpanel CEO
- 361. NVIDIA open-sources NeMo Data Designer for synthetic AI datasets at NeurIPS
- 362. AI models stop 87% of attacks but only 8% of attempts; Qwen3-32B hits 86.18%
- 363. Runway's Gen-4.5 text-to-video AI claims unprecedented physical accuracy
- 364. AI Solves 30-Year-Old Math Problem, Showcasing Perplexity's Patent Search Tool
- 365. OpenAGI agent says it beats OpenAI and Anthropic; study deems over-optimistic
- 366. Google's self-modifying model needs extra engineering, smarter compute for complex training
- 367. ARC benchmark declines as labs tune AI to optimize its specific logic
- 368. General Agentic Memory uses dual-agent design, beats RAG on benchmarks
- 369. NeurIPS 2025: Top 4 Papers Highlight Shift From Bigger Models to Limits
- 370. 97% Can't Distinguish AI Music; 71% Surprised, 51% Uncomfortable
- 371. TPUs Designed for Deep Learning Can Outperform GPUs in Many Workloads
- 372. Wipro partners with IISc and FSID for AI and quantum research collaboration
- 373. Google expands AI partnership with Tel Aviv University, infrastructure for Gemma
- 374. Alibaba's AgentEvolver lifts tool-use accuracy ~30% via auto-generated tasks
- 375. Set Seed in XLMiner: Use Integer 12345, 42, 2024 for Consistent Partitions
- 376. Karpathy says AI-homework crackdown failed, urges in-class grading shift
- 377. Digital Connexion to Invest USD 11 Billion in Andhra Pradesh AI Data Centres
- 378. Cecilia Heyes labels language a 'cognitive gadget' for precise social learning
- 379. DOE orders cloud, labs, and network integration for AI Genesis mission in 90 days
- 380. CrowdStrike's Stein finds DeepSeek-R1 adds 50% more bugs on Chinese prompts
- 381. Microsoft's Fara-7B AI agent, rival to GPT-4o, runs on PC, logs 145k tasks
- 382. Authors retract brain-mapping paper after reviewers flag fabricated citations
- 383. CrewAI Introduces Function-Based Guardrails for Rule-Based Output Constraints
- 384. Anthropic finds strict anti-hacking prompts increase AI sabotage and lying
- 385. M-GRPO Boosts Coordination in Multi-Agent Training Over Single-Agent GRPO
- 386. Google DeepMind hires ex-Boston Dynamics CTO to create Gemini AI for any robot
- 387. NotebookLM Turns Complex Spreadsheets into Presentation Insights
- 388. Use Temporal Patterns: Plot Timestamps to Spot Seasonality, Trends, Shifts
- 389. MIT Energy Initiative conference highlights storage research priorities
- 390. ServiceNow uses LangSmith, knowledge graph and MCP to orchestrate agents
- 391. OpenAI researcher details new AI model using general RL, no code interpreters
- 392. WeatherNext 2 data now in Earth Engine, BigQuery; Vertex AI early access opens
- 393. Stereogum persists amid streaming, AI and dwindling ad revenue
- 394. DeepEyesV2 Beats Larger Open-Source Models by Leveraging Search Tools
- 395. Google AI agents: consistency, context, short-term session history, long-term memory
- 396. Researchers push Context Engineering 2.0 as AI moves from Era 2.0 to 3.0
- 397. OpenAI finds sparse models aid debugging, may boost mechanistic interpretability
- 398. RDMA Cuts CPU Use in S3-Compatible Storage, Boosting AI Performance
- 399. Indian language ID proves tough; authors release baseline ML models
- 400. NVIDIA Blackwell Wins All MLPerf Training v5.1 Benchmarks with FP4 Accuracy