OpenAI scales PostgreSQL for 800M users, optimizing databases for AI applications and semantic search. (Source: timescale.com)

OpenAI Scales PostgreSQL for 800M User Surge



Scaling a database to serve 800 million users isn’t a headline‑grabbing tech stunt; it’s a daily reality for the teams behind ChatGPT and the OpenAI API. While many firms reach for brand‑new architectures when traffic spikes, OpenAI chose a different path—tweaking what they already trusted. The engineers behind the scenes spent months dissecting query patterns, sharding strategies, and connection pooling, all without abandoning PostgreSQL’s core codebase.

Their goal was simple: squeeze more throughput out of a system that had already proven reliable at massive scale. The result is a PostgreSQL deployment that now handles the same workloads that once required bespoke solutions elsewhere. This approach raises a question worth pondering: when does it make sense to push an existing stack to its limits rather than rebuild from scratch?

The answer, according to OpenAI's own engineers, lies in deliberate optimization rather than wholesale reinvention.

OpenAI's PostgreSQL setup shows how far proven systems can stretch when teams optimize deliberately instead of re-architecting prematurely. "For years, PostgreSQL has been one of the most critical, under-the-hood data systems powering core products like ChatGPT and OpenAI's API," OpenAI engineer Bohan Zhang wrote in a technical disclosure. "Over the past year, our PostgreSQL load has grown by more than 10x, and it continues to rise quickly." The company achieved this scale through targeted optimizations, including connection pooling that cut connection time from 50 milliseconds to 5 milliseconds and cache locking to prevent 'thundering herd' problems where cache misses trigger database overload.
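OpenAI's disclosure doesn't include code, but the cache-locking idea can be sketched. In this illustrative Python example (the `HerdSafeCache` class and its names are hypothetical, not OpenAI's implementation), a per-key lock ensures that when a cached value is missing, only one caller recomputes it against the database while concurrent callers wait and reuse the result:

```python
import threading

class HerdSafeCache:
    """Illustrative thundering-herd protection: on a cache miss,
    only one thread per key runs the expensive loader (e.g. a DB
    query); other threads for the same key wait and reuse it."""

    def __init__(self, loader):
        self._loader = loader          # backend fetch, e.g. a DB query
        self._data = {}                # cached values
        self._locks = {}               # one lock per key
        self._meta_lock = threading.Lock()

    def _key_lock(self, key):
        # Create-or-get the per-key lock under a short global lock.
        with self._meta_lock:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key):
        if key in self._data:          # fast path: cache hit, no locking
            return self._data[key]
        with self._key_lock(key):      # at most one recompute per key
            if key not in self._data:  # re-check after acquiring the lock
                self._data[key] = self._loader(key)
            return self._data[key]
```

Without the per-key lock, a popular key expiring would send every concurrent request to the database at once; with it, the miss costs one backend query regardless of how many callers arrive.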

Related Topics: #PostgreSQL #OpenAI #ChatGPT #Database Scaling #Connection Pooling #Azure Database #Read Replication #Performance Optimization #AI Infrastructure

Can a single-primary PostgreSQL truly sustain 800 million users? OpenAI says it does, running ChatGPT and its API on one Azure PostgreSQL Flexible Server for all writes. Nearly fifty read replicas spread across multiple regions handle the bulk of traffic.
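A single-primary topology like this implies request routing: writes must hit the primary, while reads can fan out across replicas. The article doesn't describe OpenAI's routing layer, so the following is a minimal sketch under that assumption (the `PostgresRouter` class and its statement-prefix heuristic are illustrative, not OpenAI's code):

```python
import itertools

class PostgresRouter:
    """Illustrative router for a one-primary/many-replica topology:
    writes and DDL go to the primary, reads rotate across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replica_cycle = itertools.cycle(replicas)  # round-robin reads

    def route(self, sql):
        # Crude heuristic: classify by the first SQL keyword.
        stmt = sql.lstrip().split(None, 1)[0].upper()
        if stmt in ("SELECT", "SHOW"):
            return next(self._replica_cycle)   # read: any replica will do
        return self.primary                    # write: primary only
```

A production router would also account for replication lag (a read immediately after a write may need the primary), which is one reason the single-primary model concentrates risk on write paths.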

The company stresses that deliberate optimization, not wholesale re‑architecting, allowed the system to stretch this far. Yet the write-up stops short of detailing latency metrics or failure‑mode testing, leaving performance under peak load somewhat opaque.

While vector databases remain relevant for certain workloads, OpenAI's choice shows that a proven relational engine can still meet massive scale when tuned carefully. It remains unclear whether this model will hold as usage patterns evolve or as new features demand different data access patterns. The approach offers a concrete example of scaling through configuration rather than architectural overhaul, though its broader applicability is uncertain.


Common Questions Answered

How did OpenAI scale PostgreSQL to support millions of users without sharding?

OpenAI used a single-primary database architecture with multiple read replicas on Azure Database for PostgreSQL. They optimized their approach by reducing unnecessary writes, using lazy writes, and carefully managing write loads to prevent bottlenecks in their database system.
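The "lazy writes" idea, smoothing bursts of writes into fewer, batched operations, can be illustrated with a small coalescing buffer. This is a hedged sketch, not OpenAI's implementation; the `LazyWriter` class, `flush_fn` callback, and `max_pending` threshold are all assumed names:

```python
class LazyWriter:
    """Illustrative lazy-write buffer: coalesce frequent updates
    in memory and flush them in batches, so a burst of writes to
    the same keys costs one batched statement instead of many."""

    def __init__(self, flush_fn, max_pending=100):
        self._flush_fn = flush_fn      # e.g. one batched UPDATE/UPSERT
        self._pending = {}             # key -> latest value (coalesced)
        self._max_pending = max_pending

    def write(self, key, value):
        self._pending[key] = value     # later writes overwrite earlier ones
        if len(self._pending) >= self._max_pending:
            self.flush()

    def flush(self):
        if self._pending:
            batch, self._pending = self._pending, {}
            self._flush_fn(batch)      # single round trip for the batch
```

The key property is coalescing: ten rapid updates to the same row reach the primary as one write, which directly reduces WAL volume and replication pressure.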

What were the main challenges OpenAI faced with PostgreSQL's Multi-Version Concurrency Control (MVCC) design?

OpenAI encountered challenges with table and index bloat, complex autovacuum tuning, and increased Write-Ahead Logging (WAL) that could lead to replication lag. As the number of replicas grew, they also had to manage potential network bandwidth constraints.

How did OpenAI manage write scalability in their PostgreSQL infrastructure?

OpenAI addressed write scalability by offloading write operations wherever possible, avoiding unnecessary writes at the application level, and using controlled backfills and lazy writes to smooth out write spikes. They also strategically migrated extreme write-heavy workloads with natural sharding keys to other systems.
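A "natural sharding key" means the workload can be partitioned deterministically, e.g. by user ID, so each shard sees only its slice of writes. As an illustrative sketch (the `shard_for` function is hypothetical and says nothing about which systems OpenAI migrated to), a stable hash of the key maps it to a shard:

```python
import hashlib

def shard_for(key, n_shards):
    """Map a natural sharding key (e.g. a user id) to a shard index
    via a stable hash, so write-heavy traffic splits evenly across
    shards and the same key always lands on the same shard."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_shards
```

Stability is the point: because the mapping depends only on the key, any service instance can compute the destination shard without coordination, which is what makes such workloads easy to move off a single primary.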