HomeCategoriesAll Tags

UUID vs SERIAL Primary Keys in SQL Databases

Should your relational database primary key be a UUID or a SERIAL/IDENTITY auto-increment integer? Let’s do a deep, practical, systems-level comparison.

Choosing a primary key is one of those “small” schema decisions that quietly affects performance, replication strategy, sharding, API design, caching, privacy, and even incident recovery.

This post compares two common choices for surrogate primary keys in SQL databases:

  • Monotonic integer (SERIAL, BIGSERIAL, IDENTITY, AUTO_INCREMENT)
  • UUID (UUID type in Postgres, BINARY(16) / CHAR(36) patterns in MySQL)

I’ll focus on what matters in practice: storage, indexing, write amplification, locality, distributed systems, security implications, and operational ergonomics.


Quick Definitions (So We’re Talking About the Same Thing)

SERIAL / IDENTITY (Auto-increment integer)

  • PostgreSQL: historically SERIAL/BIGSERIAL, modern: GENERATED {ALWAYS|BY DEFAULT} AS IDENTITY
  • MySQL: AUTO_INCREMENT

These generate an increasing integer (or bigint) per row.

UUID

A 128-bit identifier, usually rendered as xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.

Common versions:

  • UUIDv4: random (very common)
  • UUIDv1: time+node based (ordered-ish, but leaks info)
  • UUIDv7: time-ordered (modern choice for “UUID but index-friendly”)

In relational DBs, UUIDs can be stored:

  • As a native UUID type (Postgres)
  • As BINARY(16) (recommended for MySQL when using UUIDs)
  • As CHAR(36) (works, but bigger + slower)

The Core Tradeoff

If you boil everything down:

  • Auto-increment integers optimize for storage + write locality + simple indexing.
  • UUIDs optimize for global uniqueness + distributed generation + safer public exposure.

Neither is “always better”. The right choice depends on how your system is deployed and how identifiers flow through it.


Comparison Matrix (High Signal)

DimensionSERIAL/IDENTITY (int/bigint)UUID (v4 random)UUID (v7/time-ordered)
Storage for PK4 bytes (int) / 8 bytes (bigint)16 bytes (binary) + type overhead16 bytes (binary) + type overhead
Index sizeSmallLargerLarger
Insert locality in B-treeExcellent (append-ish)Poor (random)Good (mostly increasing)
Write amplification / page splitsLowHigherLower than v4
Generation locationUsually DBApp, DB, anywhereApp, DB, anywhere
Multi-writer / offline creationHardEasyEasy
Sharding / merging datasetsHarderEasierEasier
Public exposure (enumeration risk)High unless mitigatedLowerLower
Debuggability / ergonomicsGreat (short)Mixed (long)Mixed (long)
Cross-system uniquenessNeeds coordinationBuilt-inBuilt-in

Note: “UUID (v4 random)” is the worst-case for index locality. If you want UUID benefits without killing write patterns, consider UUIDv7 (or ULID-like time-ordered IDs).


1) Storage & Index Cost (This Matters More Than People Think)

Primary keys aren’t just a column. They often sit inside multiple indexes.

Why UUIDs can get expensive

In most relational engines, secondary indexes store:

  • the indexed column(s)
  • plus the primary key (as the “row pointer” / clustering key)

So if your PK is large, every secondary index entry becomes larger.

Practical effect:

  • bigger indexes = more RAM needed for hot working set
  • more IO = slower queries under load
  • more storage = more backups + slower restores

Rule of thumb

  • If your table has many secondary indexes, UUID PKs tend to cost more than you expect.
  • If you need UUIDs, storing them efficiently (native UUID / BINARY(16)) helps a lot.

2) B-Tree Locality, Page Splits, and Write Amplification

Most SQL databases use B-tree indexes for primary keys.

Auto-increment: append-friendly

Increasing keys mean inserts happen near the “right edge” of the B-tree.

  • fewer page splits
  • better cache locality
  • often higher sustained write throughput

UUIDv4: random inserts

Random keys insert all over the tree:

  • more page splits
  • more fragmented pages
  • worse locality
  • can increase WAL/binlog + IO pressure

This difference becomes very visible at scale (high insert rates, large tables, constrained IO).

UUIDv7: best of both worlds (often)

UUIDv7 is time-ordered. Inserts tend to cluster, closer to append-like behavior while staying globally unique.

If your motivation for UUID is distributed generation (not “randomness”), UUIDv7 is usually a stronger choice than UUIDv4.


3) Replication, Multi-Region, and “Who Gets to Mint IDs?”

SERIAL/IDENTITY assumes a single allocator

With auto-increment IDs, one place is effectively responsible for ID allocation:

  • The primary DB node (common)
  • or a coordinated ID service (less common)

This is fine for single-primary setups. It gets trickier when:

  • you have multiple writers
  • you have offline/edge writes
  • you merge data from multiple sources

You can solve it with sequences-per-shard or ID blocks, but that’s extra design work.

UUIDs are coordination-free

UUIDs let every writer generate IDs safely:

  • mobile clients can create objects offline
  • multiple regions can insert without pre-allocating blocks
  • multiple services can create records independently

If you anticipate multi-writer or eventual multi-region writes, UUIDs are often operationally simpler.


4) Sharding and Data Migrations

Sharding isn’t only for huge companies. Even modest systems shard for:

  • tenant isolation
  • compliance
  • cost management
  • burst handling

SERIAL can collide across shards

If shard A and shard B both generate id=1..N, merging datasets or moving tenants between shards becomes awkward.

Solutions exist:

  • allocate non-overlapping ranges per shard (e.g., shard_id prefix scheme)
  • use a “Snowflake”-style ID generator
  • use bigint keys with embedded shard bits

UUID makes merges easier

UUIDs don’t collide by design. When you:

  • merge databases
  • restore and replay data
  • move records between shards

…you don’t have to worry about PK renumbering or range collisions.


5) Security & Privacy: Enumerability Is Real

Auto-increment IDs are predictable. That creates risks:

  • GET /orders/10001 → try /orders/10002
  • scraping, reconnaissance, and data inference

You can mitigate with authorization, of course—but predictable IDs increase the blast radius of mistakes.

If IDs appear in URLs or external APIs

  • UUIDs are generally safer to expose publicly.
  • Integer IDs can still be safe, but you must be disciplined with access control.

Common pattern in mature systems:

  • internal PK: bigint
  • public identifier: UUID (or another opaque token)

This gives performance internally and safety externally.


6) Human Factors: Debuggability and Operational Ergonomics

This is underrated.

Integers are pleasant

  • easy to read in logs
  • easy to paste
  • easy to compare
  • easy to eyeball ordering

UUIDs can be painful

  • logs become noisy
  • people mistype them
  • some systems wrap lines, making copy/paste annoying

If your team spends a lot of time debugging production issues with ad-hoc SQL, bigint IDs can materially improve day-to-day speed.


7) Query Patterns and “Hotspot” Risks

Auto-increment can hotspot

Because inserts hit the end of the index, contention can appear on high concurrency workloads (depending on engine internals and isolation level). Modern engines handle this reasonably well, but it can show up.

Random UUID distributes writes

Randomness spreads insert locations, which can reduce certain right-edge contention patterns—but at the cost of page splits and cache locality.

In practice:

  • If you want distribution without fragmentation, prefer time-ordered IDs (UUIDv7) over UUIDv4.

Concrete SQL Examples

PostgreSQL: IDENTITY (bigint) primary key

CREATE TABLE users (
	id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
	email TEXT NOT NULL UNIQUE,
	created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

PostgreSQL: UUID primary key (DB-generated)

If you want the DB to generate UUIDs:

-- Depending on your setup, you might use pgcrypto
CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE TABLE users (
	id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
	email TEXT NOT NULL UNIQUE,
	created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

If you generate UUIDs in the application (Node.js), you’d insert explicitly.

MySQL: AUTO_INCREMENT

CREATE TABLE users (
	id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
	email VARCHAR(255) NOT NULL,
	created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
	PRIMARY KEY (id),
	UNIQUE KEY uq_users_email (email)
);

MySQL: UUID stored as BINARY(16) (recommended style)

Storing UUID as CHAR(36) works but is larger and slower. BINARY(16) is typically better.

CREATE TABLE users (
	id BINARY(16) NOT NULL,
	email VARCHAR(255) NOT NULL,
	created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
	PRIMARY KEY (id),
	UNIQUE KEY uq_users_email (email)
);

You then convert at the app boundary (store as 16 bytes, display as string).


Scenario-Based Guidance (When Each Choice Wins)

Choose SERIAL/IDENTITY when…

  1. You have a single primary writer (or a straightforward primary-replica topology)
  • Most CRUD apps, internal tools, typical web backends.
  1. You need maximum write throughput and predictable performance
  • High insert rate tables, heavy OLTP workloads.
  1. You have many secondary indexes
  • Smaller PK reduces secondary index bloat.
  1. IDs are mostly internal
  • IDs not exposed publicly, or you can safely avoid enumeration.

Concrete example:

  • A monolithic Postgres app with users, orders, payments, moderate scale, lots of joins, heavy analytics read queries.

Choose UUID when…

  1. You need distributed ID generation
  • Multiple services create records independently.
  • Mobile/offline clients create entities and sync later.
  1. You expect database merges / tenant moves / sharding
  • Merging datasets without PK collisions becomes dramatically easier.
  1. IDs are exposed publicly
  • Public REST resources benefit from non-enumerable IDs.

Concrete example:

  • A multi-tenant SaaS with separate databases per region, periodic tenant migration between clusters.

Prefer UUIDv7 (or time-ordered IDs) if you choose UUID

If you want UUID benefits without the worst index fragmentation behaviors:

  • UUIDv4 → simplest but worst locality
  • UUIDv7 → better locality, still globally unique

A Very Practical “Best of Both” Pattern

Many teams end up here:

  • Primary key: BIGINT (fast, small, index-friendly)
  • Public ID: UUID (opaque, safe in URLs)

Example:

CREATE TABLE orders (
	id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
	public_id UUID NOT NULL DEFAULT gen_random_uuid(),
	user_id BIGINT NOT NULL,
	amount_cents INT NOT NULL,
	created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
	UNIQUE (public_id)
);

Your API uses public_id:

  • GET /orders/{public_id}

Your internal joins use id:

  • smaller/faster indexes
  • friendlier debugging

This pattern is especially good if you want safe external references but don’t actually need distributed minting of the primary key.


Common Myths (And What’s Actually True)

“UUIDs are always slower”

Not always—but UUIDv4 PKs commonly cause slower inserts at scale due to index locality and larger indexes. UUIDv7 can narrow the gap.

“Integers are insecure so you must use UUID”

Security comes from authorization and access control, not ID format. But predictable IDs do increase risk when a bug slips in.

“UUIDs are required for microservices”

Microservices require clear boundaries and contracts. UUIDs can help with distributed creation and cross-system merging, but they’re not mandatory.


Recommendations (If You Want a Default)

If you want a pragmatic default for most SQL-backed web apps:

  • Use BIGINT IDENTITY (or AUTO_INCREMENT) as the primary key.
  • Add a separate public_id UUID if the ID leaks into URLs, logs, or integrations.

Choose UUID as the primary key when distributed generation or cross-database merges are first-class requirements.

If choosing UUID PKs:

  • prefer time-ordered UUIDs (like UUIDv7) if available in your stack
  • store efficiently (UUID type in Postgres, BINARY(16) in MySQL)

Conclusion

SERIAL/IDENTITY keys are excellent for performance and simplicity in single-writer relational systems. UUIDs shine when you need coordination-free global uniqueness, safer public exposure, and easier cross-system merging.

The most robust real-world approach is often a hybrid: bigint PK for the database, UUID for the outside world.

I hope this was useful. Feel free to drop your comments below.

- Ayush 🙂