As applications grow, a single database can eventually become a bottleneck. Queries slow down, storage limits approach, and scaling vertically (adding more CPU, RAM, or disk to one server) becomes expensive and unsustainable. That’s where database sharding comes in — a technique that splits data across multiple databases to improve scalability and performance.
In this post, we’ll explain what sharding is, why it’s used, the different sharding strategies, and the trade-offs you should understand before implementing it.
Database sharding is the process of horizontally partitioning a large dataset into smaller, independent chunks called shards. Each shard contains a subset of the total data and is stored on its own database instance.
Think of it like splitting a huge phonebook into multiple smaller ones:
Instead of one giant table of users…
You store users across many databases…
Each shard holds only a portion of the users.
Applications route queries to the correct shard based on a shard key — a field such as user ID, account ID, or region.
Sharding is typically introduced when a database reaches its limits in one or more of the following areas:
Storage limits — the dataset no longer fits comfortably on a single node
Performance degradation — queries slow down as tables grow
Write throughput bottlenecks — one primary database cannot handle all writes
High read load — read replicas aren’t enough anymore
Business scale and regional growth — data needs to live closer to users or within specific jurisdictions
By distributing data across nodes, sharding enables:
more total storage capacity
higher throughput across shards
reduced contention and lock pressure
better performance at large scale
Before adopting sharding, teams often use other scaling techniques:
Vertical scaling (scale-up): bigger server, more resources
Caching: reduces repeated reads
Read replicas: spread read load, but not writes
Partitioning inside a single database: logical separation but same node
Sharding goes a step further — it splits data across multiple independent databases, each operating as its own unit.
The shard key determines how data is distributed. Choosing a good shard key is one of the most critical design decisions.
A good shard key should:
evenly distribute data across shards
avoid hotspots (too many writes on one shard)
support common query patterns
remain stable over time
A poor shard key can cause:
overloaded shards
unbalanced storage
cross-shard joins and queries
painful migrations later
There are several ways to split data across shards. The right approach depends on your workload and data model.
Hash-based sharding: the shard key is hashed, and the hash value determines which shard stores the row.
Example:
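A minimal sketch in Python, assuming four shards and integer user IDs (the shard count and key type are illustrative, not prescribed by any particular database):

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count for illustration

def shard_for(user_id: int) -> int:
    """Map a user ID to a shard number with a stable hash.

    A deterministic digest (MD5 here) is used instead of Python's
    built-in hash(), which is salted per process for strings and
    therefore unsafe for routing decisions.
    """
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS
```

The same key always routes to the same shard, and keys spread roughly evenly across all shards. Note that `% NUM_SHARDS` routing means changing the shard count remaps most keys, which is part of why resharding is painful with this scheme.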
Pros
good load distribution
prevents hotspots
Cons
queries spanning many keys must fan out to multiple shards
resharding can be complex
Best for: large, uniform datasets such as user accounts.
Range-based sharding: data is divided by a value range — for example:
Shard 1 → users 1–1,000,000
Shard 2 → users 1,000,001–2,000,000
Pros
efficient range queries
easy to reason about
Cons
hotspots when new data always lands in the same range
shards can grow unevenly
Best for: time-series data, logs, or sequential IDs (with safeguards).
Entity- or directory-based sharding: data is grouped by logical boundaries such as:
region (EU, US, APAC)
customer tenant or organization
product line
Pros
improves data locality
simplifies compliance constraints
isolates tenant workloads
Cons
uneven tenant sizes
cross-region queries are harder
Best for: multi-tenant SaaS platforms and regional architectures.
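Directory-based routing often reduces to an explicit lookup table mapping each logical group to a shard. A minimal sketch, with hypothetical connection strings standing in for real shard endpoints:

```python
# Hypothetical directory: region -> shard connection string.
DIRECTORY = {
    "EU": "postgres://eu-shard.example.internal/app",
    "US": "postgres://us-shard.example.internal/app",
    "APAC": "postgres://apac-shard.example.internal/app",
}

def shard_for(region: str) -> str:
    """Resolve a region to its shard; fail loudly for unknown regions."""
    try:
        return DIRECTORY[region]
    except KeyError:
        raise ValueError(f"no shard registered for region {region!r}")
```

In practice the directory usually lives in a small, strongly consistent metadata store rather than in application code, so new tenants or regions can be added without redeploying.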
Routing can be implemented in several layers:
Application-level routing — app determines shard before querying
Shard router / proxy layer — middleware routes traffic
Database-managed routing — supported in some distributed databases
Typical flow:
Determine shard based on key
Open connection to correct shard
Execute query only on that shard
Some operations (like analytics or global searches) may require fan-out queries across multiple shards — something to minimize when possible.
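The two query paths described above, single-shard routing and fan-out, can be sketched as follows. The `run_query` callable is a hypothetical stand-in for whatever database client you actually use:

```python
from typing import Any, Callable

def route_query(shard_for: Callable[[Any], str],
                run_query: Callable[[str, str], list],
                key: Any, sql: str) -> list:
    """Single-shard path: resolve the shard from the key,
    then execute the query only on that shard."""
    return run_query(shard_for(key), sql)

def fan_out_query(shards: list[str],
                  run_query: Callable[[str, str], list],
                  sql: str) -> list:
    """Fan-out path: execute the query on every shard and
    merge the results. Expensive; use sparingly."""
    results = []
    for shard in shards:
        results.extend(run_query(shard, sql))
    return results
```

The fan-out path costs one round trip per shard and requires merging (and often re-sorting or re-aggregating) results in the application, which is why the post recommends minimizing such queries.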
Sharding changes how you design queries and transactions.
Local transactions (within one shard) behave normally
Cross-shard transactions are harder and often avoided
Joins across shards usually must be handled in application logic
Global constraints (e.g., unique email across all users) require coordination
Sharded systems typically embrace:
denormalization
async workflows
eventually consistent patterns
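One common pattern for global constraints (an assumption here, not something the post prescribes) is a small coordination store that owns the globally unique keys, consulted before writing to any shard. A minimal in-memory sketch:

```python
class EmailRegistry:
    """Sketch of a coordination store that owns the global
    email -> shard mapping, consulted before inserting a user.
    An in-memory dict stands in for a real, durable store."""

    def __init__(self):
        self._emails: dict[str, int] = {}

    def claim(self, email: str, shard: int) -> bool:
        # In production this check-and-insert must be atomic,
        # e.g. enforced by a unique index in the registry's database.
        if email in self._emails:
            return False
        self._emails[email] = shard
        return True
```

The registry itself becomes a shared dependency, so it is typically kept tiny (keys only, no payload) and backed by a database that can enforce uniqueness atomically.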
Eventually, shards may become unbalanced or outgrow capacity. Resharding involves:
splitting a shard into two
moving data to new nodes
updating routing rules
ensuring traffic continues safely during migration
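For a range-sharded table, the "split a shard into two" step amounts to inserting a new boundary into the routing table. A simplified sketch (the tuple layout mirrors a sorted list of `(upper_bound, shard_name)` entries; names and boundaries are illustrative):

```python
def split_range(ranges: list[tuple[int, str]], shard: str,
                midpoint: int, new_shard: str) -> list[tuple[int, str]]:
    """Split one (upper_bound, shard) entry at `midpoint`.

    The old shard keeps keys up to `midpoint`; keys above it route to
    `new_shard`. Data in that upper slice must be copied to the new
    node before this routing table goes live, and writes must be
    drained or dual-written during the cutover.
    """
    out = []
    for upper, name in ranges:
        if name == shard:
            out.append((midpoint, name))
            out.append((upper, new_shard))
        else:
            out.append((upper, name))
    return out
```

The routing-table edit is the easy part; the data copy and the safe cutover around it are where the operational complexity lives.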
This is one of the most operationally complex parts of sharding — which is why good upfront shard-key design matters so much.
Sharding makes sense when:
data volume is extremely large
write throughput is the main bottleneck
vertical scaling and replicas are exhausted
the business is growing rapidly
Avoid sharding too early if:
a single node still handles the workload
indexing, caching, or schema tuning can solve the problem
operational maturity isn’t there yet
Sharding introduces complexity — it should be a last-stage scaling solution, not the first.
Database sharding works by splitting a large dataset into smaller, independent shards, each hosted on its own database instance. This improves scalability, storage capacity, and throughput — but introduces new challenges around routing, joins, transactions, and operations. Choosing the right shard key and sharding strategy is critical, and sharding should generally be adopted only when simpler scaling options are no longer sufficient.