Database sharding is a technique used to horizontally scale a database by splitting (or partitioning) large datasets into smaller, faster, and more manageable parts called shards.
Each shard is an independent database that holds a subset of the overall data. Together, all the shards make up the complete dataset.
How It Works:
- Instead of storing all your data in one huge table in one database,
- You split the data across multiple databases or servers,
- Usually based on a sharding key (e.g., `user_id`, `region`, `order_id`).
For example, with range-based sharding on the user ID:
| User ID | Shard Location |
|---|---|
| 1–1000 | Database A |
| 1001–2000 | Database B |
| 2001–3000 | Database C |
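The mapping in the table can be sketched as a small routing function. This is only an illustration: the names `RANGE_UPPER_BOUNDS`, `SHARDS`, and `shard_for_user` are hypothetical, and a real system would return a connection to the chosen database rather than its name.

```python
from bisect import bisect_left

# Inclusive upper bound of each user ID range, in ascending order.
# These values mirror the example table; they are illustrative only.
RANGE_UPPER_BOUNDS = [1000, 2000, 3000]
SHARDS = ["Database A", "Database B", "Database C"]


def shard_for_user(user_id: int) -> str:
    """Return the name of the shard that holds the given user ID."""
    if user_id < 1 or user_id > RANGE_UPPER_BOUNDS[-1]:
        raise ValueError(f"user_id {user_id} is outside every shard range")
    # bisect_left returns the index of the first upper bound >= user_id,
    # which is also the index of the shard that owns that range.
    return SHARDS[bisect_left(RANGE_UPPER_BOUNDS, user_id)]
```

With this sketch, `shard_for_user(1000)` resolves to Database A and `shard_for_user(1001)` to Database B, matching the range boundaries in the table.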
Benefits of Sharding
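- Scalability: Each shard can be hosted on a different server, so capacity grows by adding servers (horizontal scaling) rather than by upgrading a single machine.
- Performance: A query that includes the sharding key touches only one shard, which holds a smaller dataset, so it typically runs faster.
- Availability: If one shard fails, only its subset of the data is affected; the other shards can continue to operate.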
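The availability point can be illustrated with a toy in-memory model. Everything here is an assumption made for the sketch: `InMemoryShard`, `ShardedStore`, and `ShardUnavailable` are hypothetical names, each shard is a plain dictionary standing in for a separate database server, and keys are routed by fixed ranges of 1000 user IDs.

```python
class ShardUnavailable(Exception):
    """Raised when the shard that owns a key cannot be reached."""


class InMemoryShard:
    """Hypothetical stand-in for one independent database server."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.online = True  # flip to False to simulate a failure

    def put(self, key, row):
        if not self.online:
            raise ShardUnavailable(self.name)
        self.rows[key] = row

    def get(self, key):
        if not self.online:
            raise ShardUnavailable(self.name)
        return self.rows.get(key)


class ShardedStore:
    """Routes each user ID to exactly one shard by fixed-size ranges."""

    def __init__(self, shard_names, ids_per_shard=1000):
        self.shards = [InMemoryShard(name) for name in shard_names]
        self.ids_per_shard = ids_per_shard

    def _shard_for(self, user_id):
        # IDs 1-1000 map to index 0, 1001-2000 to index 1, and so on.
        index = (user_id - 1) // self.ids_per_shard
        if user_id < 1 or index >= len(self.shards):
            raise ValueError(f"user_id {user_id} is outside every shard range")
        return self.shards[index]

    def put(self, user_id, row):
        self._shard_for(user_id).put(user_id, row)

    def get(self, user_id):
        return self._shard_for(user_id).get(user_id)
```

If a store is built with three shards and the second one is marked offline (`store.shards[1].online = False`), a lookup for user 42 still succeeds because it is served by the first shard, while a lookup for user 1500 raises `ShardUnavailable`. Only the failed shard's range is affected.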