This is not a comprehensive list of all of the commonly used terms, but rather it is a list of terms that are commonly confused or conflated with other terms. In each section we will identify common phrases, define our specific usage and list external references for the term in question.
For our purposes, when referring to partitioning we are refering to a local optimization of splitting one large table into many small tables. Additionally, this is done in a way such that the application accessing this partitioned table is mostly unaware of the changes. Queries benefit from partitioning only if they use the partitioning key as a filter.
Definition - "Partitioning refers to splitting what is logically one large table into smaller physical pieces." Reference: PostgreSQL: Documentation: 11: 5.10. Table Partitioning
To learn more about the implementation details and performance improvements of partitioning please read the Database Team handbook entry: Database Partitioning
The simplest definition of sharding is the process of breaking up portions of your database (e.g. large tables) into smaller tables and distributing those smaller tables across multiple servers. Where sharding can get confusing, especially with PostgreSQL, is that most definitions of sharding use the term
partitioning to describe how the tables are broken down and distributed. This can lead to the terms
partitioning being conflated.
For our purposes, when referring to sharding we are referring to horizontally distributing data across multiple database clusters ("shards"). Each shard only contains a subset of the overall data.
Another variant of sharding is
application sharding This typically refers to an architecture that implements routing logic to route data requests from the application to the correct data repository. During our discussions in the Sharding Working Group we typically referred to the Shopify Pods Architecture as an example of application sharding.
Foreign Data Wrappers allow for accessing remote data from multiple data sources. FDW can be used to create an integrated view from your PostgreSQL instance into many different types of resources, but for our purposes we are focusing on integrating multiple PostgreSQL servers. Foreign Data Wrappers (FDW) are an implementation detail that can change your PostgreSQL partitioning implementation to what is considered a Built-in Sharding implementation.
With PostgreSQL, partitioning is the first implementation step taken on the path to sharding using Foreign Data Wrappers (FDW). More information and implementation details can be found in the Database Team Handbook Entry: PostgreSQL 11 sharding with foreign data wrappers and partitioning