Wednesday, June 14, 2017

ALTERing partition bounds of a partition

PostgreSQL 10 is full with a lot of big, new and exciting features. Declarative partitioning is one of those. It is something users have wanted for years. During PGCon 2017, it was a hot topic of discussion. People wanted to know more about the feature, and were eager to try it out. The un-conference and conference session on partitioning attracted a large crowd. One of the frequently asked questions centred on whether a user can change partition bounds of an existing partition. This bears exploring, and an answer.

Implicit to the question is the use of the ALTER TABLE command. For those who are new to PostgreSQL or to PostgreSQL partitioning, partitions are tables in PostgreSQL. Database administrators use the CREATE TABLE command to create partitions using the PARTITION OF clause. Essentially, users then expect an ALTER TABLE subcommand that allows a change to the partition bounds e.g. ALTER TABLE ... ALTER FOR VALUES ... or a similar kind of command. But, there's no such ALTER TABLE subcommand in PostgreSQL v10. We may add it in the future versions, but we have not seen any such proposal yet.

DBAs planning to use partitioning features to be  introduced in PostgreSQL v10 should not do so lightly. First, bad partitioning is worse than no partitioning. It's critical to choose partitioning keys, strategy, and ranges/lists after significant thought and testing. Second, there are many missing functionalities in v10 partitioning like SPLIT, MERGE. Like many other major features in PostgreSQL, partitioning will take a few (or possibly, just a couple) releases to be functionally complete or close to complete. However, the above functional deficiency is not hard to overcome, and what follows is how to do it.

Let's create a partitioned table with three partitions:

CREATE TABLE t1 (a int, b int) PARTITION BY RANGE(a);
CREATE TABLE t1p1 PARTITION OF t1 FOR VALUES FROM (0) TO (100);
CREATE TABLE t1p2 PARTITION OF t1 FOR VALUES FROM (200) TO (300);
CREATE TABLE t1p3 PARTITION OF t1 FOR VALUES FROM (300) TO (400);

Let's see how has that come out:

\d+ t1
                                   Table "public.t1"
Column |  Type   | Collation | Nullable | Default | Storage | Stats target 
--------+---------+-----------+----------+---------+---------+--------------
a      | integer |           |          |         | plain   |              
b      | integer |           |          |         | plain   |              
Partition key: RANGE (a)
Partitions: t1p1 FOR VALUES FROM (0) TO (100),
           t1p2 FOR VALUES FROM (200) TO (300),
           t1p3 FOR VALUES FROM (300) TO (400)

You will notice that we do not have a partition to hold rows with values for column "a" from 100 (inclusive) to 200 (exclusive). So if you try to insert a row with a = 150, it will fail with an error.

INSERT INTO t1 VALUES (150, 150);
ERROR:  no partition of relation "t1" found for row
DETAIL:  Partition key of the failing row contains (a) = (150).

Let's say you realize your mistake and want to correct it. Here's simple trick. Detach the partition that needs its bounds changed. This will simply detach the corresponding table from the partition hierarchy but does not remove it from the database. Therefore, the data in this partition remains untouched. Now attach that partition again with the changed bounds. If any data in the partition does not fit the new bounds, or some other partition has overlapping bounds, the command will fail. As a result, you have little concern of mistakes. Here are the actual commands (See ALTER TABLE … ATTACH/DETACH documentation for more details.):

BEGIN TRANSACTION;
ALTER TABLE t1 DETACH PARTITION t1p1;
ALTER TABLE t1 ATTACH PARTITION t1p1 FOR VALUES FROM (0) TO (200);
COMMIT TRANSACTION;

If you noticed and question the BEGIN/COMMIT transaction block around the above commands, that’s to ensure that the table remains inaccessible while the bounds are being changed. This is to prevent another transaction from adding a partition with a conflicting range or adding something to t1p1 which would conflict with the new partition bounds. Please note that ATTACH PARTITION would check that all the rows in the table comply with the new partitioning constraints. This would cause the whole table to be scanned and thus affect performance if there are many rows in that table.

Here’s how the new partitions look like. Notice the range of partition t1p1.
\d+ t1
                                   Table "public.t1"
Column |  Type   | Collation | Nullable | Default | Storage | Stats target
--------+---------+-----------+----------+---------+---------+--------------
a      | integer |           |          |         | plain   |          
b      | integer |           |          |         | plain   |          
Partition key: RANGE (a)
Partitions: t1p1 FOR VALUES FROM (0) TO (200),
           t1p2 FOR VALUES FROM (200) TO (300),
           t1p3 FOR VALUES FROM (300) TO (400)

The table will accept a row with a = 150.
INSERT INTO t1 VALUES (150, 150);
INSERT 0 1

The blog was first published on The EDB blogs.

No comments:

Post a Comment