Re: Declarative partitioning
От | Amit Langote |
---|---|
Тема | Re: Declarative partitioning |
Дата | |
Msg-id | 573ECCC3.7080104@lab.ntt.co.jp обсуждение исходный текст |
Ответ на | Re: Declarative partitioning (Ildar Musin <i.musin@postgrespro.ru>) |
Ответы |
Re: Declarative partitioning
|
Список | pgsql-hackers |
Hi Ildar, On 2016/05/19 0:36, Ildar Musin wrote: > > Here is an experimental patch that optimizes planning time for range > partitioned tables (it could be considered as a "proof of concept"). Patch > should be applied on top of Amit's declarative partitioning patch. It > handles only a very special case (often used though) where partitioning > key consists of just a single attribute and doesn't contain expressions. Great, thanks! I understand that it's still PoC and the point may be just to consider performance implications of excessive partdesc copying but I'm wondering about a few things about the patch in general. See below. > The main idea is the following: > * we are looking for clauses like 'VAR OP CONST' (where VAR is > partitioning key attribute, OP is a comparison operator); > * using binary search find a partition (X) that fits CONST value; > * based on OP operator determine which partitions are also covered by > clause. There are possible cases: > 1. If OP is '<' or '<=' then we need partitions standing left from X > (including) > 2. If OP is '>' or '>=' then we need partitions standing right from X > (including) > 3. If OP is '=' the we need only X partition > (for '<' and '>' operators we also check if CONST value is equal to a > lower or upper boundary (accordingly) and if it's true then exclude X). > > For boolean expressions we evaluate left and right sides accordingly to > algorithm above and then based on boolean operator find intersection (for > AND) or union (for OR). Perhaps you're already aware but may I also suggest looking at how clauses are matched to indexes? For example, consider how match_clauses_to_index() in src/backend/optimizer/path/indxpath.c works. Moreover, instead of pruning partitions in planner prep phase, might it not be better to do that when considering paths for the (partitioned) rel? IOW, instead of looking at parse->jointree, we should rather be working with rel->baserestrictinfo. Although, that would require some revisions to how append_rel_list, simple_rel_list, etc. are constructed and manipulated in a given planner invocation. Maybe it's time for that... Again, you may have already considered these things. > I run some benchmarks on: > 1. original constraint exclusion mechanism, > 2. optimized version (this patch) and > 3. optimized version using relation->rd_partdesc pointer instead of > RelationGetPartitionDesc() function (see previous discussion). > > Initial conditions: > > CREATE TABLE abc (id SERIAL NOT NULL, a INT, dt TIMESTAMP) PARTITION BY > RANGE (a); > CREATE TABLE abc_1 PARTITION OF abc FOR VALUES START (0) END (1000); > CREATE TABLE abc_2 PARTITION OF abc FOR VALUES START (1000) END (2000); > ... > etc > INSERT INTO %s (a) SELECT generate_series(0, <partitions_count> * 1000); > > pgbench scripts: > https://gist.github.com/zilder/872e634a8eeb405bd045465fc9527e53 (where > :partitions is a number of partitions). > The first script tests fetching a single row from the partitioned table. > Results (tps): > > # of partitions | constraint excl. | optimized | optimized (using > pointer) > ----------------+------------------+---------------+---------------------------- > > 100 | 658 | 2906 | 3079 > 1000 | 45 | 2174 | 3021 > 2000 | 22 | 1667 | 2919 > > > The second script tests fetching all data from a single partition. Results > (tps): > > # of partitions | constraint excl. | optimized | optimized (using > pointer) > ----------------+------------------+---------------+---------------------------- > > 100 | 317 | 1001 | 1051 > 1000 | 34 | 941 | 1023 > 2000 | 15 | 813 | 1016 > > Optimized version works much faster on large amount of partitions and > degradates slower than constraint exclusion. But still there is a > noticeable performance degradation from copying PartitionDesc structure: > with 2000 partitions RelationGetPartitionDesc() function spent more than > 40% of all execution time on copying in first benchmark (measured with > `perf`). Using reference counting as Amit suggests will allow to > significantily decrease performance degradation. Could you try with the attached updated set of patches? I changed partition descriptor relcache code to eliminate excessive copying in previous versions. Thanks, Amit
Вложения
- 0001-Add-syntax-to-specify-partition-key-v5.patch
- 0002-Add-a-IGNORE-dependency-type-v5.patch
- 0003-Infrastructure-for-creation-of-partitioned-tables-v5.patch
- 0004-Add-syntax-to-create-partitions-v5.patch
- 0005-Infrastructure-for-partition-metadata-storage-and-ma-v5.patch
- 0006-Introduce-tuple-routing-for-partitioned-tables-v5.patch
В списке pgsql-hackers по дате отправления: