Re: [HACKERS] path toward faster partition pruning
| От | Amit Langote |
|---|---|
| Тема | Re: [HACKERS] path toward faster partition pruning |
| Дата | |
| Msg-id | 0ed58efc-3320-909f-faf8-1413762986b0@lab.ntt.co.jp обсуждение исходный текст |
| Ответ на | Re: [HACKERS] path toward faster partition pruning (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>) |
| Ответы |
Re: [HACKERS] path toward faster partition pruning
|
| Список | pgsql-hackers |
On 2018/04/10 13:27, Ashutosh Bapat wrote: > On Mon, Apr 9, 2018 at 8:56 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Fri, Apr 6, 2018 at 11:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> David Rowley <david.rowley@2ndquadrant.com> writes: >>>> Sounds like you're saying that if we have too many alternative files >>>> then there's a chance that one could pass by luck. >>> >>> Yeah, exactly: it passed, but did it pass for the right reason? >>> >>> If there's just two expected-files, it's likely not a big problem, >>> but if you have a bunch it's something to worry about. >>> >>> I'm also wondering how come we had hash partitioning before and >>> did not have this sort of problem. Is it just that we added a >>> new test that's more sensitive to the details of the hashing >>> (if so, could it be made less so)? Or is there actually more >>> platform dependence now than before (and if so, why is that)? >> >> The existing hash partitioning tests did have some dependencies on the >> hash function, but they took care not to use the built-in hash >> functions. Instead they did stuff like this: >> >> CREATE OR REPLACE FUNCTION hashint4_noop(int4, int8) RETURNS int8 AS >> $$SELECT coalesce($1,0)::int8$$ LANGUAGE sql IMMUTABLE; >> CREATE OPERATOR CLASS test_int4_ops FOR TYPE int4 USING HASH AS >> OPERATOR 1 = , FUNCTION 2 hashint4_noop(int4, int8); >> CREATE TABLE mchash (a int, b text, c jsonb) >> PARTITION BY HASH (a test_int4_ops, b test_text_ops); >> >> I think that this approach should also be used for the new tests. >> Variant expected output files are a pain to maintain, and you >> basically just have to take whatever output you get as the right >> answer, because nobody knows what output a certain built-in hash >> function should produce for a given input except by running the code. >> If you do the kind of thing shown above, though, then you can easily >> see by inspection that you're getting the right answer. Thanks for the idea. I think it makes sense and also agree that alternate outputs approach is not perfectly reliable and maintainable. > +1. Attached find a patch that rewrites hash partition pruning tests that away. It creates two hash operator classes, one for int4 and another for text type and uses them to create hash partitioned table to be used in the tests, like done in the existing tests in hash_part.sql. Since that makes tests (hopefully) reliably return the same result always, I no longer see the need to keep them in a separate partition_prune_hash.sql. The reasoning behind having the separate file was to keep the alternative output file small as David explained in [1]. However, I noticed that there is a bug in RelationBuildPartitionKey that causes a crash when using a SQL function as partition support function as the revised tests do, so please refer to and apply the patches I posted here before running the revised tests: https://www.postgresql.org/message-id/3041e853-b1dd-a0c6-ff21-7cc5633bffd0%40lab.ntt.co.jp Thanks, Amit [1] https://www.postgresql.org/message-id/CAKJS1f-SON_hAekqoV4_WQwJBtJ_rvvSe68jRNhuYcXqQ8PoQg%40mail.gmail.com
Вложения
В списке pgsql-hackers по дате отправления: