Discussion: [PATCH] Support SK_SEARCHNULL / SK_SEARCHNOTNULL for heap-only scans
Hi hackers,

A colleague of mine wanted to use a ScanKey with the SK_SEARCHNULL flag for a heap-only scan (alongside other ScanKeys) and discovered that the result differs from what he expected. It turned out that this is currently not supported, as explicitly stated in skey.h. Although several workarounds come to mind, this limitation can be a real inconvenience for extension authors, and implementing the corresponding support seems to be pretty straightforward. The attached patch does this.

-- 
Best regards,
Aleksander Alekseev
Attachments
Hi,

On 2023-02-13 17:59:13 +0300, Aleksander Alekseev wrote:
> @@ -36,20 +36,36 @@ HeapKeyTest(HeapTuple tuple, TupleDesc tupdesc, int nkeys, ScanKey keys)
>  		bool		isnull;
>  		Datum		test;
>
> -		if (cur_key->sk_flags & SK_ISNULL)
> -			return false;
> +		if (cur_key->sk_flags & (SK_SEARCHNULL | SK_SEARCHNOTNULL))
> +		{
> +			/* special case: looking for NULL / NOT NULL values */
> +			Assert(cur_key->sk_flags & SK_ISNULL);
>
> -		atp = heap_getattr(tuple, cur_key->sk_attno, tupdesc, &isnull);
> +			atp = heap_getattr(tuple, cur_key->sk_attno, tupdesc, &isnull);
>
> -		if (isnull)
> -			return false;
> +			if (isnull && (cur_key->sk_flags & SK_SEARCHNOTNULL))
> +				return false;
>
> -		test = FunctionCall2Coll(&cur_key->sk_func,
> -								 cur_key->sk_collation,
> -								 atp, cur_key->sk_argument);
> +			if (!isnull && (cur_key->sk_flags & SK_SEARCHNULL))
> +				return false;

Shouldn't need to extract the column if we just want to know if it's NULL (see
heap_attisnull()). Afaics the value isn't accessed after this.

Greetings,

Andres Freund
Re: [PATCH] Support SK_SEARCHNULL / SK_SEARCHNOTNULL for heap-only scans
From: Aleksander Alekseev
Hi Andres,

> Shouldn't need to extract the column if we just want to know if it's NULL (see
> heap_attisnull()). Afaics the value isn't accessed after this.

Many thanks. Fixed.

-- 
Best regards,
Aleksander Alekseev
Attachments
Re: [PATCH] Support SK_SEARCHNULL / SK_SEARCHNOTNULL for heap-only scans
From: Heikki Linnakangas
On 14/02/2023 11:10, Aleksander Alekseev wrote:
> Hi Andres,
>
>> Shouldn't need to extract the column if we just want to know if it's NULL (see
>> heap_attisnull()). Afaics the value isn't accessed after this.
>
> Many thanks. Fixed.

I'm confused, what exactly is the benefit of this? What extension performs a
direct table scan bypassing the executor, searching for NULLs or not-NULLs?

If heapam can check for NULL/not-NULL more efficiently than the code that
calls it, sure, let's do this, and let's also see the performance test results
to show the benefit. But then let's also modify the caller in nodeSeqScan.c to
actually make use of it.

For tableam extensions, which may or may not support checking for NULLs, we
need to add an 'amsearchnulls' field to the table AM API.

- Heikki
Re: [PATCH] Support SK_SEARCHNULL / SK_SEARCHNOTNULL for heap-only scans
From: Aleksander Alekseev
Hi,

> I'm confused, what exactly is the benefit of this? What extension
> performs a direct table scan bypassing the executor, searching for NULLs
> or not-NULLs?

Basically, any extension that accesses tables without SPI in order to avoid
parsing and planning overhead for relatively simple cases. One can specify
*several* ScanKeys for a single scan, which is the equivalent of
WHERE condition(a) AND b IS NOT NULL /* AND ... */;

> If heapam can check for NULL/not-NULL more efficiently than the code
> that calls it [...]

This is done not for efficiency but rather for convenience. Additionally,
practice shows that it's very easy for an extension author to miss this
comment in skey.h:

"""
 * SK_SEARCHARRAY, SK_SEARCHNULL and SK_SEARCHNOTNULL are supported only
 * for index scans, not heap scans;
"""

... which results in many hours of debugging. The current interface is
misleading and counterintuitive.

I did my best to add as few new assembly instructions as possible, and only
one extra if/else branch. I don't expect any measurable performance
difference, since the bottleneck for SeqScans is unlikely to be the CPU in the
affected piece of code, but rather disk/locks/network/etc. On top of that, the
scenario where somebody is really worried about performance AND is using
seqscans (not index scans) AND this particular seqscan is the bottleneck (not
JOINs, etc.) seems rare, to me at least.

> For tableam extensions, which may or may not support checking for NULLs,
> we need to add an 'amsearchnulls' field to the table AM API.

This would result in an unnecessary complication of the code and expensive
extra checks that would always return true for the default heapam. I would
argue that what we actually want is to force every TAM to support checking
for NULLs, at least until somebody working on a real TAM complains about this
limitation.

> But then let's also modify the caller in nodeSeqScan.c to actually make use of it.

That could actually be a good point.

If memory serves, I noticed that WHERE ... IS NULL queries don't even hit
HeapKeyTest(), and I was curious where the check for NULLs is actually made.
As I understand it, SeqNext() in nodeSeqscan.c simply iterates over all the
tuples it can find and pushes them to the parent node. We could get slightly
better performance for certain queries if SeqNext() did the check internally.

Unfortunately, I'm not experienced enough with plan nodes to go down this
rabbit hole straight away. I suggest we make one change at a time and keep the
patchset small, as was previously requested by many people on several
occasions (the 64-bit XIDs story, etc.). I will be happy to propose a
follow-up patch accompanied by benchmarks if and when we reach consensus on
this patch.

-- 
Best regards,
Aleksander Alekseev
Re: [PATCH] Support SK_SEARCHNULL / SK_SEARCHNOTNULL for heap-only scans
From: Heikki Linnakangas
On 22/02/2023 15:03, Aleksander Alekseev wrote:
> Additionally, practice shows that for an extension author it's very
> easy to miss a comment in skey.h:
>
> """
>  * SK_SEARCHARRAY, SK_SEARCHNULL and SK_SEARCHNOTNULL are supported only
>  * for index scans, not heap scans;
> """
>
> ... which results in many hours of debugging. The current interface is
> misleading and counterintuitive.

Perhaps an Assert in heap_beginscan would be in order, to check that none of
those flags are set.

>> But then let's also modify the caller in nodeSeqScan.c to actually make use of it.
>
> That could actually be a good point.
>
> If memory serves I noticed that WHERE ... IS NULL queries don't even
> hit HeapKeyTest() and I was curious where the check for NULLs is
> actually made. As I understand, SeqNext() in nodeSeqscan.c simply
> iterates over all the tuples it can find and pushes them to the parent
> node. We could get a slightly better performance for certain queries
> if SeqNext() did the check internally.

Right, it might be faster to perform the NULL checks before checking
visibility, for example. Arbitrary quals cannot be evaluated before checking
visibility, but NULL checks could be.

> Unfortunately I'm not very experienced with plan nodes in order to go
> down this rabbit hole straight away. I suggest we make one change at a
> time and keep the patchset small as it was previously requested by
> many people on several occasions (the 64-bit XIDs story, etc). I will
> be happy to propose a follow-up patch accompanied by the benchmarks if
> and when we reach the consensus on this patch.

Ok, I don't think this patch on its own is a good idea without the other
parts, so I'll mark this as Returned with Feedback in the commitfest.

- Heikki
On Mon, Feb 27, 2023 at 12:24 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 22/02/2023 15:03, Aleksander Alekseev wrote:
> > If memory serves I noticed that WHERE ... IS NULL queries don't even
> > hit HeapKeyTest() and I was curious where the check for NULLs is
> > actually made. As I understand, SeqNext() in nodeSeqscan.c simply
> > iterates over all the tuples it can find and pushes them to the parent
> > node. We could get a slightly better performance for certain queries
> > if SeqNext() did the check internally.
>
> Right, it might be faster to perform the NULL-checks before checking
> visibility, for example. Arbitrary quals cannot be evaluated before
> checking visibility, but NULL checks could be.

Hi Heikki,

There's quite a bit of work left to do, but I wanted to check whether the
attached patch (0002, based on top of Aleksander's 0001 from upthread) is
going in the direction you were thinking. This patch pushes down any
forced-null and not-null Vars as ScanKeys. It doesn't remove the redundant
quals after turning them into ScanKeys, so it's needlessly inefficient, but
there's still a decent speedup for some of the basic benchmarks in 0003.

Plans look something like this:

    # EXPLAIN SELECT * FROM t WHERE i IS NULL;
                             QUERY PLAN
    ------------------------------------------------------------
     Seq Scan on t  (cost=0.00..1393.00 rows=49530 width=4)
       Scan Cond: (i IS NULL)
       Filter: (i IS NULL)
    (3 rows)

    # EXPLAIN SELECT * FROM t WHERE i = 3;
                           QUERY PLAN
    --------------------------------------------------------
     Seq Scan on t  (cost=0.00..1643.00 rows=1 width=4)
       Scan Cond: (i IS NOT NULL)
       Filter: (i = 3)
    (3 rows)

The non-nullable case worries me a bit because so many things imply IS NOT
NULL. I think I need to do some sort of cost analysis using the null_frac
statistics -- it probably only makes sense to push an implicit
SK_SEARCHNOTNULL down to the AM layer if some fraction of rows would actually
be filtered out -- but I'm not really sure how to choose a threshold.

It would also be neat if COUNT(col) could push down SK_SEARCHNOTNULL, but I
think that would require a new support function to rewrite the plan for an
aggregate.

Am I on the right track?

Thanks,
--Jacob
Attachments
On 7/19/23 16:44, Jacob Champion wrote:
> This patch pushes down any forced-null and not-null Vars as ScanKeys.
> It doesn't remove the redundant quals after turning them into ScanKeys,
> so it's needlessly inefficient, but there's still a decent speedup for
> some of the basic benchmarks in 0003.
>
> Plans look something like this:
>
> # EXPLAIN SELECT * FROM t WHERE i IS NULL;
>                          QUERY PLAN
> ------------------------------------------------------------
>  Seq Scan on t  (cost=0.00..1393.00 rows=49530 width=4)
>    Scan Cond: (i IS NULL)
>    Filter: (i IS NULL)
> (3 rows)

Redundant clauses are now filtered out in v3, so the new plans look more like
what you'd expect:

    =# EXPLAIN SELECT * FROM table1 WHERE a IS NOT NULL AND b = 2;
                           QUERY PLAN
    ---------------------------------------------------------
     Seq Scan on table1  (cost=0.00..3344.00 rows=1 width=4)
       Scan Cond: (a IS NOT NULL)
       Filter: (b = 2)
    (3 rows)

> The non-nullable case worries me a bit because so many things imply IS
> NOT NULL. I think I need to do some sort of cost analysis using the
> null_frac statistics -- it probably only makes sense to push an
> implicit SK_SEARCHNOTNULL down to the AM layer if some fraction of
> rows would actually be filtered out -- but I'm not really sure how to
> choose a threshold.

v3 also uses the null_frac statistic and the expected cost of the filter
clauses to decide whether to promote a derived IS NOT NULL condition to a
ScanKey. (Explicit IS [NOT] NULL clauses are always promoted.) I'm still not
sure how to fine-tune the expected cost of the ScanKey, but the number I've
chosen for now (0.1 * cpu_operator_cost) doesn't seem to regress my
benchmarks, for whatever that's worth.

I recorded several of the changes to the regression EXPLAIN output, but I've
left a few broken because I'm not sure whether they're useful or whether I
should just disable the optimization. For example:

    explain (analyze, costs off, summary off, timing off)
      select * from list_part where a = list_part_fn(1) + a;
                              QUERY PLAN
    ------------------------------------------------------------------
     Append (actual rows=0 loops=1)
       ->  Seq Scan on list_part1 list_part_1 (actual rows=0 loops=1)
   +         Scan Cond: (a IS NOT NULL)
             Filter: (a = (list_part_fn(1) + a))
             Rows Removed by Filter: 1
       ->  Seq Scan on list_part2 list_part_2 (actual rows=0 loops=1)
   +         Scan Cond: (a IS NOT NULL)
             Filter: (a = (list_part_fn(1) + a))
             Rows Removed by Filter: 1
       ->  Seq Scan on list_part3 list_part_3 (actual rows=0 loops=1)
   +         Scan Cond: (a IS NOT NULL)
             Filter: (a = (list_part_fn(1) + a))
             Rows Removed by Filter: 1
       ->  Seq Scan on list_part4 list_part_4 (actual rows=0 loops=1)
   +         Scan Cond: (a IS NOT NULL)
             Filter: (a = (list_part_fn(1) + a))
             Rows Removed by Filter: 1

These new conditions are due to a lack of statistics for the tiny partitions;
the filters are considered expensive enough that the savings against a
DEFAULT_UNK_SEL proportion of NULLs would hypothetically be worth it. Best I
can tell, this is a non-issue, since autovacuum will fix the problem by the
time the table grows to the point where the pointless ScanKey would cause a
slowdown. But it sure _looks_ like a mistake, especially since these
particular partitions can't even contain NULL. Do I need to do something
about this short-lived case?

There's still other work to do -- for instance, I think my modifications to
the filter clauses have probably broken some isolation levels until I fix up
SeqRecheck(), and better benchmarks would be good -- but I think this is
ready for early CF feedback.

Thanks,
--Jacob