Обсуждение: Re: [PATCH] Check for TupleTableSlot nullness before dereferencing
> On 3 Oct 2024, at 09:47, Alexander Kuznetsov <kuznetsovam@altlinux.org> wrote: > > Hello everyone, > > I'd like to propose adding check for nullness of > TupleTableSlot before dereferencing it in /src/backend/executor/nodeAgg.c > > It is done in the same manner other TupleTableSlots are checked, > but was probably left unseen because slot1 and slot2 variables > can be swapped during function execution. From a quick reading we can only reach there after evaluating an expression, so can it really be null though? This code hasn't changed all that much since 2009, if there was a reachable segfault on a null pointer deref I have a feeling we'd heard about it by now so some extra care seems warranted to ensure it's not a static analyzer false positive. -- Daniel Gustafsson
03.10.2024 12:48, Daniel Gustafsson wrote: > From a quick reading we can only reach there after evaluating an expression, so > can it really be null though? This code hasn't changed all that much since > 2009, if there was a reachable segfault on a null pointer deref I have a > feeling we'd heard about it by now so some extra care seems warranted to ensure > it's not a static analyzer false positive. Thanks for your response! It seems to me that dereferencing is possible under the following scenario: 1. slot2 is NULL at line 968, 2. The while loop at line 971 executes once, filling slot1 (slot2 still remains NULL), 3. No changes occur to slot2 thereafter, up to line 1003, 4. At line 1003, slot2 is swapped with slot1 (assuming numDistinctCols > 0), 5. At line 1016, slot1 is dereferenced with conent of slot2 (NULL). This entire reasoning is based on the assumption that slot2 can theoretically be NULL, as there is such a check at line 968. Is it possible that no errors have occurred because this condition has always been satisfied and is, perhaps, redundant,or maybe I'm misunderstanding something? -- Best regards, Alexander Kuznetsov
Hello, ping. What do you think about reasoning below? Maybe we should consider proposing different patch for removing redundant check there? 09.10.2024 18:23, Alexander Kuznetsov wrote: > 03.10.2024 12:48, Daniel Gustafsson wrote: >> From a quick reading we can only reach there after evaluating an expression, so >> can it really be null though? This code hasn't changed all that much since >> 2009, if there was a reachable segfault on a null pointer deref I have a >> feeling we'd heard about it by now so some extra care seems warranted to ensure >> it's not a static analyzer false positive. > Thanks for your response! > It seems to me that dereferencing is possible under the following scenario: > [...] > This entire reasoning is based on the assumption that slot2 can theoretically be NULL, as there is such a check at line968. > Is it possible that no errors have occurred because this condition has always been satisfied and is, perhaps, redundant,or maybe I'm misunderstanding something? -- Best regards, Alexander Kuznetsov
В письме от пятница, 13 декабря 2024 г. 11:54:35 MSK пользователь Alexander Kuznetsov написал: > Hello, > > ping. What do you think about reasoning below? Maybe we should consider > proposing different patch for removing redundant check there? Hi! Please, pay attention that commitfest entry for this patch https://commitfest.postgresql.org/patch/5662/ reports problems with windows build. There is a small chance that this is flase alarm, sometimes checkers fails for their own reason. But most probably this is persistent error, and if it is, this problem should be researched first of all, and fixed. Only after that there there can be any discussion if this null-related problem should be fixed or not. > > 09.10.2024 18:23, Alexander Kuznetsov wrote: > > 03.10.2024 12:48, Daniel Gustafsson wrote: > >> From a quick reading we can only reach there after evaluating an > >> expression, so can it really be null though? This code hasn't changed > >> all that much since 2009, if there was a reachable segfault on a null > >> pointer deref I have a feeling we'd heard about it by now so some extra > >> care seems warranted to ensure it's not a static analyzer false > >> positive. > > > > Thanks for your response! > > It seems to me that dereferencing is possible under the following > > scenario: > > [...] > > This entire reasoning is based on the assumption that slot2 can > > theoretically be NULL, as there is such a check at line 968. Is it > > possible that no errors have occurred because this condition has always > > been satisfied and is, perhaps, redundant, or maybe I'm misunderstanding > > something? -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Вложения
On Fri, Dec 13, 2024 at 12:54 AM Alexander Kuznetsov <kuznetsovam@altlinux.org> wrote: > ping. What do you think about reasoning below? Maybe we should consider > proposing different patch for removing redundant check there? To move this forward a bit, your reasoning: > 1. slot2 is NULL at line 968, > 2. The while loop at line 971 executes once, filling slot1 (slot2 still remains NULL), > 3. No changes occur to slot2 thereafter, up to line 1003, > 4. At line 1003, slot2 is swapped with slot1 (assuming numDistinctCols > 0), > 5. At line 1016, slot1 is dereferenced with conent of slot2 (NULL). assumes that (numDistinctCols > 0) can be true at the same time that (slot2 == NULL), but it's not clear to me that this can happen in practice. See build_pertrans_for_aggref(), where these are assigned. (A test case producing a NULL dereference would be the fastest way to prove that it can happen; are you able to construct one? groupingsets.sql, in the regression tests, hits this code and could provide a starting point.) Thanks, --Jacob
On Tue, 22 Jul 2025 at 10:20, Jacob Champion <jacob.champion@enterprisedb.com> wrote: > > On Fri, Dec 13, 2024 at 12:54 AM Alexander Kuznetsov > > 1. slot2 is NULL at line 968, > > 2. The while loop at line 971 executes once, filling slot1 (slot2 still remains NULL), > > 3. No changes occur to slot2 thereafter, up to line 1003, > > 4. At line 1003, slot2 is swapped with slot1 (assuming numDistinctCols > 0), > > 5. At line 1016, slot1 is dereferenced with conent of slot2 (NULL). > > assumes that (numDistinctCols > 0) can be true at the same time that > (slot2 == NULL), but it's not clear to me that this can happen in > practice. See build_pertrans_for_aggref(), where these are assigned. I also can't see how this could happen. uniqslot is created in build_pertrans_for_aggref() when "numDistinctCols > 0", and the swap of slot1 and slot2 also only occurs when "numDistinctCols > 0". i.e. the uniqslot will always be allocated when the aggregate has DISTINCT (unless it's a presorted aggregate, in which case process_ordered_aggregate_multi() isn't used). If numDistinctCols > 0 then so is numSortCols > 0 per "numSortCols = numDistinctCols = list_length(sortlist);" in build_pertrans_for_aggref(). The code in this area gets a bit of exercise with: create table ab(a text, b text); insert into ab values('1','1'),('1','1'); set enable_presorted_aggregate=0; select string_agg(distinct a,b) from ab; David