Обсуждение: BUG #16176: NULL value returned by category_sql argument to crosstab() causes segmentation fault

Поиск
Список
Период
Сортировка

BUG #16176: NULL value returned by category_sql argument to crosstab() causes segmentation fault

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      16176
Logged by:          Ireneusz Pluta
Email address:      ipluta@wp.pl
PostgreSQL version: 12.1
Operating system:   FreeBSD 12.1
Description:

The testcase:

    $ psql -Xa service=test < testcase.sql 
    \pset null <NULL>
    Null display is "<NULL>".
    select version();

version
  


------------------------------------------------------------------------------------------------------------------------------------------------------
     PostgreSQL 12.1 on x86_64-unknown-freebsd12.1, compiled by FreeBSD
clang version 8.0.1 (tags/RELEASE_801/final 366581) (based on LLVM 8.0.1),
64-bit
    (1 row)
    ​
    create extension if not exists tablefunc;
    NOTICE:  extension "tablefunc" already exists, skipping
    CREATE EXTENSION
    -- this will not fail or crash:
    select * from crosstab ('values (1, 2, 12), (1, 3, 13), (2, 2, 22), (2,
3, 23), (3, 2, 32), (3, 3, 33) order by 1, 2, 3', 'values (2), (3)') as rr
(a int, "2" int, "3" int);
     a | 2  | 3  
    ---+----+----
     1 | 12 | 13
     2 | 22 | 23
     3 | 32 | 33
    (3 rows)
    ​
    -- this will crash pg12, but not version() < 12 - note NULL value in
categories query:
    select * from crosstab ('values (1, 2, 12), (1, 3, 13), (2, 2, 22), (2,
3, 23), (3, 2, 32), (3, 3, 33) order by 1, 2, 3', 'values (2), (null)') as
rr (a int, "2" int, "3" int);
    server closed the connection unexpectedly
            This probably means the server terminated abnormally
            before or while processing the request.
    connection to server was lost

Backtrace:


    Attaching to process 64160
    Reading symbols from /usr/local/pgsql121/bin/postgres...
    Reading symbols from /usr/local/lib/libxml2.so.2...
    (No debugging symbols found in /usr/local/lib/libxml2.so.2)
    Reading symbols from /usr/lib/libssl.so.111...
    Reading symbols from /usr/lib/debug//usr/lib/libssl.so.111.debug...
    Reading symbols from /lib/libcrypto.so.111...
    Reading symbols from /usr/lib/debug//lib/libcrypto.so.111.debug...
    Reading symbols from /lib/libm.so.5...
    Reading symbols from /usr/lib/debug//lib/libm.so.5.debug...
    Reading symbols from /lib/libc.so.7...
    Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...
    Reading symbols from /lib/libz.so.6...
    Reading symbols from /usr/lib/debug//lib/libz.so.6.debug...
    Reading symbols from /usr/lib/liblzma.so.5...
    Reading symbols from /usr/lib/debug//usr/lib/liblzma.so.5.debug...
    Reading symbols from /lib/libthr.so.3...
    Reading symbols from /usr/lib/debug//lib/libthr.so.3.debug...
    Reading symbols from /usr/local/pgsql121/lib/pg_stat_statements.so...
    Reading symbols from /libexec/ld-elf.so.1...
    Reading symbols from /usr/lib/debug//libexec/ld-elf.so.1.debug...
    [Switching to LWP 103151 of process 64160]
    _poll () at _poll.S:3
    3       _poll.S: No such file or directory.
    (gdb) set pagination off
    (gdb) set logging file crosstab.txt
    (gdb) set logging on 
    Copying output to crosstab.txt.
    (gdb) cont
    Continuing.
    ​
    Program received signal SIGSEGV, Segmentation fault.
    strlen (str=0x0) at /usr/src/lib/libc/string/strlen.c:101
    warning: Source file is more recent than executable.
    101             va = (*lp - mask01);
    (gdb) bt
    #0  strlen (str=0x0) at /usr/src/lib/libc/string/strlen.c:101
    #1  0x0000000000932f2c in dopr (target=0x7fffffffd5a0,
format=0x9115975fc "%s", args=0x7fffffffd580) at snprintf.c:443
    #2  0x00000000009355f1 in pg_vsnprintf (str=<optimized out>,
count=<optimized out>, fmt=0x9115975fc "%s", args=0x3000000020) at
snprintf.c:195
    #3  pg_snprintf (str=<optimized out>, count=<optimized out>,
fmt=0x9115975fc "%s") at snprintf.c:208
    #4  0x0000000911598bca in load_categories_hash (cats_sql=<optimized
out>, per_query_ctx=<optimized out>) at tablefunc.c:774
    #5  crosstab_hash (fcinfo=<optimized out>) at tablefunc.c:677
    #6  0x00000000006603d7 in ExecMakeTableFunctionResult
(setexpr=<optimized out>, econtext=0x91151e460, argContext=<optimized out>,
expectedDesc=0x91151f020, randomAccess=<optimized out>) at execSRF.c:233
    #7  0x000000000066f795 in FunctionNext (node=0x91151e350) at
nodeFunctionscan.c:95
    #8  0x000000000065834d in ExecProcNode (node=<optimized out>) at
../../../src/include/executor/executor.h:239
    #9  ExecutePlan (estate=<optimized out>, planstate=0x91151e350,
operation=<optimized out>, numberTuples=<optimized out>,
direction=<optimized out>, dest=0x911591ae8, use_parallel_mode=<optimized
out>, sendTuples=<optimized out>, execute_once=<optimized out>) at
execMain.c:1646
    #10 standard_ExecutorRun (queryDesc=0x801aa1910, direction=<optimized
out>, count=0, execute_once=<optimized out>) at execMain.c:364
    #11 0x0000000801aea9e0 in pgss_ExecutorRun (queryDesc=0x801aa1910,
direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at
pg_stat_statements.c:893
    #12 0x00000000007bf8f8 in PortalRunSelect (portal=0x911441110,
forward=<optimized out>, count=0, dest=<optimized out>) at pquery.c:929
    #13 0x00000000007bf52d in PortalRun (portal=0x911441110,
count=9223372036854775807, isTopLevel=<optimized out>, run_once=<optimized
out>, dest=0x911591ae8, altdest=0x911591ae8, completionTag=0x7fffffffdb30
"") at pquery.c:770
    #14 0x00000000007be607 in exec_simple_query (query_string=0x8013dc110
"select * from crosstab ('values (1, 2, 12), (1, 3, 13), (2, 2, 22), (2, 3,
23), (3, 2, 32), (3, 3, 33) order by 1, 2, 3', 'values (2), (null)') as rr
(a int, \"2\" int, \"3\" int);") at postgres.c:1215
    #15 0x00000000007bc2fa in PostgresMain (argc=<optimized out>,
argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at
postgres.c:4232
    #16 0x00000000007381a2 in BackendRun (port=0x91140c000) at
postmaster.c:4437
    #17 0x00000000007378ba in BackendStartup (port=0x91140c000) at
postmaster.c:4128
    #18 ServerLoop () at postmaster.c:1704
    #19 0x0000000000734b42 in PostmasterMain (argc=3, argv=0x7fffffffe8b0)
at postmaster.c:1377
    #20 0x00000000006a57ee in main (argc=3, argv=0x7fffffffe8b0) at
main.c:228
    (gdb)


PG Bug reporting form <noreply@postgresql.org> writes:
>     -- this will crash pg12, but not version() < 12 - note NULL value in
> categories query:
>     select * from crosstab ('values (1, 2, 12), (1, 3, 13), (2, 2, 22), (2,
> 3, 23), (3, 2, 32), (3, 3, 33) order by 1, 2, 3', 'values (2), (null)') as
> rr (a int, "2" int, "3" int);

Yeah, duplicated here.  It seems tablefunc is passing a NULL string
pointer to sprintf().  Previously, that would work on some platforms
and crash on others --- as of v12, it crashes everywhere.

What we need here is to figure out what the hashtab string key ought
to be for a NULL category.  I suspect what was happening before
(on non-crashing platforms) was that you implicitly got "(nil)"
or some spelling like that, which is not great because it could
conflict with a valid user key.  Alternatively, we could refuse
NULL category values ... not sure if that's desirable.

Anyway, this is *not* a new bug, it's just possible to hit it on
more platforms now.

            regards, tom lane



Re: BUG #16176: NULL value returned by category_sql argument tocrosstab() causes segmentation fault

От
Joe Conway
Дата:
On 12/20/19 1:00 PM, Tom Lane wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
>>     -- this will crash pg12, but not version() < 12 - note NULL value in
>> categories query:
>>     select * from crosstab ('values (1, 2, 12), (1, 3, 13), (2, 2, 22), (2,
>> 3, 23), (3, 2, 32), (3, 3, 33) order by 1, 2, 3', 'values (2), (null)') as
>> rr (a int, "2" int, "3" int);
>
> Yeah, duplicated here.  It seems tablefunc is passing a NULL string
> pointer to sprintf().  Previously, that would work on some platforms
> and crash on others --- as of v12, it crashes everywhere.
>
> What we need here is to figure out what the hashtab string key ought
> to be for a NULL category.  I suspect what was happening before
> (on non-crashing platforms) was that you implicitly got "(nil)"
> or some spelling like that, which is not great because it could
> conflict with a valid user key.  Alternatively, we could refuse
> NULL category values ... not sure if that's desirable.
>
> Anyway, this is *not* a new bug, it's just possible to hit it on
> more platforms now.

Thanks for the heads up -- I'll take a look as quick as I can, but
likely tomorrow.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Вложения

Re: BUG #16176: NULL value returned by category_sql argument tocrosstab() causes segmentation fault

От
Joe Conway
Дата:
On 12/20/19 1:00 PM, Tom Lane wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
>>     -- this will crash pg12, but not version() < 12 - note NULL value in
>> categories query:
>>     select * from crosstab ('values (1, 2, 12), (1, 3, 13), (2, 2, 22), (2,
>> 3, 23), (3, 2, 32), (3, 3, 33) order by 1, 2, 3', 'values (2), (null)') as
>> rr (a int, "2" int, "3" int);
>
> Yeah, duplicated here.  It seems tablefunc is passing a NULL string
> pointer to sprintf().  Previously, that would work on some platforms
> and crash on others --- as of v12, it crashes everywhere.
>
> What we need here is to figure out what the hashtab string key ought
> to be for a NULL category.  I suspect what was happening before
> (on non-crashing platforms) was that you implicitly got "(nil)"
> or some spelling like that, which is not great because it could
> conflict with a valid user key.  Alternatively, we could refuse
> NULL category values ... not sure if that's desirable.
>
> Anyway, this is *not* a new bug, it's just possible to hit it on
> more platforms now.

It appears that in pg11 (and presumably prior) when snprintf() is called
it is resolved (here at least )to __GI___snprintf() which comes directly
from libc. On my desktop machine the system snprintf() deals with a null
pointer argument without crashing. I guess this is why the crash was
platform dependent.

In pg12 (and presumably master), it is resolved to our own port function
pg_snprintf(), which in turn works its way to dopr(), where strlen() is
called on a null pointer and "<boom>". Undoubtedly strlen() is a bit
more consistent than snprintf() across platforms in this regard.

From what I can see, even on pg11 and prior, having a null category
never did anything useful. And in the 16 years or so since this has been
around, no one in my memory ever asked for that functionality, so I am
inclined to refuse NULL category values unless someone wants to make a
good case otherwise.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Вложения
Joe Conway <mail@joeconway.com> writes:
> It appears that in pg11 (and presumably prior) when snprintf() is called
> it is resolved (here at least )to __GI___snprintf() which comes directly
> from libc. On my desktop machine the system snprintf() deals with a null
> pointer argument without crashing. I guess this is why the crash was
> platform dependent.

Right, glibc's version of snprintf has produced "(nil)" or "(null)"
or something like that for many years.  I'm not sure if that's true
among the BSDen.  One place where the platform snprintf does *not*
survive this case is Windows.

> In pg12 (and presumably master), it is resolved to our own port function
> pg_snprintf(), which in turn works its way to dopr(), where strlen() is
> called on a null pointer and "<boom>".

Right.  While it would only take a couple more lines of code to act like
glibc does, we intentionally adopted the stricter definition because it
seemed more likely to expose bugs.  Looks like it just did.

> From what I can see, even on pg11 and prior, having a null category
> never did anything useful. And in the 16 years or so since this has been
> around, no one in my memory ever asked for that functionality, so I am
> inclined to refuse NULL category values unless someone wants to make a
> good case otherwise.

WFM, but I've never used crosstab() much so I don't have a good feeling
for significant use-cases.

            regards, tom lane



Re: BUG #16176: NULL value returned by category_sql argument tocrosstab() causes segmentation fault

От
Joe Conway
Дата:
On 12/21/19 10:08 AM, Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>> It appears that in pg11 (and presumably prior) when snprintf() is called
>> it is resolved (here at least )to __GI___snprintf() which comes directly
>> from libc. On my desktop machine the system snprintf() deals with a null
>> pointer argument without crashing. I guess this is why the crash was
>> platform dependent.
>
> Right, glibc's version of snprintf has produced "(nil)" or "(null)"
> or something like that for many years.  I'm not sure if that's true
> among the BSDen.  One place where the platform snprintf does *not*
> survive this case is Windows.
>
>> In pg12 (and presumably master), it is resolved to our own port function
>> pg_snprintf(), which in turn works its way to dopr(), where strlen() is
>> called on a null pointer and "<boom>".
>
> Right.  While it would only take a couple more lines of code to act like
> glibc does, we intentionally adopted the stricter definition because it
> seemed more likely to expose bugs.  Looks like it just did.
>
>> From what I can see, even on pg11 and prior, having a null category
>> never did anything useful. And in the 16 years or so since this has been
>> around, no one in my memory ever asked for that functionality, so I am
>> inclined to refuse NULL category values unless someone wants to make a
>> good case otherwise.
>
> WFM, but I've never used crosstab() much so I don't have a good feeling
> for significant use-cases.

Pushed that way to all supported branches.

If someone arrives with a compelling use-case, the changes would likely
not be something we would want to back patch, so we can address it
if/when that happens.

Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Вложения
Joe Conway <mail@joeconway.com> writes:
> Pushed that way to all supported branches.

Hmm, why'd you use ERRCODE_SYNTAX_ERROR, and not say
ERRCODE_NULL_VALUE_NOT_ALLOWED?

            regards, tom lane



Re: BUG #16176: NULL value returned by category_sql argument tocrosstab() causes segmentation fault

От
Joe Conway
Дата:
On 12/23/19 1:49 PM, Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>> Pushed that way to all supported branches.
>
> Hmm, why'd you use ERRCODE_SYNTAX_ERROR, and not say
> ERRCODE_NULL_VALUE_NOT_ALLOWED?

To be consistent with the error just above:
--------------------
/*
 * The provided categories SQL query must always return one column:
 * category - the label or identifier for each column
 */
if (spi_tupdesc->natts != 1)
    ereport(ERROR,
    (errcode(ERRCODE_SYNTAX_ERROR),
     errmsg("provided \"categories\" SQL must " \
    "return 1 column of at least one row")));
--------------------

Also, the argument itself is a SQL statement, and it isn't NULL, it just
produces a NULL value as one (or more) of its output rows. It seems like
ERRCODE_NULL_VALUE_NOT_ALLOWED might be confusing.

But I am not married to ERRCODE_SYNTAX_ERROR if you think it ought to be
changed.

Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Вложения
Joe Conway <mail@joeconway.com> writes:
> On 12/23/19 1:49 PM, Tom Lane wrote:
>> Hmm, why'd you use ERRCODE_SYNTAX_ERROR, and not say
>> ERRCODE_NULL_VALUE_NOT_ALLOWED?

> To be consistent with the error just above:
> ...
> But I am not married to ERRCODE_SYNTAX_ERROR if you think it ought to be
> changed.

Meh.  (a) ERRCODE_SYNTAX_ERROR is awfully generic, and thereby not very
helpful; (b) it seems hard to me to paint the problem here as being any
form of "syntax" error.

And for that matter, "syntax error" is a crappy classification of the
error just above, too.  IMO it ought to fall under "data exception".
We don't seem to have an errcode for "wrong number of columns",
but maybe we should invent one --- I think the same issue arises in
other places.  Or we could just use generic ERRCODE_DATA_EXCEPTION.

            regards, tom lane