Re: [HACKERS] SIGSEGV in BRIN autosummarize

From: Tomas Vondra
Subject: Re: [HACKERS] SIGSEGV in BRIN autosummarize
Msg-id: efefda33-5fd9-0a77-6ae5-ca21dbd163aa@2ndquadrant.com
In response to: Re: [HACKERS] SIGSEGV in BRIN autosummarize  (Justin Pryzby <pryzby@telsasoft.com>)
Responses: Re: [HACKERS] SIGSEGV in BRIN autosummarize  (Justin Pryzby <pryzby@telsasoft.com>)
List: pgsql-hackers

Hi,

On 10/15/2017 03:56 AM, Justin Pryzby wrote:
> On Fri, Oct 13, 2017 at 10:57:32PM -0500, Justin Pryzby wrote:
...
>> It's a bit difficult to guess what went wrong from this backtrace. For
>> me gdb typically prints a bunch of lines immediately before the frames,
>> explaining what went wrong - not sure why it's missing here.
> 
> Do you mean this ?
> 
> ...
> Loaded symbols for /lib64/libnss_files-2.12.so
> Core was generated by `postgres: autovacuum worker process   gtt             '.
> Program terminated with signal 11, Segmentation fault.
> #0  pfree (pointer=0x298c740) at mcxt.c:954
> 954             (*context->methods->free_p) (context, pointer);
> 

Yes. So one of 'context', 'context->methods', or
'context->methods->free_p' is bogus.

>> Perhaps some of those pointers are bogus, the memory was already pfree-d
>> or something like that. You'll have to poke around and try dereferencing
>> the pointers to find what works and what does not.
>>
>> For example what do these gdb commands do in the #0 frame?
>>
>> (gdb) p *(MemoryContext)context
> 
> (gdb) p *(MemoryContext)context
> Cannot access memory at address 0x7474617261763a20
> 
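
As a side check, not part of the original exchange, the address gdb rejected can be inspected as raw bytes. The sketch below assumes the core came from a little-endian x86-64 machine (the /lib64 path suggests it, but that is an assumption) and decodes the eight bytes of the pointer as ASCII.

```python
# The address gdb refused to dereference, copied from the session above.
bogus_context = 0x7474617261763a20

# Assumption: little-endian x86-64, so the lowest-order byte of the
# pointer is the first byte in memory.
raw = bogus_context.to_bytes(8, byteorder="little")

# Decode only if every byte is printable ASCII; otherwise leave it as None.
looks_like_text = all(0x20 <= b < 0x7F for b in raw)
decoded = raw.decode("ascii") if looks_like_text else None

print(raw.hex())      # 203a766172617474
print(repr(decoded))  # ' :varatt'
```

All eight bytes are printable and read " :varatt", which would be consistent with text having been written where the context pointer used to be. It resembles the start of a field label such as ":varattno" in PostgreSQL's node text output, but that is only a guess and says nothing about which code did the writing.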

OK, this means the memory context pointer (tracked in the header of a
chunk) is bogus. There are several common ways that can happen:

* Something corrupts memory (typically out-of-bounds write).

* The chunk was allocated in the wrong memory context (which was then
released, and the memory was reused for new stuff).

* It's a use-after-free.

* ... various other possibilities ...
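
To make the first scenario concrete, here is a toy model of an out-of-bounds write, offered as a sketch under stated assumptions rather than a reconstruction of what happened in this crash. It assumes each chunk is preceded by an 8-byte header holding only the context pointer, lays two chunks side by side in a bytearray, and writes 24 bytes into a 16-byte payload. The layout, the helper names, the address 0x1000, and the overflowing string are all invented for illustration; the string was chosen only so that the clobbered header matches the value gdb printed.

```python
import struct

HEADER_SIZE = 8  # assumed: one 64-bit context pointer stored before each chunk


def alloc_two_chunks(context_addr, payload_size):
    """Lay out two adjacent chunks, [header|payload][header|payload]."""
    arena = bytearray()
    for _ in range(2):
        arena += struct.pack("<Q", context_addr)  # little-endian header
        arena += bytes(payload_size)              # zero-filled payload
    return arena


def header_of(arena, chunk_index, payload_size):
    """Read back the context pointer stored in front of the given chunk."""
    offset = chunk_index * (HEADER_SIZE + payload_size)
    (context_addr,) = struct.unpack_from("<Q", arena, offset)
    return context_addr


def write_unchecked(arena, chunk_index, payload_size, data):
    """Copy data into a chunk's payload with no bounds check, as buggy C might."""
    start = chunk_index * (HEADER_SIZE + payload_size) + HEADER_SIZE
    arena[start:start + len(data)] = data


PAYLOAD = 16
VALID_CONTEXT = 0x1000  # made-up address standing in for a live context

arena = alloc_two_chunks(VALID_CONTEXT, PAYLOAD)

# 24 bytes go into a 16-byte payload: the last 8 land on chunk 1's header.
write_unchecked(arena, 0, PAYLOAD, b"{VAR :varno 1234 :varatt")

print(hex(header_of(arena, 0, PAYLOAD)))  # 0x1000, chunk 0's own header is intact
print(hex(header_of(arena, 1, PAYLOAD)))  # 0x7474617261763a20
```

The write itself succeeds silently; the damage only shows up later, when something like pfree reads the neighbouring chunk's header and follows the pointer. That delay is why the crash site alone does not identify the culprit.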

> 
> I uploaded the corefile:
> http://telsasoft.com/tmp/coredump-postgres-autovacuum-brin-summarize.gz
> 

Thanks, but I'm not sure that'll help at this point. We already know
what happened (corrupted memory); what we don't know is how. And core
files are mostly just "snapshots", so they are not very useful in
answering that :-(

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


