Thread: Segfault when running postgres inside kubernetes with huge pages

Segfault when running postgres inside kubernetes with huge pages

From
Siegfried Kiermayer
Date:
Hi,

We are using the Zalando Postgres Operator, and I changed the huge pages
setting on our Kubernetes nodes to 1536. I am not certain what the value
was before: I am fairly sure that, before changing it to 1536, I saw an
initial value of 1024 with 670 in use.

Postgres stopped working after I set it to 1536 and restarted the
node. I was scratching my head as to why, because I had seen huge pages
in use before and did not connect the two at all.
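
[Editor's sketch] On Linux, the node-level huge page counters mentioned above are exposed in /proc/meminfo (HugePages_Total, HugePages_Free, and so on). The following is a minimal sketch of reading them, not anything taken from Postgres or the operator; the function names meminfo_value and read_meminfo are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the number on the "Key:   value" line of /proc/meminfo-style
 * text, or -1 if no line starts with that key. Hypothetical helper.
 */
long meminfo_value(const char *text, const char *key)
{
    assert(text != NULL && key != NULL);

    size_t keylen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            long value;
            /* %ld skips the padding spaces after the colon */
            if (sscanf(line + keylen + 1, "%ld", &value) == 1)
                return value;
            return -1;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++;             /* step past the newline */
    }
    return -1;
}

/* Read at most bufsize-1 bytes of /proc/meminfo; returns 0 on success. */
int read_meminfo(char *buf, size_t bufsize)
{
    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, bufsize - 1, f);
    buf[n] = '\0';
    fclose(f);
    return 0;
}
```

If "in use" is taken to mean HugePages_Total minus HugePages_Free, the reported 1024 with 670 in use would correspond to a HugePages_Free value of 354.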

I found core dumps and this is the output:


Core was generated by `/usr/lib/postgresql/14/bin/postgres -D
/home/postgres/pgdata/pgroot/data --conf'.
Program terminated with signal SIGBUS, Bus error.

warning: Section `.reg-xstate/999' in core file too small.
#0  0x0000558ea5345148 in PGSharedMemoryCreate ()
(gdb) bt
#0  0x0000558ea5345148 in PGSharedMemoryCreate ()
#1  0x0000558ea53c157f in CreateSharedMemoryAndSemaphores ()
#2  0x0000558ea5357240 in PostmasterMain ()
#3  0x0000558ea506777a in main ()


This gave me the first indication that it is related to the huge pages
setting on the node itself.

I would go into more detail, but honestly I believe this might be easy
to find, and I also assume it shouldn't segfault but should instead
return an error message indicating the issue.

I'm aware that huge pages and other ordinary features like swap are not
the norm inside Kubernetes, but FYI, Kubernetes 1.28 will have huge
pages support: https://kubernetes.io/docs/tasks/manage-hugepages/scheduling-hugepages/

Thanks,

Sigi



Re: Segfault when running postgres inside kubernetes with huge pages

From
Tom Lane
Date:
Siegfried Kiermayer <sicaine@gmail.com> writes:
> we are using zalando postgres operator and i changed / set huge pages
> on kubernetes nodes from something undefined to 1536 (undefined
> because i was pretty sure before changing it to 1536 i saw an initial
> value of 1024 with 670 in use.

This previous discussion might hold the clue:

https://www.postgresql.org/message-id/CAFpoUr1ggmGs8qpoKvYxNBO3h-T-n%2BMNh%2BJnLRYsYhHurVOiGQ%40mail.gmail.com

> I would go into more detail but honestly I believe this might be easy
> to find and I also assume it shouldn't segfault but return an error
> message indicating the / a issue.

There is not that much we can do about operating system bugs.

            regards, tom lane



Re: Segfault when running postgres inside kubernetes with huge pages

From
Siegfried Kiermayer
Date:
Hi,

We run kernel 5.8, and the allocation happens basically at startup.

I would still expect Postgres to fail gracefully at this point.

Is 'throwing an error message' / checking the allocation a performance
issue? Is it in a generic hot path for allocation?

Tx,

Sigi


On Wed, 8 Nov 2023 at 15:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Siegfried Kiermayer <sicaine@gmail.com> writes:
> > we are using zalando postgres operator and i changed / set huge pages
> > on kubernetes nodes from something undefined to 1536 (undefined
> > because i was pretty sure before changing it to 1536 i saw an initial
> > value of 1024 with 670 in use.
>
> This previous discussion might hold the clue:
>
> https://www.postgresql.org/message-id/CAFpoUr1ggmGs8qpoKvYxNBO3h-T-n%2BMNh%2BJnLRYsYhHurVOiGQ%40mail.gmail.com
>
> > I would go into more detail but honestly I believe this might be easy
> > to find and I also assume it shouldn't segfault but return an error
> > message indicating the / a issue.
>
> There is not that much we can do about operating system bugs.
>
>                         regards, tom lane



Re: Segfault when running postgres inside kubernetes with huge pages

From
Andres Freund
Date:
Hi,

On 2023-11-08 16:03:53 +0100, Siegfried Kiermayer wrote:
> we do run kernel 5.8 and the allocation happens basically at start.
> 
> I would still expect postgres to fail gracefully at this point?
> 
> Is 'throwing an error message' / checking the allocation a performance
> issue? is it in a generic hotpath for allocation?

It's not like we're ignoring an error and just continuing - we're successfully
allocating the memory. Then the kernel sends SIGBUS when accessing the freshly
allocated memory.
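
[Editor's sketch] To illustrate the point above, here is a minimal sketch of an allocation step that can report success without touching any memory. It is not the Postgres source; map_shared is a hypothetical name, and the huge page attempt assumes Linux's MAP_HUGETLB flag and a length that is a multiple of the huge page size. The mmap() call either fails with an error or returns a valid pointer; whether the pages behind that pointer can actually be faulted in is only discovered on first access.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `len` bytes of anonymous shared memory, preferring huge pages.
 * Sets *used_huge to 1 if the huge page mapping was granted, else 0.
 * Returns NULL if both attempts fail. Nothing is written to the
 * mapping here, so no page fault (and no SIGBUS) can occur yet.
 */
void *map_shared(size_t len, int *used_huge)
{
    void *p = MAP_FAILED;

    assert(len > 0 && used_huge != NULL);
    *used_huge = 0;

#ifdef MAP_HUGETLB
    /* First attempt: ask the kernel for huge pages. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        *used_huge = 1;
        return p;
    }
#endif

    /* Fallback: ordinary pages. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A caller that checks only the return value sees success in both cases; the failure described in this thread happens later, when the memory is first written.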

We could try to install a SIGBUS handler and error out that way. But doing
that correctly and portably is not exactly trivial.

Greetings,

Andres Freund