Re: fairywren failures

From: Andres Freund
Subject: Re: fairywren failures
Msg-id: 20191003161752.ylp3ppdry2onhiua@alap3.anarazel.de
In response to: Re: fairywren failures  (Andres Freund <andres@anarazel.de>)
Responses: Re: fairywren failures  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

Hi,

On 2019-10-03 08:23:49 -0700, Andres Freund wrote:
> On 2019-10-03 08:18:42 -0700, Andres Freund wrote:
> > This is around where an error is thrown:
> >  -- badly formatted interval
> >  INSERT INTO INTERVAL_TBL (f1) VALUES ('badly formatted interval');
> > -ERROR:  invalid input syntax for type interval: "badly formatted interval"
> > -LINE 1: INSERT INTO INTERVAL_TBL (f1) VALUES ('badly formatted inter...
> > -                                              ^
> >
> > and the error is stack related. So I suspect that setjmp/longjmp might
> > be to blame here, and somehow don't save/restore the stack into a proper
> > state. I don't know enough about mingw/msys/windows to know whether that
> > uses a self-written setjmp or relies on the MS implementation.
> >
> > If you could gather a backtrace it might help us. It's possible that the
> > stack is "just" misaligned or something, we had problems with that
> > before (IIRC valgrind didn't always align stacks correctly for processes
> > that forked from within a signal handler, which then crashed when using
> > instructions with alignment requirements, but only sometimes, because
> > the stack could be aligned).
>
> It seems we're not the only ones hitting this:
> https://rt.perl.org/Public/Bug/Display.html?id=133603
>
> Doesn't look like they've really narrowed it down that much yet.

A few notes:

* As an experiment, it could be worthwhile to redefine
  sigsetjmp/longjmp/sigjmp_buf in terms of what
  https://gcc.gnu.org/onlinedocs/gcc/Nonlocal-Gotos.html
  provides; that is apparently a separate implementation from the MS CRT
  one.

* Arguably
  "Do not use longjmp to transfer control from a callback routine
  invoked directly or indirectly by Windows code."
  and
  "Do not use longjmp to transfer control out of an interrupt-handling
  routine unless the interrupt is caused by a floating-point
  exception. In this case, a program may return from an interrupt
  handler via longjmp if it first reinitializes the floating-point math
  package by calling _fpreset."

  from https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/longjmp?view=vs-2019

  might be violated by our signal emulation on Windows. But I've
  not looked into that in detail.

* Any chance you could get the pre-processed source for postgres.c or
  such? I'm kinda wondering if the definition of setjmp() that we get
  includes the returns_twice attribute that gcc wants to see, and
  whether we're picking up the mingw version of longjmp, or the windows
  one.

  https://sourceforge.net/p/mingw-w64/mingw-w64/ci/844cb490ab2cc32ac3df5914700564b2e40739d8/tree/mingw-w64-headers/crt/setjmp.h#l31

* It's certainly curious that the failures so far have only happened as
  part of pg_upgradeCheck, rather than the plain regression tests.

Greetings,

Andres Freund


