Re: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump

From Greg Nancarrow
Subject Re: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
Date
Msg-id CAJcOf-cNLhA7iaUYAQqZ44tz3oHJoPxGRm1+tNE27iJXTXObzQ@mail.gmail.com
In reply to Re: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump  (Michael Paquier <michael@paquier.xyz>)
Replies Re: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump  (Pavel Borisov <pashkin.elfe@gmail.com>)
List pgsql-hackers
On Mon, May 24, 2021 at 2:50 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, May 24, 2021 at 12:04:37PM +1000, Greg Nancarrow wrote:
> > Keep cfbot happy, use the PG14 patch as latest.
>
> This stuff is usually very tricky.

Agreed. That's why I was looking for experts in this snapshot-handling
code, to look closer at this issue, check my proposed fix, come up
with a better solution etc.

> Do we have a way to reliably
> reproduce the report discussed here?

I couldn't reproduce it in my environment (though I could understand
what was going wrong, based on the description provided).
houzj (houzj.fnst@fujitsu.com) was able to reproduce it in his
environment and kindly provided me with the following information.
(He said that he followed most of the steps described by the original
problem reporter, Pengcheng, but that steps 2 and 7 are perhaps a little
different from Pengcheng's. See the emails higher in the thread for the
two scripts "init_test.sql" and "sub_120.sql".)

===

1, Change NUM_SUBTRANS_BUFFERS from 32 to 128 in the file
"src/include/access/subtrans.h" (line 15).
2, Configure with assertions enabled and build it (./configure
--enable-cassert --prefix=/home/pgsql).
3, Initialize a new database cluster.
4, Modify postgresql.conf and add the parameters below. Since the
core dump comes from a parallel scan, we adjust the parallel settings
to make it easier to reproduce.

  max_connections = 2000

  parallel_setup_cost=0
  parallel_tuple_cost=0
  min_parallel_table_scan_size=0
  max_parallel_workers_per_gather=8
  max_parallel_workers = 32

5, Start the database cluster.
6, Use the attached script init_test.sql to create the tables.
7, Use pgbench with the attached script sub_120.sql to test it. Try
it a few times and you should get a core dump file.
    pgbench  -d postgres -p 33550  -n -r -f sub_120.sql   -c 200 -j 200 -T 12000
   (If you cannot reproduce it, try running two of these pgbench
commands in parallel at the same time.)

In my environment (CentOS 8.2, 128G RAM, 40 processors, SAS disk,
Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz),
I can sometimes reproduce it in about 5 minutes, but sometimes it takes
about half an hour.

Best regards,
houzj

===
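
For anyone who wants to script steps 4 and 7 above, a rough sketch
follows. It is not part of houzj's report: the function names and the
PGBENCH override are assumptions made for illustration, while the
settings and the pgbench command line are copied from the steps above.

```shell
#!/bin/sh
# Sketch of steps 4 and 7 of the reproduction recipe.
# Assumptions (not from the report): the function names and the PGBENCH
# variable, which only exists so that a different binary can be used.

PGBENCH=${PGBENCH:-pgbench}

# Step 4: append the settings listed in the report to the given
# postgresql.conf file.
append_repro_settings() {
    conf_file=$1
    cat >> "$conf_file" <<'EOF'
max_connections = 2000
parallel_setup_cost = 0
parallel_tuple_cost = 0
min_parallel_table_scan_size = 0
max_parallel_workers_per_gather = 8
max_parallel_workers = 32
EOF
}

# Step 7: start the given number of pgbench runs at the same time and
# wait until all of them have finished.
run_parallel_pgbench() {
    runs=$1
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$PGBENCH" -d postgres -p 33550 -n -r -f sub_120.sql \
            -c 200 -j 200 -T 12000 &
        i=$((i + 1))
    done
    wait
}
```

With these defined, one would call append_repro_settings with the path
of the cluster's postgresql.conf before step 5, and then
"run_parallel_pgbench 2" after step 6 to get the two concurrent runs
suggested in step 7.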

Regards,
Greg Nancarrow
Fujitsu Australia


