Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb

From	Amit Langote
Subject	Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb
Date	March 22, 2021 09:32:52
Msg-id	CA+HiwqE11ZkfAMVhV3KPAeLf5d9vru-yc6CY-u1WOFBtXKdhfg@mail.gmail.com
In reply to	Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb (Amul Sul <sulamul@gmail.com>)
Responses	Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb
List	pgsql-hackers

On Mon, Mar 22, 2021 at 5:26 PM Amul Sul <sulamul@gmail.com> wrote:
> In heapam_relation_copy_for_cluster(), begin_heap_rewrite() sets
> rwstate->rs_new_rel->rd_smgr correctly, but the next line calls
> tuplesort_begin_cluster(), which causes a system cache invalidation. Due to
> the CCA setting, that wipes out rwstate->rs_new_rel->rd_smgr, which is not
> restored for the subsequent operations, and this causes the segmentation fault.
>
> Calling RelationOpenSmgr() before calling smgrimmedsync() in
> end_heap_rewrite() would fix the failure. Did the same in the attached patch.

That makes sense.  I see a few commits in the git history that added
RelationOpenSmgr() before an smgr* operation whenever such a problem
was discovered: 4942ee656ac, afa8f1971ae, bf347c60bdd7,
for example.
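
To make the pattern concrete, here is a toy sketch of it.  This is not
PostgreSQL code: ToyRelation, ToySmgr and all toy_* functions are made-up
stand-ins, and toy_immedsync() returns -1 where the real smgr* call would
dereference a NULL pointer.  It only assumes that an invalidation can reset
the cached handle to NULL and that the open call recreates it when missing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins; NOT PostgreSQL's real SMgrRelation / Relation structs. */
typedef struct ToySmgr
{
    int         is_open;
} ToySmgr;

typedef struct ToyRelation
{
    ToySmgr    *rd_smgr;        /* cached handle; NULL after invalidation */
} ToyRelation;

/* Analogue of RelationOpenSmgr(): (re)create the handle only if missing. */
void
toy_open_smgr(ToyRelation *rel)
{
    if (rel->rd_smgr == NULL)
    {
        rel->rd_smgr = malloc(sizeof(ToySmgr));
        assert(rel->rd_smgr != NULL);
        rel->rd_smgr->is_open = 1;
    }
}

/* Analogue of a cache invalidation: the cached handle is dropped. */
void
toy_invalidate(ToyRelation *rel)
{
    free(rel->rd_smgr);
    rel->rd_smgr = NULL;
}

/* Analogue of an smgr* call; -1 marks the case that would crash for real. */
int
toy_immedsync(const ToySmgr *smgr)
{
    if (smgr == NULL)
        return -1;
    return smgr->is_open ? 0 : -1;
}

/* The defensive pattern: reopen right before using the handle. */
int
toy_end_rewrite(ToyRelation *rel)
{
    toy_open_smgr(rel);
    return toy_immedsync(rel->rd_smgr);
}
```

In this sketch, calling toy_immedsync(rel.rd_smgr) after toy_invalidate(&rel)
hits the NULL case, while toy_end_rewrite(&rel) succeeds because it reopens
the handle first.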

I do wonder if there are still other smgr* operations in the source
code that are preceded by operations that would invalidate the
SMgrRelation that those smgr* operations would be called with.  For
example, the smgrnblocks() in gistBuildCallback() may be called too
long after the corresponding RelationOpenSmgr() on the index relation.

--
Amit Langote
EDB: http://www.enterprisedb.com
