DROP DATABASE vs patch to not remove files right away

From	Tom Lane
Subject	DROP DATABASE vs patch to not remove files right away
Date	15 April 2008, 23:03:24
Msg-id	18026.1208300591@sss.pgh.pa.us
Replies	Re: DROP DATABASE vs patch to not remove files right away (Alvaro Herrera <alvherre@commandprompt.com>) Re: DROP DATABASE vs patch to not remove files right away (Heikki Linnakangas <heikki@enterprisedb.com>) Re: DROP DATABASE vs patch to not remove files right away (Heikki Linnakangas <heikki@enterprisedb.com>)
List	pgsql-hackers

Over the last couple days I twice saw complaints like this during
DROP DATABASE:

WARNING: could not remove file or directory "base/80750/80825": No such file or directory
WARNING: could not remove database directory "base/80750"

I poked at it for a while and was eventually able to extract a
repeatable test case:

while true
do
  psql -c "create database foo;" postgres || exit 1
  psql -c "create table foo(f1 int primary key);" foo || exit 1
  psql -c "drop table foo;" foo || exit 1
  psql -c "checkpoint" postgres &
  psql -c "drop database foo;" postgres || exit 1
done

On my machine this fairly consistently draws warnings in both 8.3 and
HEAD. I believe what is happening is that the bgwriter has a
PendingUnlinkEntry for table foo, and completion of the checkpoint
prompts it to exercise that. Meanwhile in the DROP DATABASE, rmtree is
working through a list of files to drop, and when it hits the
already-deleted one it complains --- and not only does it complain,
it stops trying to delete any more. (The second WARNING is quite
misleading, because what it really means is "I stopped trying".)

Without the CHECKPOINT, what we get instead is that each cycle builds up
some more PendingUnlinkEntrys, which will all fail when the checkpoint
comes. The bgwriter is coded to not report ENOENT, so you don't see any
evidence of that, but it's clearly a possible case and the comment
saying it shouldn't happen is misleading.

Actually ... what if the same DB OID and relfilenode get re-made before
the checkpoint? Then we'd be unlinking live data. This is improbable
but hardly less so than the scenario the PendingUnlinkEntry code was
put in to prevent.

ISTM that we must fix the bgwriter so that ForgetDatabaseFsyncRequests
causes PendingUnlinkEntrys for the doomed DB to be thrown away too.
This should prevent the unlink-live-data scenario, I think.
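
To make the proposal concrete, here is a minimal C sketch of throwing away queued unlink requests for a doomed database. It is not the bgwriter's actual code: the PendingUnlink struct, its fields, and forget_pending_unlinks_for_db() are hypothetical names made up for illustration, and the queue is assumed to be a plain array.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a queued "unlink this file after the next
 * checkpoint" request. The field names are assumptions, not the server's.
 */
typedef struct PendingUnlink
{
    unsigned int db_oid;        /* database the file belongs to */
    unsigned int relfilenode;   /* file identifier within that database */
} PendingUnlink;

/*
 * Discard every queued request that belongs to db_oid, compacting the
 * array in place and preserving the order of the survivors.
 * Returns the new number of entries.
 */
size_t
forget_pending_unlinks_for_db(PendingUnlink *queue, size_t n,
                              unsigned int db_oid)
{
    size_t keep = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (queue[i].db_oid != db_oid)
            queue[keep++] = queue[i];   /* entry for another DB: keep it */
    }
    return keep;
}
```

Once the entries for the dropped database are gone, a later checkpoint has nothing left to unlink under that OID, so a re-created database that happens to reuse the same OID and relfilenode would not be touched.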
Even then, concurrent deletion attempts are probably possible (since
ForgetDatabaseFsyncRequests is asynchronous) and rmtree() is being far
too fragile about dealing with them. I think that it should be coded
to ignore ENOENT the same as the bgwriter does, and that it should press
on and keep trying to delete things even if it gets a failure.
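
A rough sketch of the more forgiving behavior, written against plain POSIX calls rather than the real rmtree(): the function name remove_tree_tolerant() is hypothetical, the fixed-size path buffer is an assumption, and it deletes entries while reading the directory instead of working from a pre-built list of files, which is a simplification.

```c
#define _XOPEN_SOURCE 700       /* ask for POSIX declarations such as lstat() */

#include <dirent.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Recursively remove "path". Something that has already vanished (ENOENT)
 * counts as removed; any other failure is reported, remembered, and the
 * walk carries on. Returns true only if no real failure occurred.
 */
bool
remove_tree_tolerant(const char *path)
{
    struct stat st;
    bool        ok = true;

    if (lstat(path, &st) != 0)
    {
        if (errno == ENOENT)
            return true;        /* someone else got there first */
        fprintf(stderr, "could not stat \"%s\": %s\n", path, strerror(errno));
        return false;
    }

    if (S_ISDIR(st.st_mode))
    {
        DIR        *dir = opendir(path);
        struct dirent *de;

        if (dir == NULL)
        {
            if (errno == ENOENT)
                return true;
            fprintf(stderr, "could not open \"%s\": %s\n", path, strerror(errno));
            return false;
        }
        while ((de = readdir(dir)) != NULL)
        {
            char        child[4096];
            int         len;

            if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
                continue;
            len = snprintf(child, sizeof(child), "%s/%s", path, de->d_name);
            if (len < 0 || (size_t) len >= sizeof(child))
            {
                fprintf(stderr, "path too long under \"%s\"\n", path);
                ok = false;
                continue;
            }
            if (!remove_tree_tolerant(child))
                ok = false;     /* note the failure, but press on */
        }
        closedir(dir);
        if (rmdir(path) != 0 && errno != ENOENT)
        {
            fprintf(stderr, "could not remove directory \"%s\": %s\n",
                    path, strerror(errno));
            ok = false;
        }
    }
    else if (unlink(path) != 0 && errno != ENOENT)
    {
        fprintf(stderr, "could not remove file \"%s\": %s\n",
                path, strerror(errno));
        ok = false;
    }
    return ok;
}
```

With this shape, a file that a concurrent deleter removed between the directory scan and the unlink() call draws no warning, and a genuine failure on one file no longer leaves the rest of the directory behind.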

Thoughts?
regards, tom lane
