On Thu, Feb 22, 2018 at 9:23 PM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2018-02-22 21:16:02 +0100, Magnus Hagander wrote:
> > You could do that, but then you're moving the complexity to managing
> > that list in shared memory instead.
> Maybe I'm missing something, but how are you going to get quick parallel
> processing if you don't have a shmem piece? You can't assign one database
> per worker, because commonly there's only one database. You don't want to
> start/stop a worker for each relation, because that'd be extremely slow
> for databases with a lot of tables. Without shmem you can't pass more than
> an OID to a bgworker. To me the combination of these things implies that
> you need some other synchronization mechanism *anyway*.
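For concreteness, the kind of shmem piece described above could look roughly like the sketch below: a fixed-size queue of relation OIDs in shared memory that workers pull from, instead of being started and stopped per relation. All names are made up for illustration, and the sizing and shmem-startup-hook wiring are omitted.

/*
 * Hypothetical shared-memory work queue for checksum workers: an array
 * of relation OIDs plus a cursor, protected by a spinlock.  Workers pull
 * the next OID until the queue is drained.
 */
#include "postgres.h"
#include "storage/shmem.h"
#include "storage/spin.h"

typedef struct ChecksumShmemStruct
{
    slock_t     mutex;          /* protects the fields below */
    int         next_item;      /* next array slot to hand out */
    int         num_items;      /* number of valid OIDs in the array */
    Oid         relations[FLEXIBLE_ARRAY_MEMBER];
} ChecksumShmemStruct;

static ChecksumShmemStruct *ChecksumShmem;

/* Called from a shmem startup hook; space must have been requested. */
static void
checksum_shmem_init(int max_relations)
{
    bool        found;

    ChecksumShmem = ShmemInitStruct("checksum worker queue",
                                    offsetof(ChecksumShmemStruct, relations) +
                                    max_relations * sizeof(Oid),
                                    &found);
    if (!found)
    {
        SpinLockInit(&ChecksumShmem->mutex);
        ChecksumShmem->next_item = 0;
        ChecksumShmem->num_items = 0;
    }
}

/* Each worker loops on this until it returns InvalidOid. */
static Oid
checksum_next_relation(void)
{
    Oid         relid = InvalidOid;

    SpinLockAcquire(&ChecksumShmem->mutex);
    if (ChecksumShmem->next_item < ChecksumShmem->num_items)
        relid = ChecksumShmem->relations[ChecksumShmem->next_item++];
    SpinLockRelease(&ChecksumShmem->mutex);

    return relid;
}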
Yes, you probably need something like that if you want to parallelize on things inside each database. If you are OK with parallelizing only at the per-database level, you don't need it.
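A minimal sketch of that simpler per-database scheme, assuming the standard dynamic bgworker API (the library and entry-point names are hypothetical): the launcher registers one worker per database and passes the database OID as bgw_main_arg, which is the one Datum a worker can receive without shmem.

#include "postgres.h"
#include "miscadmin.h"
#include "postmaster/bgworker.h"

/* Start one dynamic bgworker for the given database. */
static void
launch_checksum_worker(Oid dboid)
{
    BackgroundWorker worker;
    BackgroundWorkerHandle *handle;

    MemSet(&worker, 0, sizeof(worker));
    worker.bgw_flags = BGWORKER_SHMEM_ACCESS |
        BGWORKER_BACKEND_DATABASE_CONNECTION;
    worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
    worker.bgw_restart_time = BGW_NEVER_RESTART;
    snprintf(worker.bgw_library_name, BGW_MAXLEN, "checksum_helper");
    snprintf(worker.bgw_function_name, BGW_MAXLEN, "checksum_worker_main");
    snprintf(worker.bgw_name, BGW_MAXLEN, "checksum worker for db %u", dboid);
    worker.bgw_main_arg = ObjectIdGetDatum(dboid);  /* all a worker gets */
    worker.bgw_notify_pid = MyProcPid;

    if (!RegisterDynamicBackgroundWorker(&worker, &handle))
        ereport(ERROR,
                (errmsg("could not start checksum worker for database %u",
                        dboid)));
}

The worker would then connect with BackgroundWorkerInitializeConnectionByOid() and walk that database's relations on its own, with no shared state beyond the OID it was handed.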
> > I'm not sure that's any easier... And certainly adding a catalog flag
> > for a use case like this one is not making it easier.
> Hm, I imagined you'd need that anyway. Imagine a 10TB database that's
> converted to checksums online. I assume you'd not want to reread 9TB if
> you crash after processing most of the cluster already?
I would prefer that, yes. But having to re-read 9TB is still significantly better than not being able to turn on checksums at all (the state today). And adding a catalog column for it would carry the cost of the migration *forever*, both for clusters that never have checksums and for those that had them from the beginning.
Accepting that the process will start over after a crash (but only read, not re-write, the blocks that have already been processed) significantly simplifies the process, and reduces its long-term cost in the form of entries in the catalogs. Since this is a one-time operation (or, for many people, a zero-time operation), paying that cost once is probably better than paying a much smaller cost constantly.
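To illustrate, a rough sketch of the restart-friendly per-block check (buffer-manager and WAL details omitted; the function name is made up): a page whose checksum is already correct is only read, never rewritten, so after a crash the already-processed part costs reads but no writes.

#include "postgres.h"
#include "storage/bufpage.h"
#include "storage/checksum.h"

/*
 * Returns true if the page's checksum had to be set, in which case the
 * caller must WAL-log and write the page back; returns false if the
 * checksum was already correct (or the page is empty), so the block is
 * only read on a restarted run.
 */
static bool
page_needs_checksum_write(char *page, BlockNumber blkno)
{
    PageHeader  phdr = (PageHeader) page;
    uint16      checksum;

    if (PageIsNew(page))
        return false;           /* empty pages carry no checksum */

    checksum = pg_checksum_page(page, blkno);
    if (phdr->pd_checksum == checksum)
        return false;           /* already processed */

    phdr->pd_checksum = checksum;
    return true;
}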