Re: storing an explicit nonce

From: Bruce Momjian
Subject: Re: storing an explicit nonce
Date:
Msg-id: 20210527151859.GE5646@momjian.us
In response to: Re: storing an explicit nonce  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: storing an explicit nonce  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Thu, May 27, 2021 at 10:47:13AM -0400, Robert Haas wrote:
> On Wed, May 26, 2021 at 4:40 PM Bruce Momjian <bruce@momjian.us> wrote:
> > You are saying that by using a non-LSN nonce, you can write out the page
> > with a new nonce, but the same LSN, and also discard the page during
> > crash recovery and use the WAL copy?
> 
> I don't know what "discard the page during crash recovery and use the
> WAL copy" means.

I was asking how decoupling the nonce from the LSN allows us to avoid
full page writes for hint bit changes.  I am guessing you are saying
that on recovery, if we see a hint-bit-only change in the WAL (with a
new nonce), we just throw away the on-disk page, because it could be
torn, and use the full page write version from the WAL instead.

> > I am confused why checksums, which are widely used, acceptably require
> > wal_log_hints, but there is concern that file encryption, which is
> > heavier, cannot acceptably require wal_log_hints.  I must be missing
> > something.
> 
> I explained this in the first complete paragraph of my first email
> with this subject line: "For example, right now, we only need to WAL
> log hints for the first write to each page after a checkpoint, but in
> this approach, if the same page is written multiple times per
> checkpoint cycle, we'd need to log hints every time." That's a huge
> difference. Page eviction in some workloads can push the same pages
> out of shared buffers every few seconds, whereas something that has to
> be done once per checkpoint cycle cannot affect each page nearly so
> often. A checkpoint is only going to occur every 5 minutes by default,
> or more realistically every 10-15 minutes in a well-tuned production
> system. In other words, we're not holding up some kind of double
> standard, where the existing feature is allowed to depend on doing a
> certain thing but your feature isn't allowed to depend on the same
> thing. Your design depends on doing something which is potentially
> 100x+ more expensive than the existing thing. It's not always going to
> be that expensive, but it can be.

Yes, it might be 1e100+ times more expensive too, but we don't know, and
I am not ready to add a lot of complexity for such an unknown.
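
For a rough sense of where a figure like 100x could come from, here is a
toy model in Python.  It is only a sketch, not PostgreSQL code: the
function name, its parameters, and the assumption that a single page is
hint-dirtied and evicted at a fixed interval are all made up for
illustration.  It counts hint-bit WAL records for one page under the two
policies being compared.

```python
def hint_wal_records(evict_interval_s, checkpoint_interval_s,
                     duration_s, log_every_write):
    """Count hint-bit WAL records for one page (hypothetical toy model).

    The page is assumed to be hint-dirtied and evicted every
    evict_interval_s seconds; a checkpoint happens every
    checkpoint_interval_s seconds.
    """
    records = 0
    current_cycle = 0
    logged_this_cycle = False
    for t in range(0, duration_s, evict_interval_s):
        cycle = t // checkpoint_interval_s
        if cycle != current_cycle:
            # A checkpoint completed since the previous eviction.
            current_cycle = cycle
            logged_this_cycle = False
        if log_every_write or not logged_this_cycle:
            records += 1
            logged_this_cycle = True
    return records


# One hour, eviction every 3 seconds, checkpoint every 5 minutes.
every_write = hint_wal_records(3, 300, 3600, log_every_write=True)
first_write_only = hint_wal_records(3, 300, 3600, log_every_write=False)
print(every_write, first_write_only)  # prints "1200 12"
```

With these made-up intervals the ratio is simply the checkpoint interval
divided by the eviction interval.  Whether real workloads evict the same
page that often is exactly the unknown.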

> > Why can't checksums also throw away hint bit changes like you want to do
> > for file encryption and not require wal_log_hints?
> 
> Well, I don't want to throw away hint bit changes, just like we don't
> throw them away right now. And I want to do that by making sure that
> each time the page is written, we use a different nonce, but without
> the expense of having to advance the LSN.
> 
> Now, another option is to do what you suggest here. We could say that
> if a dirty page is evicted, but the page is only dirty because of
> hint-type changes, we don't actually write it out. That does avoid
> using the same nonce for multiple writes, because now there's only one
> write. It also fixes the problem on standbys that Andres was
> complaining about, because on a standby, the only way a page can
> possibly be dirtied without an associated WAL record is through a
> hint-type change. However, I think we'd find that this, too, is pretty
> expensive in certain workloads. It's useful to write hint bits -
> that's why we do it.

Oh, that does sound nice.  It is kind of an escape hatch if we are
evicting pages often for hint bit changes.  I like it.
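
As a sketch of that idea, assuming the LSN stays the nonce and that we
can tell a WAL-logged change from a hint-only change, the eviction rule
might look like the Python below.  The Page class, its fields, and the
evict function are hypothetical names for illustration, not anything in
the PostgreSQL buffer manager.

```python
from dataclasses import dataclass


@dataclass
class Page:
    lsn: int = 0              # LSN of the last WAL-logged change
    wal_dirty: bool = False   # modified by a WAL-logged operation
    hint_dirty: bool = False  # modified only by hint-bit updates


def evict(page, written_nonces):
    """Evict a page; return True if it was written out (toy model).

    A page that is dirty only because of hint bits is dropped without a
    write, so the same LSN-derived nonce is never used for two writes.
    """
    wrote = False
    if page.wal_dirty:
        # The LSN advanced with the WAL-logged change, so it is a fresh nonce.
        assert page.lsn not in written_nonces, "nonce reuse"
        written_nonces.add(page.lsn)
        wrote = True
    # Hint-only changes are discarded along with the buffer.
    page.wal_dirty = False
    page.hint_dirty = False
    return wrote


nonces = set()
page = Page()

page.lsn, page.wal_dirty = 10, True
first = evict(page, nonces)    # written with nonce 10

page.hint_dirty = True
second = evict(page, nonces)   # hint-only: not written

page.lsn, page.wal_dirty = 20, True
third = evict(page, nonces)    # written with nonce 20
```

The cost is that the hint bits set before the second eviction are lost
and have to be recomputed the next time the page is read.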

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.



