Обсуждение: Enable data checksums by default

Поиск
Список
Период
Сортировка

Enable data checksums by default

От
Greg Sabino Mullane
Дата:
Please find attached a patch to enable data checksums by default.

Currently, initdb only enables data checksums if passed the --data-checksums or -k argument. There was some hesitation years ago when this feature was first added, leading to the current situation where the default is off. However, many years later, there is wide consensus that this is an extraordinarily safe, desirable setting. Indeed, most (if not all) of the major commercial and open source Postgres systems currently turn this on by default. I posit you would be hard-pressed to find many systems these days in which it has NOT been turned on. So basically we have a de-facto standard, and I think it's time we flipped the switch to make it on by default.

The patch is simple enough: literally flipping the boolean inside of initdb.c, and adding a new argument '--no-data-checksums' for those instances that truly want the old behavior. One place that still needs the old behavior is our internal tests for pg_checksums and pg_amcheck, so I added a new argument to init() in PostgreSQL/Test/Cluster.pm to allow those to still pass their tests.

This is just the default - people are still more than welcome to turn it off with the new flag. The pg_checksums program is another option that actually argues for having the default "on", as switching to "off" once initdb has been run is trivial.

Yes, I am aware of the previous discussions on this, but the world moves fast - wal compression is better than in the past, vacuum is better now, and data-checksums being on is such a complete default in the wild, it feels weird and a disservice that we are not running all our tests like that.

Cheers,
Greg

Вложения

Re: Enable data checksums by default

От
Michael Banck
Дата:
Hi,

On Tue, Aug 06, 2024 at 06:46:52PM -0400, Greg Sabino Mullane wrote:
> Please find attached a patch to enable data checksums by default.
> 
> Currently, initdb only enables data checksums if passed the
> --data-checksums or -k argument. There was some hesitation years ago when
> this feature was first added, leading to the current situation where the
> default is off. However, many years later, there is wide consensus that
> this is an extraordinarily safe, desirable setting. Indeed, most (if not
> all) of the major commercial and open source Postgres systems currently
> turn this on by default. I posit you would be hard-pressed to find many
> systems these days in which it has NOT been turned on. So basically we have
> a de-facto standard, and I think it's time we flipped the switch to make it
> on by default.

[...]
 
> Yes, I am aware of the previous discussions on this, but the world moves
> fast - wal compression is better than in the past, vacuum is better now,
> and data-checksums being on is such a complete default in the wild, it
> feels weird and a disservice that we are not running all our tests like
> that.

I agree.

Some review on the patch:

> diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
> index bdd613e77f..511f489d34 100644
> --- a/doc/src/sgml/ref/initdb.sgml
> +++ b/doc/src/sgml/ref/initdb.sgml
> @@ -267,12 +267,14 @@ PostgreSQL documentation
>         <para>
>          Use checksums on data pages to help detect corruption by the
>          I/O system that would otherwise be silent. Enabling checksums
> -        may incur a noticeable performance penalty. If set, checksums
> +        may incur a small performance penalty. If set, checksums
>          are calculated for all objects, in all databases. All checksum

I think the last time we dicussed this the consensus was that
computational overhead of computing the checksums is pretty small for
most systems (so the above change seems warranted regardless of whether
we switch the default), but turning on wal_compression also turns on
wal_log_hints, which can increase WAL by quite a lot. Maybe this is
covered elsewhere in the documentation (I just looked at the patch), but
if not, it probably should be added here as a word of caution.

>          failures will be reported in the
>          <link linkend="monitoring-pg-stat-database-view">
>          <structname>pg_stat_database</structname></link> view.
>          See <xref linkend="checksums" /> for details.
> +        As of version 18, checksums are enabled by default. They can be
> +        disabled by use of <option>--no-data-checksums</option>.

I think we usually do not mention when a feature was added/changed, do
we? So I'd just write "(default: enabled)" or whatever is the style of
the surrounding options.

>         </para>
>        </listitem>
>       </varlistentry>
> diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
> index f00718a015..ce7d3e99e5 100644
> --- a/src/bin/initdb/initdb.c
> +++ b/src/bin/initdb/initdb.c
> @@ -164,7 +164,7 @@ static bool noinstructions = false;
>  static bool do_sync = true;
>  static bool sync_only = false;
>  static bool show_setting = false;
> -static bool data_checksums = false;
> +static bool data_checksums = true;
>  static char *xlog_dir = NULL;
>  static int    wal_segment_size_mb = (DEFAULT_XLOG_SEG_SIZE) / (1024 * 1024);
>  static DataDirSyncMethod sync_method = DATA_DIR_SYNC_METHOD_FSYNC;
> @@ -3121,6 +3121,7 @@ main(int argc, char *argv[])
>          {"waldir", required_argument, NULL, 'X'},
>          {"wal-segsize", required_argument, NULL, 12},
>          {"data-checksums", no_argument, NULL, 'k'},
> +        {"no-data-checksums", no_argument, NULL, 20},

Does it make sense to add -K (capital k) as a short-cut for this? I
think this is how we distinguish on/off for pg_dump (-t/-T etc.) but
maybe that is not wider project policy.


Michael



Re: Enable data checksums by default

От
Greg Sabino Mullane
Дата:
On Wed, Aug 7, 2024 at 4:43 AM Michael Banck <mbanck@gmx.net> wrote:
I think the last time we dicussed this the consensus was that
computational overhead of computing the checksums is pretty small for
most systems (so the above change seems warranted regardless of whether
we switch the default), but turning on wal_compression also turns on
wal_log_hints, which can increase WAL by quite a lot. Maybe this is
covered elsewhere in the documentation (I just looked at the patch), but
if not, it probably should be added here as a word of caution.

Yeah, that seems something beyond this patch? Certainly we should mention wal_compression in the release notes if the default changes. I mean, I feel wal_log_hints should probably default to on as well, but I've honestly never really given it much thought because my fingers are trained to type "initdb -k". I've been using data checksums for roughly a decade now. I think the only time I've NOT used checksums was when I was doing checksum overhead measurements, or hacking on the pg_checksums program.
 
I think we usually do not mention when a feature was added/changed, do
we? So I'd just write "(default: enabled)" or whatever is the style of
the surrounding options.

+1
 
> +             {"no-data-checksums", no_argument, NULL, 20},

Does it make sense to add -K (capital k) as a short-cut for this? I
think this is how we distinguish on/off for pg_dump (-t/-T etc.) but
maybe that is not wider project policy.

I'd rather not. Better to keep it explicit rather than some other weird letter that has no mnemonic value.

Cheers,
Greg

Re: Enable data checksums by default

От
Peter Eisentraut
Дата:
On 07.08.24 00:46, Greg Sabino Mullane wrote:
> Currently, initdb only enables data checksums if passed the 
> --data-checksums or -k argument. There was some hesitation years ago 
> when this feature was first added, leading to the current situation 
> where the default is off. However, many years later, there is wide 
> consensus that this is an extraordinarily safe, desirable setting. 
> Indeed, most (if not all) of the major commercial and open source 
> Postgres systems currently turn this on by default. I posit you would be 
> hard-pressed to find many systems these days in which it has NOT been 
> turned on. So basically we have a de-facto standard, and I think it's 
> time we flipped the switch to make it on by default.

I'm sympathetic to this proposal, but I want to raise some concerns.

My understanding was that the reason for some hesitation about adopting 
data checksums was the performance impact.  Not the checksumming itself, 
but the overhead from hint bit logging.  The last time I looked into 
that, you could get performance impacts on the order of 5% tps.  Maybe 
that's acceptable, and you of course can turn it off if you want the 
extra performance.  But I think this should be discussed in this thread.

About the claim that it's already the de-facto standard.  Maybe that is 
approximately true for "serious" installations.  But AFAICT, the popular 
packagings don't enable checksums by default, so there is likely a 
significant middle tier between "just trying it out" and serious 
production use that don't have it turned on.

For those uses, this change would render pg_upgrade useless for upgrades 
from an old instance with default settings to a new instance with 
default settings.  And then users would either need to re-initdb with 
checksums turned back off, or I suppose run pg_checksums on the old 
instance before upgrading?  This is significant additional complication. 
  And packagers who have built abstractions on top of pg_upgrade (such 
as Debian pg_upgradecluster) would also need to implement something to 
manage this somehow.

So I think we need to think through the upgrade experience a bit more. 
Unfortunately, pg_checksums hasn't gotten to the point that we were 
perhaps once hoping for that you could enable checksums on a live 
system.  I'm thinking pg_upgrade could have a mode where it adds the 
checksum during the upgrade as it copies the files (essentially a subset 
of pg_checksums).  I think that would be useful for that middle tier of 
users who just want a good default experience.




Re: Enable data checksums by default

От
Michael Banck
Дата:
On Thu, Aug 08, 2024 at 12:11:38PM +0200, Peter Eisentraut wrote:
> So I think we need to think through the upgrade experience a bit more.
> Unfortunately, pg_checksums hasn't gotten to the point that we were perhaps
> once hoping for that you could enable checksums on a live system.  I'm
> thinking pg_upgrade could have a mode where it adds the checksum during the
> upgrade as it copies the files (essentially a subset of pg_checksums).  I
> think that would be useful for that middle tier of users who just want a
> good default experience.

Well that, or, as a first less ambitious step, pg_upgrade could carry
over the data_checksums setting from the old to the new instance by
essentially disabling it via pg_checksums -d (which is fast) if it the
current default (off) is set on the old instance and the new instance
was created with the new onw (checksums on).

Probably should include a warning or something in that case, though I
guess a lot of users will read just past it. But at least they are not
worse off than before.


Michael



Re: Enable data checksums by default

От
Daniel Gustafsson
Дата:
> On 8 Aug 2024, at 12:11, Peter Eisentraut <peter@eisentraut.org> wrote:

> My understanding was that the reason for some hesitation about adopting data checksums was the performance impact.
Notthe checksumming itself, but the overhead from hint bit logging.  The last time I looked into that, you could get
performanceimpacts on the order of 5% tps.  Maybe that's acceptable, and you of course can turn it off if you want the
extraperformance.  But I think this should be discussed in this thread. 

That's been my experience as well, the overhead of the checksumming is
negligible but the overhead in WAL can be (having hint bits WAL logged does
carry other benefits as well to be fair).

> I think we need to think through the upgrade experience a bit more.

+1

> Unfortunately, pg_checksums hasn't gotten to the point that we were perhaps once hoping for that you could enable
checksumson a live system.   

I don't recall there being any work done (or plans for) using pg_checksums on a
live system.  Anyone interested in enabling checksums on a live cluster can
however review the patch for that in:

  https://postgr.es/m/E07A611B-9CF3-4FDB-8CE8-A221E39040EC@yesql.se

> I'm thinking pg_upgrade could have a mode where it adds the checksum during the upgrade as it copies the files
(essentiallya subset of pg_checksums).  I think that would be useful for that middle tier of users who just want a good
defaultexperience. 

As a side-note, I implemented this in pg_upgrade at Greenplum (IIRC it was
submitted to -hackers at the time as well) and it worked well with not a lot of
code.

--
Daniel Gustafsson




Re: Enable data checksums by default

От
Greg Sabino Mullane
Дата:
Thank you for the feedback. Please find attached three separate patches. One to add a new flag to initdb (--no-data-checksums), one to adjust the tests to use this flag as needed, and the final to make the actual switch of the default value (along with tests and docs).

Cheers,
Greg


Вложения

Re: Enable data checksums by default

От
Robert Haas
Дата:
On Thu, Aug 8, 2024 at 6:11 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> About the claim that it's already the de-facto standard.  Maybe that is
> approximately true for "serious" installations.  But AFAICT, the popular
> packagings don't enable checksums by default, so there is likely a
> significant middle tier between "just trying it out" and serious
> production use that don't have it turned on.

+1.

> I'm thinking pg_upgrade could have a mode where it adds the
> checksum during the upgrade as it copies the files (essentially a subset
> of pg_checksums).  I think that would be useful for that middle tier of
> users who just want a good default experience.

That would be very nice.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Enable data checksums by default

От
Tomas Vondra
Дата:
On 8/8/24 19:42, Robert Haas wrote:
> On Thu, Aug 8, 2024 at 6:11 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>> About the claim that it's already the de-facto standard.  Maybe that is
>> approximately true for "serious" installations.  But AFAICT, the popular
>> packagings don't enable checksums by default, so there is likely a
>> significant middle tier between "just trying it out" and serious
>> production use that don't have it turned on.
> 
> +1.
> 
>> I'm thinking pg_upgrade could have a mode where it adds the
>> checksum during the upgrade as it copies the files (essentially a subset
>> of pg_checksums).  I think that would be useful for that middle tier of
>> users who just want a good default experience.
> 
> That would be very nice.
> 

Yeah, but it might also disable checksums on the new cluster, which
would work for link mode too. So we'd probably want multiple modes, one
to enable checksums during file copy, one to disable checksums, and one
to just fail for incompatible clusters.


-- 
Tomas Vondra



Re: Enable data checksums by default

От
Greg Sabino Mullane
Дата:
On Thu, Aug 8, 2024 at 6:11 AM Peter Eisentraut <peter@eisentraut.org> wrote:
 
My understanding was that the reason for some hesitation about adopting data checksums was the performance impact.  Not the checksumming itself, but the overhead from hint bit logging.  The last time I looked into that, you could get performance impacts on the order of 5% tps.  Maybe that's acceptable, and you of course can turn it off if you want the extra performance.  But I think this should be discussed in this thread.

Fair enough. I think the performance impact is acceptable, as evidenced by the large number of people that turn it on. And it is easy enough to turn it off again, either via --no-data-checksums or pg_checksums --disable. I've come across people who have regretted not throwing a -k into their initial initdb, but have not yet come across someone who has the opposite regret. When I did some measurements some time ago, I found numbers much less than 5%, but of course it depends on a lot of factors.

About the claim that it's already the de-facto standard.  Maybe that is approximately true for "serious" installations.  But AFAICT, the popular packagings don't enable checksums by default, so there is likely a significant middle tier between "just trying it out" and serious
production use that don't have it turned on.

I would push back on that "significant" a good bit. The number of Postgres installations in the cloud is very likely to dwarf the total package installations. Maybe not 10 years ago, but now? Maybe someone from Amazon can share some numbers. Not that we have any way to compare against package installs :) But anecdotally the number of people who mention RDS etc. on the various fora has exploded.
 
For those uses, this change would render pg_upgrade useless for upgrades from an old instance with default settings to a new instance with default settings.  And then users would either need to re-initdb with checksums turned back off, or I suppose run pg_checksums on the old instance before upgrading?  This is significant additional complication.

Meh, re-running initdb with --no-data-checksums seems a fairly low hurdle.
 
And packagers who have built abstractions on top of pg_upgrade (such as Debian pg_upgradecluster) would also need to implement something to manage this somehow.

How does it deal with clusters with checksums enabled now?
 
I'm thinking pg_upgrade could have a mode where it adds the checksum during the upgrade as it copies the files (essentially a subset
of pg_checksums).  I think that would be useful for that middle tier of users who just want a good default experience.

Hm...might be a bad experience if it forces a switch out of --link mode. Perhaps a warning at the end of pg_upgrade that suggests running pg_checksums on your new cluster if you want to enable checksums?

Cheers,
Greg
 

Re: Enable data checksums by default

От
Robert Haas
Дата:
On Tue, Aug 13, 2024 at 10:42 AM Greg Sabino Mullane <htamfids@gmail.com> wrote:
> Fair enough. I think the performance impact is acceptable, as evidenced by the large number of people that turn it
on.And it is easy enough to turn it off again, either via --no-data-checksums or pg_checksums --disable. 
> When I did some measurements some time ago, I found numbers much less than 5%, but of course it depends on a lot of
factors.

I think the bad case is when you have a write workload that is
significantly bigger than shared_buffers but still small enough to fit
comfortably in the OS cache. When everything fits in shared_buffers,
you only need to write dirty buffers once per checkpoint cycle, so
making it more expensive isn't necessarily a big deal. When you're
constantly going to disk, that's so expensive that you don't notice
the computational overhead. But when you're in that middle zone where
you keep evicting buffers from PG but not actually having to write
them down to the disk, then I think it's pretty noticeable.

> I've come across people who have regretted not throwing a -k into their initial initdb, but have not yet come across
someonewho has the opposite regret. 

I don't think this is really a fair comparison, because everything
being a little slower all the time is not something that people are
likely to "regret" in the same way that they regret it when a data
corruption issue goes undetected. An undetected data corruption issue
is a single, very painful event that people are likely to notice,
whereas a small performance loss over time kind of blends into the
background. You don't really regret that kind of thing in the same way
that you regret a bad event that happens at a particular moment in
time.

And it's not like we have statistics anywhere that you can look at to
see how much CPU time you spent computing checksums, so if a user DOES
have a performance problem that would not have occurred if checksums
had been disabled, they'll probably never know it.

>> For those uses, this change would render pg_upgrade useless for upgrades from an old instance with default settings
toa new instance with default settings.  And then users would either need to re-initdb with checksums turned back off,
orI suppose run pg_checksums on the old instance before upgrading?  This is significant additional complication. 
> Meh, re-running initdb with --no-data-checksums seems a fairly low hurdle.

I tend to agree with that, but I would also like to see the sort of
improvements that Peter mentions. It's a lot less work to say "let's
just change the default" and then get mad at anyone who disagrees than
it is to do the engineering to make changing the default less of a
problem. But that kind of engineering really adds a lot of value
compared to just changing the default.

None of that is to say that I'm totally hostile to this change.
Checksums don't actually prevent your data from getting corrupted, or
let you recover it after it does. They just tell you about the
problem, and very often you would have found out anyway. However, they
do have peace-of-mind value. If you've got checksums turned on, you
can verify your checksums regularly and see that they're OK, and
people like that. Whether that's worth the overhead for everyone, I'm
not quite sure.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Enable data checksums by default

От
Jakub Wartak
Дата:
On Wed, Aug 7, 2024 at 4:18 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
>
> On Wed, Aug 7, 2024 at 4:43 AM Michael Banck <mbanck@gmx.net> wrote:
>>
>> I think the last time we dicussed this the consensus was that
>> computational overhead of computing the checksums is pretty small for
>> most systems (so the above change seems warranted regardless of whether
>> we switch the default), but turning on wal_compression also turns on
>> wal_log_hints, which can increase WAL by quite a lot. Maybe this is
[..]
>
>
> Yeah, that seems something beyond this patch? Certainly we should mention wal_compression in the release notes if the
defaultchanges. I mean, I feel wal_log_hints should probably default to on as well, but I've honestly never really
givenit much thought because my fingers are trained to type "initdb -k". I've been using data checksums for roughly a
decadenow. I think the only time I've NOT used checksums was when I was doing checksum overhead measurements, or
hackingon the pg_checksums program. 

Maybe I don't understand something, but just to be clear:
wal_compression (mentioned above) is not turning wal_log_hints on,
just the wal_log_hints needs to be on when using data checksums
(implicitly, by the XLogHintBitIsNeeded() macro). I suppose Michael
was thinking about the wal_log_hints earlier (?)

-J.



Re: Enable data checksums by default

От
Jakub Wartak
Дата:
Hi Greg and others

On Tue, Aug 13, 2024 at 4:42 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
>
> On Thu, Aug 8, 2024 at 6:11 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>
>>
>> My understanding was that the reason for some hesitation about adopting data checksums was the performance impact.
Notthe checksumming itself, but the overhead from hint bit logging.  The last time I looked into that, you could get
performanceimpacts on the order of 5% tps.  Maybe that's acceptable, and you of course can turn it off if you want the
extraperformance.  But I think this should be discussed in this thread. 
>
>
> Fair enough. I think the performance impact is acceptable, as evidenced by the large number of people that turn it
on.And it is easy enough to turn it off again, either via --no-data-checksums or pg_checksums --disable. I've come
acrosspeople who have regretted not throwing a -k into their initial initdb, but have not yet come across someone who
hasthe opposite regret. When I did some measurements some time ago, I found numbers much less than 5%, but of course it
dependson a lot of factors. 

Same here, and +1 to data_checksums=on by default for new installations.

The best public measurement of the impact was posted in [1] in 2019 by
Tomas to the best of my knowledge, where he explicitly mentioned the
problem with more WAL with hints/checksums: SATA disks (low IOPS). My
take: now we have 2024, and most people are using at least SSDs or
slow-SATA (but in cloud they could just change the class of I/O if
required to get IOPS to avoid too much throttling), therefore the
price of IOPS dropped significantly.

>> About the claim that it's already the de-facto standard.  Maybe that is approximately true for "serious"
installations. But AFAICT, the popular packagings don't enable checksums by default, so there is likely a significant
middletier between "just trying it out" and serious 
>> production use that don't have it turned on.
>
>
> I would push back on that "significant" a good bit. The number of Postgres installations in the cloud is very likely
todwarf the total package installations. Maybe not 10 years ago, but now? Maybe someone from Amazon can share some
numbers.Not that we have any way to compare against package installs :) But anecdotally the number of people who
mentionRDS etc. on the various fora has exploded. 

Same here. If it helps the case the: 43% of all PostgreSQL DBs
involved in any support case or incident in EDB within last year had
data_checksums=on (at least if they had collected the data using our )
. That's a surprisingly high number (for something that's off by
default), and it makes me think this is because plenty of customers
are either managed by DBAs who care, or assisted by consultants when
deploying, or simply using TPAexec [2] which has this on by default.

Another thing is plenty of people run with wal_log_hints=on (without
data_checksums=off) just to have pg_rewind working. As this is a
strictly standby related tool it means they don't have WAL/network
bandwidth problems, so the WAL rate is not that high in the wild to
cause problems. I found 1 or 2 cases within last year where we would
mention that high WAL generation was attributed to
wal_log_hints=on/XLOG_FPI  and they still didn't disable it apparently
(we have plenty of cases related to too much WAL, but it's mostly due
to other basic reasons)

-J.

[1] - https://www.postgresql.org/message-id/20190330192543.GH4719%40development
[2] - https://www.enterprisedb.com/docs/pgd/4/deployments/tpaexec/



Re: Enable data checksums by default

От
Michael Banck
Дата:
On Thu, Aug 15, 2024 at 09:49:04AM +0200, Jakub Wartak wrote:
> On Wed, Aug 7, 2024 at 4:18 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
> >
> > On Wed, Aug 7, 2024 at 4:43 AM Michael Banck <mbanck@gmx.net> wrote:
> >>
> >> I think the last time we dicussed this the consensus was that
> >> computational overhead of computing the checksums is pretty small for
> >> most systems (so the above change seems warranted regardless of whether
> >> we switch the default), but turning on wal_compression also turns on
> >> wal_log_hints, which can increase WAL by quite a lot. Maybe this is
> [..]
> >
> >
> > Yeah, that seems something beyond this patch? Certainly we should
> > mention wal_compression in the release notes if the default changes.
> > I mean, I feel wal_log_hints should probably default to on as well,
> > but I've honestly never really given it much thought because my
> > fingers are trained to type "initdb -k". I've been using data
> > checksums for roughly a decade now. I think the only time I've NOT
> > used checksums was when I was doing checksum overhead measurements,
> > or hacking on the pg_checksums program.
> 
> Maybe I don't understand something, but just to be clear:
> wal_compression (mentioned above) is not turning wal_log_hints on,
> just the wal_log_hints needs to be on when using data checksums
> (implicitly, by the XLogHintBitIsNeeded() macro). I suppose Michael
> was thinking about the wal_log_hints earlier (?)

Uh, I am pretty sure I meant to say "turning on data_checksums als turns
on wal_log_hints", sorry about the confusion.

I guess the connection is that if you turn on wal_lot_hints (either
directly or via data_checksums) then the number FPIs goes up (possibly
signficantly), and enabling wal_compression could (partly) remedy that.
But I agree with Greg that such a discussion is probably out-of-scope
for this default change.


Michael



Re: Enable data checksums by default

От
Jakub Wartak
Дата:
Hi all,

On Tue, Aug 13, 2024 at 10:08 PM Robert Haas <robertmhaas@gmail.com> wrote:

> And it's not like we have statistics anywhere that you can look at to
> see how much CPU time you spent computing checksums, so if a user DOES
> have a performance problem that would not have occurred if checksums
> had been disabled, they'll probably never know it.

In worst case, per second and per-pid CPU time consumption could be
quantified by having eBPF which is the standard on distros now
(requires kernel headers and bpfcc-tools installed), e.g. here 7918
was PID doing pgbench-related -c 4 workload with checksum=on (sorry
for formatting, but I don't want to use HTML here):

# funclatency-bpfcc --microseconds -i 1 -p 7918
/usr/lib/postgresql/16/bin/postgres:pg_checksum_page
Tracing 1 functions for
"/usr/lib/postgresql/16/bin/postgres:pg_checksum_page"... Hit Ctrl-C
to end.

     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 238      |*************                           |
         4 -> 7          : 714      |****************************************|
         8 -> 15         : 2        |                                        |
        16 -> 31         : 5        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 1        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 1        |                                        |
       512 -> 1023       : 1        |                                        |

avg = 6 usecs, total: 6617 usecs, count: 962


     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 241      |*************                           |
         4 -> 7          : 706      |****************************************|
         8 -> 15         : 11       |                                        |
        16 -> 31         : 10       |                                        |
        32 -> 63         : 1        |                                        |

avg = 5 usecs, total: 5639 usecs, count: 969

[..refreshes every 1s here..]

So the above can tell us e.g. that this pg_checksum_page() took 5639
us out of 1s full sample time (and with 100% CPU pegged core so that's
gives again ~5% CPU util per this routine; I'm ignoring the WAL/log
hint impact for sure). One could also write a small script using
bpftrace instead, too. Disassembly on Debian version and stock PGDG is
telling me it's ful SSE2 instruction-set, so that's nice and optimal
too.

> >> For those uses, this change would render pg_upgrade useless for upgrades from an old instance with default
settingsto a new instance with default settings.  And then users would either need to re-initdb with checksums turned
backoff, or I suppose run pg_checksums on the old instance before upgrading?  This is significant additional
complication.
> > Meh, re-running initdb with --no-data-checksums seems a fairly low hurdle.
>
> I tend to agree with that, but I would also like to see the sort of
> improvements that Peter mentions.
[..]
> None of that is to say that I'm totally hostile to this change.
[.,.]
> Whether that's worth the overhead for everyone, I'm not quite sure.

Without data checksums there's a risk that someone receives silent-bit
corruption and no one will notice. Shouldn't integrity of data stand
above performance by default, in this case? (and performance could be
opt-in, if someone really wants it).

-J.



Re: Enable data checksums by default

От
Jakub Wartak
Дата:
On Thu, Aug 22, 2024 at 8:11 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 15.08.24 08:38, Peter Eisentraut wrote:
> > On 08.08.24 19:42, Robert Haas wrote:
> >>> I'm thinking pg_upgrade could have a mode where it adds the
> >>> checksum during the upgrade as it copies the files (essentially a subset
> >>> of pg_checksums).  I think that would be useful for that middle tier of
> >>> users who just want a good default experience.
> >> That would be very nice.
> >
> > Here is a demo patch for that.  It turned out to be quite simple.
> >
> > I wrote above about a separate mode for that (like
> > --copy-and-make-adjustments), but it was just as easy to stick it into
> > the existing --copy mode.
> >
> > It would be useful to check what the performance overhead of this is
> > versus a copy that does not have to make adjustments.  I expect it's
> > very little.
> >
> > A drawback is that as written this does not work on Windows, because
> > Windows uses a different code path in copyFile().  I don't know the
> > reasons for that.  But it would need to be figured out.
>
> Here is an updated patch for this.  I simplified the logic a bit and
> also handle the case where the read() reads less than a round number of
> blocks.  I did some performance testing.  The overhead of computing the
> checksums versus a straight --copy without checksum adjustments appears
> to be around 5% wall clock time, which seems ok to me.  I also looked
> around the documentation to see if there is anything to update, but
> didn't find anything.
>
> I think if we can work out what to do on Windows, this could be a useful
> little feature for facilitating $subject.

My take:
1. I wonder if we should or should not by default calculate/enable the
checksums when doing pg_upgrade --copy from cluster with
checksums=off. Maybe we should error on that like we are doing now.
There might be still people want to have them off, but they would use
the proposed-new-defaults-of-initdb with checksums on blindly (so this
should be opt-in via some switch like with let's say
--copy-and-enable-checksums; so the user is in full control).
2. WIN32's copyFile() could then stay as it is, and then that new
--copy-and-enable-checksums on WIN32 would have to fallback to classic
loop.

-J.



Re: Enable data checksums by default

От
Peter Eisentraut
Дата:
On 08.08.24 19:19, Greg Sabino Mullane wrote:
> Thank you for the feedback. Please find attached three separate patches. 
> One to add a new flag to initdb (--no-data-checksums), one to adjust the 
> tests to use this flag as needed, and the final to make the actual 
> switch of the default value (along with tests and docs).

I think we can get started with the initdb --no-data-checksums option.

The 0001 patch is missing documentation and --help output for this 
option.  Also, some of the tests for the option that are in patch 0003 
might be better in patch 0001.

Separately, this

-        may incur a noticeable performance penalty. If set, checksums
+        may incur a small performance penalty. If set, checksums

should perhaps be committed separately.  I don't think the patch 0003 
really changes the performance penalty. ;-)




Re: Enable data checksums by default

От
Bruce Momjian
Дата:
On Fri, Aug 23, 2024 at 03:17:17PM +0200, Peter Eisentraut wrote:
> On 08.08.24 19:19, Greg Sabino Mullane wrote:
> > Thank you for the feedback. Please find attached three separate patches.
> > One to add a new flag to initdb (--no-data-checksums), one to adjust the
> > tests to use this flag as needed, and the final to make the actual
> > switch of the default value (along with tests and docs).
> 
> I think we can get started with the initdb --no-data-checksums option.
> 
> The 0001 patch is missing documentation and --help output for this option.
> Also, some of the tests for the option that are in patch 0003 might be
> better in patch 0001.
> 
> Separately, this
> 
> -        may incur a noticeable performance penalty. If set, checksums
> +        may incur a small performance penalty. If set, checksums
> 
> should perhaps be committed separately.  I don't think the patch 0003 really
> changes the performance penalty. ;-)

I think "might" would be more precise than "may" above.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



Re: Enable data checksums by default

От
Nathan Bossart
Дата:
In general, +1 for $SUBJECT.

     printf(_("  -k, --data-checksums      use data page checksums\n"));
+    printf(_("      --no-data-checksums   do not use data page checksums\n"));

Should we error if both --data-checksum and --no-data-checksums are
specified?  IIUC with 0001, we'll use whichever is specified last.

nitpick: these 4 patches are small enough that they could likely be
combined and committed together.

I think it's fair to say we should make the pg_upgrade experience nicer
once the default changes, but IMHO that needn't block actually changing the
default.

-- 
nathan



Re: Enable data checksums by default

От
Greg Sabino Mullane
Дата:
On Mon, Aug 26, 2024 at 3:46 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
Should we error if both --data-checksum and --no-data-checksums are
specified?  IIUC with 0001, we'll use whichever is specified last.

Hmmm, that is a good question. We have never (to my recollection) flipped a default quite like this before. I'm inclined to leave it as "last one wins", as I can see automated systems appending their desired selection to the end of the arg list, and expecting it to work.

nitpick: these 4 patches are small enough that they could likely be combined and committed together.

This was split per request upthread, which I do agree with.

I think it's fair to say we should make the pg_upgrade experience nicer once the default changes, but IMHO that needn't block actually changing the default.

+1

Cheers,
Greg


 

Re: Enable data checksums by default

От
Peter Eisentraut
Дата:
On 27.08.24 15:44, Greg Sabino Mullane wrote:
> On Mon, Aug 26, 2024 at 3:46 PM Nathan Bossart <nathandbossart@gmail.com 
> <mailto:nathandbossart@gmail.com>> wrote:
> 
>     Should we error if both --data-checksum and --no-data-checksums are
>     specified?  IIUC with 0001, we'll use whichever is specified last.
> 
> 
> Hmmm, that is a good question. We have never (to my recollection) 
> flipped a default quite like this before. I'm inclined to leave it as 
> "last one wins", as I can see automated systems appending their desired 
> selection to the end of the arg list, and expecting it to work.

Yes, last option wins is the normal expected behavior.




Re: Enable data checksums by default

От
Nathan Bossart
Дата:
On Tue, Aug 27, 2024 at 05:16:51PM +0200, Peter Eisentraut wrote:
> On 27.08.24 15:44, Greg Sabino Mullane wrote:
>> On Mon, Aug 26, 2024 at 3:46 PM Nathan Bossart <nathandbossart@gmail.com
>> <mailto:nathandbossart@gmail.com>> wrote:
>> 
>>     Should we error if both --data-checksum and --no-data-checksums are
>>     specified?  IIUC with 0001, we'll use whichever is specified last.
>> 
>> 
>> Hmmm, that is a good question. We have never (to my recollection)
>> flipped a default quite like this before. I'm inclined to leave it as
>> "last one wins", as I can see automated systems appending their desired
>> selection to the end of the arg list, and expecting it to work.
> 
> Yes, last option wins is the normal expected behavior.

WFM

001_verify_heapam fails with this patch set.  I think you may need to use
--no-data-checksums in that test, too.  Otherwise, it looks pretty good to
me.

-- 
nathan



Re: Enable data checksums by default

От
Peter Eisentraut
Дата:
On 27.08.24 17:26, Nathan Bossart wrote:
> On Tue, Aug 27, 2024 at 05:16:51PM +0200, Peter Eisentraut wrote:
>> On 27.08.24 15:44, Greg Sabino Mullane wrote:
>>> On Mon, Aug 26, 2024 at 3:46 PM Nathan Bossart <nathandbossart@gmail.com
>>> <mailto:nathandbossart@gmail.com>> wrote:
>>>
>>>      Should we error if both --data-checksum and --no-data-checksums are
>>>      specified?  IIUC with 0001, we'll use whichever is specified last.
>>>
>>>
>>> Hmmm, that is a good question. We have never (to my recollection)
>>> flipped a default quite like this before. I'm inclined to leave it as
>>> "last one wins", as I can see automated systems appending their desired
>>> selection to the end of the arg list, and expecting it to work.
>>
>> Yes, last option wins is the normal expected behavior.
> 
> WFM
> 
> 001_verify_heapam fails with this patch set.  I think you may need to use
> --no-data-checksums in that test, too.  Otherwise, it looks pretty good to
> me.

I have committed 0001 (the new option) and 0004 (the docs tweak).  I 
think there is consensus for the rest, too, but I'll leave it for a few 
more days to think about.  I guess the test failure has to be addressed.




Re: Enable data checksums by default

От
Daniel Gustafsson
Дата:
> On 3 Oct 2024, at 23:13, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Tue, Oct 01, 2024 at 11:15:02AM -0400, Peter Eisentraut wrote:
>> I have committed 0001 (the new option) and 0004 (the docs tweak).  I think
>> there is consensus for the rest, too, but I'll leave it for a few more days
>> to think about.  I guess the test failure has to be addressed.
>
> Here is a rebased patch with the test fix (for cfbot).  I have made no
> other changes.

+        Data checksums are enabled by default. They can be
+        disabled by use of <option>--no-data-checksums</option>.

Nitpick, but I think this should be an xref like how we link to --no-locale in
the -E docs: <xref linkend="app-initdb-no-data-checksums"/> instead.

LGTM otherwise.

--
Daniel Gustafsson




Re: Enable data checksums by default

От
Peter Eisentraut
Дата:
On 03.10.24 23:13, Nathan Bossart wrote:
> On Tue, Oct 01, 2024 at 11:15:02AM -0400, Peter Eisentraut wrote:
>> I have committed 0001 (the new option) and 0004 (the docs tweak).  I think
>> there is consensus for the rest, too, but I'll leave it for a few more days
>> to think about.  I guess the test failure has to be addressed.
> 
> Here is a rebased patch with the test fix (for cfbot).  I have made no
> other changes.

I have committed the test changes (patch 0002).  (I renamed the option 
to no_data_checksums to keep the wording consistent with the initdb option.)

Right now, with checksums off by default, this doesn't do much, but you 
can test this like

     PG_TEST_INITDB_EXTRA_OPTS=--data-checksums meson test ...

and everything will pass.  To make that work, I had to adjust the order 
of how the initdb options are assembled in Cluster.pm a bit.

I will work on the patch that flips the default next.




Re: Enable data checksums by default

От
Peter Eisentraut
Дата:
On 14.10.24 11:28, Peter Eisentraut wrote:
> On 03.10.24 23:13, Nathan Bossart wrote:
>> On Tue, Oct 01, 2024 at 11:15:02AM -0400, Peter Eisentraut wrote:
>>> I have committed 0001 (the new option) and 0004 (the docs tweak).  I 
>>> think
>>> there is consensus for the rest, too, but I'll leave it for a few 
>>> more days
>>> to think about.  I guess the test failure has to be addressed.
>>
>> Here is a rebased patch with the test fix (for cfbot).  I have made no
>> other changes.
> 
> I have committed the test changes (patch 0002).  (I renamed the option 
> to no_data_checksums to keep the wording consistent with the initdb 
> option.)
> 
> Right now, with checksums off by default, this doesn't do much, but you 
> can test this like
> 
>      PG_TEST_INITDB_EXTRA_OPTS=--data-checksums meson test ...
> 
> and everything will pass.  To make that work, I had to adjust the order 
> of how the initdb options are assembled in Cluster.pm a bit.
> 
> I will work on the patch that flips the default next.

The patch that flips the default has been committed.

I also started a PG18 open items page and made a note that we follow up 
on the upgrade experience, as was discussed in this thread.

https://wiki.postgresql.org/wiki/PostgreSQL_18_Open_Items



Re: Enable data checksums by default

От
Peter Eisentraut
Дата:
On 16.10.24 08:54, Peter Eisentraut wrote:
> On 14.10.24 11:28, Peter Eisentraut wrote:
>> On 03.10.24 23:13, Nathan Bossart wrote:
>>> On Tue, Oct 01, 2024 at 11:15:02AM -0400, Peter Eisentraut wrote:
>>>> I have committed 0001 (the new option) and 0004 (the docs tweak).  I 
>>>> think
>>>> there is consensus for the rest, too, but I'll leave it for a few 
>>>> more days
>>>> to think about.  I guess the test failure has to be addressed.
>>>
>>> Here is a rebased patch with the test fix (for cfbot).  I have made no
>>> other changes.
>>
>> I have committed the test changes (patch 0002).  (I renamed the option 
>> to no_data_checksums to keep the wording consistent with the initdb 
>> option.)
>>
>> Right now, with checksums off by default, this doesn't do much, but 
>> you can test this like
>>
>>      PG_TEST_INITDB_EXTRA_OPTS=--data-checksums meson test ...
>>
>> and everything will pass.  To make that work, I had to adjust the 
>> order of how the initdb options are assembled in Cluster.pm a bit.
>>
>> I will work on the patch that flips the default next.
> 
> The patch that flips the default has been committed.
> 
> I also started a PG18 open items page and made a note that we follow up 
> on the upgrade experience, as was discussed in this thread.

Ah yes, and the upgrade tests on the buildfarm don't like this.  What 
shall we do about this?  Maybe adjust the buildfarm scripts to use 
--no-data-checksums?




Re: Enable data checksums by default

От
Daniel Gustafsson
Дата:
> On 16 Oct 2024, at 09:49, Peter Eisentraut <peter@eisentraut.org> wrote:

> Ah yes, and the upgrade tests on the buildfarm don't like this.  What shall we do about this?  Maybe adjust the
buildfarmscripts to use --no-data-checksums? 

I can't see many other options, and it represents a reasonable use-case, so +1.

--
Daniel Gustafsson




Re: Enable data checksums by default

От
Peter Eisentraut
Дата:
On 07.11.24 19:38, Alvaro Herrera wrote:
> I have just noticed that since this patch was committed as 04bec894a04c,
> pg_upgrade's "make check" action is unusable when given the
> "olddump/oldinstall" options.  We now need to inject '-k' to the initdb
> line for old servers, and we don't, so all upgrade tests fail.  I think
> this patch should be enough to fix it.

Yes, this fix looks correct.  (Or the other way around: Disable 
checksums on the new node.)

Is this not being covered by the build farm?  Are the upgrade tests 
there not using this?




Re: Enable data checksums by default

От
Alvaro Herrera
Дата:
On 2024-Nov-13, Peter Eisentraut wrote:

> On 07.11.24 19:38, Alvaro Herrera wrote:
> > I have just noticed that since this patch was committed as 04bec894a04c,
> > pg_upgrade's "make check" action is unusable when given the
> > "olddump/oldinstall" options.  We now need to inject '-k' to the initdb
> > line for old servers, and we don't, so all upgrade tests fail.  I think
> > this patch should be enough to fix it.
> 
> Yes, this fix looks correct.

Thanks, pushed.

> (Or the other way around: Disable checksums on the new node.)

Yeah, I thought about that too, but, I think it'd be less realistic,
because the world is become one where checksums are enabled, not the
other way around.

> Is this not being covered by the build farm?  Are the upgrade tests there
> not using this?

Nope, the buildfarm has separate code to test cross-version upgrades,
https://github.com/PGBuildFarm/client-code/blob/main/PGBuild/Modules/TestUpgradeXversion.pm
This predates our in-core test support.  Maybe buildfarm's could be
simplified, but I'm not sure if it's worth it.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"El sentido de las cosas no viene de las cosas, sino de
las inteligencias que las aplican a sus problemas diarios
en busca del progreso." (Ernesto Hernández-Novich)



Re: Enable data checksums by default

От
Tomas Vondra
Дата:
On 10/16/24 08:54, Peter Eisentraut wrote:
> On 14.10.24 11:28, Peter Eisentraut wrote:
>> On 03.10.24 23:13, Nathan Bossart wrote:
>>> On Tue, Oct 01, 2024 at 11:15:02AM -0400, Peter Eisentraut wrote:
>>>> I have committed 0001 (the new option) and 0004 (the docs tweak).  I
>>>> think
>>>> there is consensus for the rest, too, but I'll leave it for a few
>>>> more days
>>>> to think about.  I guess the test failure has to be addressed.
>>>
>>> Here is a rebased patch with the test fix (for cfbot).  I have made no
>>> other changes.
>>
>> I have committed the test changes (patch 0002).  (I renamed the option
>> to no_data_checksums to keep the wording consistent with the initdb
>> option.)
>>
>> Right now, with checksums off by default, this doesn't do much, but
>> you can test this like
>>
>>      PG_TEST_INITDB_EXTRA_OPTS=--data-checksums meson test ...
>>
>> and everything will pass.  To make that work, I had to adjust the
>> order of how the initdb options are assembled in Cluster.pm a bit.
>>
>> I will work on the patch that flips the default next.
> 
> The patch that flips the default has been committed.
> 
> I also started a PG18 open items page and made a note that we follow up
> on the upgrade experience, as was discussed in this thread.
> 
> https://wiki.postgresql.org/wiki/PostgreSQL_18_Open_Items
> 

Regarding the open item, can someone explain what exactly are we
planning to evaluate mid-beta?

We went through the open items on the RMT team meeting today, and my
impression was the questions are mostly about performance of having
checksums by default, but now I realize the thread talks about "upgrade
experience" which seems fairly wide.

So, what kind of data we expect to gather in order to evaluate this?
Who's expected to collect it and evaluate this?

regards

-- 
Tomas Vondra




Re: Enable data checksums by default

От
Christoph Berg
Дата:
Re: Tomas Vondra
> We went through the open items on the RMT team meeting today, and my
> impression was the questions are mostly about performance of having
> checksums by default, but now I realize the thread talks about "upgrade
> experience" which seems fairly wide.

Fwiw, Debian's pg_upgradecluster already knows about that change. With
the default "dump" upgrade method, clusters will be transitioned to
checksums enabled, while "upgrade" (= pg_upgrade) upgrades will stick
with whatever was there before.

Christoph



Re: Enable data checksums by default

От
Peter Eisentraut
Дата:
On 23.04.25 00:24, Tomas Vondra wrote:
>> The patch that flips the default has been committed.
>>
>> I also started a PG18 open items page and made a note that we follow up
>> on the upgrade experience, as was discussed in this thread.
>>
>> https://wiki.postgresql.org/wiki/PostgreSQL_18_Open_Items
>>
> Regarding the open item, can someone explain what exactly are we
> planning to evaluate mid-beta?

If you have a PG <=17 installation without checksums (the default), and 
you do the usual upgrade procedure to PG 18 involving initdb + 
pg_upgrade, then pg_upgrade will reject the upgrade, because the 
checksum settings don't match.  The workaround is to run initdb with 
--no-data-checksums and try again.

That's probably not all that bad, but if this is all below a bunch of 
layers of scripts, users will have to do some work on their end to get 
this working smoothly.

The point of the open item was (a) to make sure this is adequately 
documented, for instance in the release notes, (b) to think about 
technological solutions to simplify this, such as [0], and (c) to just 
check the general feedback.

Nothing from [0] ended up being committed, so that part of obsolete. 
The action for beta1 is (a).  And then for (c) perhaps monitor the 
feedback between beta1 and beta2.


[0]: 
https://www.postgresql.org/message-id/flat/57957aca-3eae-4106-afb2-3008122b9950%40eisentraut.org




Re: Enable data checksums by default

От
Heikki Linnakangas
Дата:
On 24/04/2025 14:26, Peter Eisentraut wrote:
> On 23.04.25 00:24, Tomas Vondra wrote:
>>> The patch that flips the default has been committed.
>>>
>>> I also started a PG18 open items page and made a note that we follow up
>>> on the upgrade experience, as was discussed in this thread.
>>>
>>> https://wiki.postgresql.org/wiki/PostgreSQL_18_Open_Items
>>>
>> Regarding the open item, can someone explain what exactly are we
>> planning to evaluate mid-beta?
> 
> If you have a PG <=17 installation without checksums (the default), and 
> you do the usual upgrade procedure to PG 18 involving initdb + 
> pg_upgrade, then pg_upgrade will reject the upgrade, because the 
> checksum settings don't match.  The workaround is to run initdb with -- 
> no-data-checksums and try again.
> 
> That's probably not all that bad, but if this is all below a bunch of 
> layers of scripts, users will have to do some work on their end to get 
> this working smoothly.
> 
> The point of the open item was (a) to make sure this is adequately 
> documented, for instance in the release notes, (b) to think about 
> technological solutions to simplify this, such as [0], and (c) to just 
> check the general feedback.
> 
> Nothing from [0] ended up being committed, so that part of obsolete. The 
> action for beta1 is (a).  And then for (c) perhaps monitor the feedback 
> between beta1 and beta2.
> 
> 
> [0]: https://www.postgresql.org/message-id/flat/57957aca-3eae-4106- 
> afb2-3008122b9950%40eisentraut.org

Ping: It's time to do something about this open item. (Or decide to do 
nothing I guess). We're already in beta, but at the same time, we're 
still early in the beta and now is the last chance for code changes 
before 18 is shipped.

Aside from just documenting it, I see two things we could do:

1. Have pg_upgrade run initdb for you. It's always felt silly that you 
need to run initdb with the new version yourself, when there's really 
only one correct way to do it. pg_upgrade has all the checks to verify 
that you did it right, so why doesn't it just do it itself? I think 
that'd be a good long-term solution. Might be too late for 18, but I'm 
not sure. If someone wrote the patch we could evaluate it. To use that 
mode, the scripts calling pg_upgrade would need to be changed, though, 
so we'd perhaps want to do #2 or something else in addition to this.

2. If the new cluster has checksums enabled, but the old one has them 
disabled, have pg_upgrade disable checksums in the new cluster.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: Enable data checksums by default

От
Daniel Gustafsson
Дата:
> On 23 May 2025, at 10:10, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

> Aside from just documenting it, I see two things we could do:
>
> 1. Have pg_upgrade run initdb for you. It's always felt silly that you need to run initdb with the new version
yourself,when there's really only one correct way to do it. pg_upgrade has all the checks to verify that you did it
right,so why doesn't it just do it itself? I think that'd be a good long-term solution. Might be too late for 18, but
I'mnot sure. If someone wrote the patch we could evaluate it. To use that mode, the scripts calling pg_upgrade would
needto be changed, though, so we'd perhaps want to do #2 or something else in addition to this. 

I can see this being desired longer term, but as you mention there is likely to
be many moving parts outside of our immediate control making it much harder
than just adding the call to initdb.  It doesn't seem like a post-beta patch to
me given the implications for packagers and others in the ecosystem.

> 2. If the new cluster has checksums enabled, but the old one has them disabled, have pg_upgrade disable checksums in
thenew cluster. 

IF we do this it should be Very visible, since a user otherwise might think
that their upgraded cluster will have checksums since they added them in
initdb.

I think we should document how to deal with checksums in upgrades, and perhaps
even tweak the errormessage in the pg_upgrade check with explanatory comments
if needed, and leave the functionality as is today.

--
Daniel Gustafsson




Re: Enable data checksums by default

От
Tomas Vondra
Дата:

On 5/23/25 11:22, Peter Eisentraut wrote:
> On 23.05.25 10:10, Heikki Linnakangas wrote:
>>> The point of the open item was (a) to make sure this is adequately
>>> documented, for instance in the release notes, (b) to think about
>>> technological solutions to simplify this, such as [0], and (c) to
>>> just check the general feedback.
>>>
>>> Nothing from [0] ended up being committed, so that part of obsolete.
>>> The action for beta1 is (a).  And then for (c) perhaps monitor the
>>> feedback between beta1 and beta2.
>>>
>>>
>>> [0]: https://www.postgresql.org/message-id/flat/57957aca-3eae-4106-
>>> afb2-3008122b9950%40eisentraut.org
>>
>> Ping: It's time to do something about this open item. (Or decide to do
>> nothing I guess). We're already in beta, but at the same time, we're
>> still early in the beta and now is the last chance for code changes
>> before 18 is shipped.
>>
>> Aside from just documenting it,
> 
> We don't currently have anything in the release notes that calls this
> out as a potential upgrading issue, so I propose the attached patch.
> 

Seems reasonable, although maybe it should say

  ... so if the old cluster does not have checksums enabled ...

instead of

  ... so if the old cluster was initialized without checksums ...

I mean, what matters is the current state, not how it was initialized.

>> I see two things we could do:
>>
>> 1. Have pg_upgrade run initdb for you. It's always felt silly that you
>> need to run initdb with the new version yourself, when there's really
>> only one correct way to do it. pg_upgrade has all the checks to verify
>> that you did it right, so why doesn't it just do it itself? I think
>> that'd be a good long-term solution. Might be too late for 18, but I'm
>> not sure. If someone wrote the patch we could evaluate it. To use that
>> mode, the scripts calling pg_upgrade would need to be changed, though,
>> so we'd perhaps want to do #2 or something else in addition to this.
>>
>> 2. If the new cluster has checksums enabled, but the old one has them
>> disabled, have pg_upgrade disable checksums in the new cluster.
> 
> These would alter the pg_upgrade workflow in significant ways, so I
> don't think this would be appropriate to change now.  So far I haven't
> heard any feedback about this, so I'm content with a documentation change.

How would #2 change pg_upgrade workflow? Isn't the whole point of that
change to keep the current workflow working?

Also, I'm not sure if "no feedback about this" is reliable. I have no
clue if people did any significant testing. Maybe people did a lot of
testing and the current state is fine. But it's more likely there was
little testing, in which case "no feedback" says nothing.

FWIW I would be +0.5 to just let pg_upgrade disable checksums.


regards

-- 
Tomas Vondra




Re: Enable data checksums by default

От
Tomas Vondra
Дата:
On 5/23/25 11:25, Daniel Gustafsson wrote:
>> On 23 May 2025, at 10:10, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> 
>> Aside from just documenting it, I see two things we could do:
>>
>> 1. Have pg_upgrade run initdb for you. It's always felt silly that you need to run initdb with the new version
yourself,when there's really only one correct way to do it. pg_upgrade has all the checks to verify that you did it
right,so why doesn't it just do it itself? I think that'd be a good long-term solution. Might be too late for 18, but
I'mnot sure. If someone wrote the patch we could evaluate it. To use that mode, the scripts calling pg_upgrade would
needto be changed, though, so we'd perhaps want to do #2 or something else in addition to this.
 
> 
> I can see this being desired longer term, but as you mention there is likely to
> be many moving parts outside of our immediate control making it much harder
> than just adding the call to initdb.  It doesn't seem like a post-beta patch to
> me given the implications for packagers and others in the ecosystem.
> 
>> 2. If the new cluster has checksums enabled, but the old one has them disabled, have pg_upgrade disable checksums in
thenew cluster.
 
> 
> IF we do this it should be Very visible, since a user otherwise might think
> that their upgraded cluster will have checksums since they added them in
> initdb.
> 


What counts as "very visible"? Would it be fine if the pg_upgrade docs
say this clearly, and pg_upgrade prints a warning? To me that seems
sufficient.

TBH I can't quite imagine people expecting checksums to just magically
appear after upgrade.

> I think we should document how to deal with checksums in upgrades, and perhaps
> even tweak the errormessage in the pg_upgrade check with explanatory comments
> if needed, and leave the functionality as is today.
> 

Isn't that just an unnecessary breakage of existing tooling? I mean,
there's pretty much just one thing the user can do to make it work, and
that's disabling checksums. Sure, they might also enable checksums on
the old cluster, but that makes the upgrade much longer, and presumably
they use pg_upgrade to upgrade quickly.

That being said, I don't feel very strongly about this, so if the
consensus is to just error-out, so be it.


regards

-- 
Tomas Vondra




Re: Enable data checksums by default

От
Daniel Gustafsson
Дата:
> On 23 May 2025, at 11:55, Tomas Vondra <tomas@vondra.me> wrote:
>
> On 5/23/25 11:25, Daniel Gustafsson wrote:
>>> On 23 May 2025, at 10:10, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>
>>> Aside from just documenting it, I see two things we could do:
>>>
>>> 1. Have pg_upgrade run initdb for you. It's always felt silly that you need to run initdb with the new version
yourself,when there's really only one correct way to do it. pg_upgrade has all the checks to verify that you did it
right,so why doesn't it just do it itself? I think that'd be a good long-term solution. Might be too late for 18, but
I'mnot sure. If someone wrote the patch we could evaluate it. To use that mode, the scripts calling pg_upgrade would
needto be changed, though, so we'd perhaps want to do #2 or something else in addition to this. 
>>
>> I can see this being desired longer term, but as you mention there is likely to
>> be many moving parts outside of our immediate control making it much harder
>> than just adding the call to initdb.  It doesn't seem like a post-beta patch to
>> me given the implications for packagers and others in the ecosystem.
>>
>>> 2. If the new cluster has checksums enabled, but the old one has them disabled, have pg_upgrade disable checksums
inthe new cluster. 
>>
>> IF we do this it should be Very visible, since a user otherwise might think
>> that their upgraded cluster will have checksums since they added them in
>> initdb.
>
> What counts as "very visible"? Would it be fine if the pg_upgrade docs
> say this clearly, and pg_upgrade prints a warning? To me that seems
> sufficient.

I was thinking about a warning during processing.

> TBH I can't quite imagine people expecting checksums to just magically
> appear after upgrade.

It would not be surprised if users expect checksums to be on after reading
(variations of) "checksums are now on by default" messaging.

>> I think we should document how to deal with checksums in upgrades, and perhaps
>> even tweak the errormessage in the pg_upgrade check with explanatory comments
>> if needed, and leave the functionality as is today.
>
> Isn't that just an unnecessary breakage of existing tooling? I mean,
> there's pretty much just one thing the user can do to make it work, and
> that's disabling checksums. Sure, they might also enable checksums on
> the old cluster, but that makes the upgrade much longer, and presumably
> they use pg_upgrade to upgrade quickly.

We already expect the new cluster to be created in the Right Way (which I agree
isn't very userfriendly and should be improved upon) so requiring this to be
Right is in line with existing tooling IMHO (for better or worse).  My concern
is that users will think data checksums are enabled after the upgrade, and will
be annoyed when finding out they're not.

--
Daniel Gustafsson




Re: Enable data checksums by default

От
Bruce Momjian
Дата:
On Fri, May 23, 2025 at 11:10:47AM +0300, Heikki Linnakangas wrote:
> On 24/04/2025 14:26, Peter Eisentraut wrote:
> > action for beta1 is (a).  And then for (c) perhaps monitor the feedback
> > between beta1 and beta2.
> > 
> 
> Ping: It's time to do something about this open item. (Or decide to do
> nothing I guess). We're already in beta, but at the same time, we're still
> early in the beta and now is the last chance for code changes before 18 is
> shipped.
> 
> Aside from just documenting it, I see two things we could do:
> 
> 1. Have pg_upgrade run initdb for you. It's always felt silly that you need
> to run initdb with the new version yourself, when there's really only one
> correct way to do it. pg_upgrade has all the checks to verify that you did
> it right, so why doesn't it just do it itself? I think that'd be a good
> long-term solution. Might be too late for 18, but I'm not sure. If someone
> wrote the patch we could evaluate it. To use that mode, the scripts calling
> pg_upgrade would need to be changed, though, so we'd perhaps want to do #2
> or something else in addition to this.

The pg_upgrade docs suggest why having pg_upgrade run initdb could be
limiting:

    5. Install extension shared object files

    Many extensions and custom modules, whether from contrib or another
    source, use shared object files (or DLLs), e.g., pgcrypto.so. If
    the old cluster used these, shared object files matching the new
    server binary must be installed in the new cluster, usually via
    operating system commands. Do not load the schema definitions,
    e.g., CREATE EXTENSION pgcrypto, because these will be duplicated
    from the old cluster. If extension updates are available,
    pg_upgrade will report this and create a script that can be run
    later to update them.

    6. Copy custom full-text search files

    Copy any custom full text search files (dictionary, synonym,
    thesaurus, stop words) from the old to the new cluster.

    7. Adjust authentication

    pg_upgrade will connect to the old and new servers several times,
    so you might want to set authentication to peer in pg_hba.conf
    or use a ~/.pgpass file (see Section 32.16).


I have also heard of cases where postgresql.conf must be modified for
pg_upgrade to succeed, and of course initdb installs postgresql.conf.

Should #5 be after #6?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



Re: Enable data checksums by default

От
Peter Eisentraut
Дата:
On 23.05.25 11:43, Tomas Vondra wrote:
>> We don't currently have anything in the release notes that calls this
>> out as a potential upgrading issue, so I propose the attached patch.
>>
> Seems reasonable, although maybe it should say
> 
>    ... so if the old cluster does not have checksums enabled ...
> 
> instead of
> 
>    ... so if the old cluster was initialized without checksums ...
> 
> I mean, what matters is the current state, not how it was initialized.

committed with that change



Re: Enable data checksums by default

От
Christoph Berg
Дата:
Re: Heikki Linnakangas
> 1. Have pg_upgrade run initdb for you. It's always felt silly that you need
> to run initdb with the new version yourself, when there's really only one
> correct way to do it.

Fwiw, Debian's pg_upgradecluster script would be happy if that problem
would be solved. Currently, we have to do all sorts of inspection of
the old cluster to figure out the locale used, data checksums, and we
are likely missing some more properties that could carried over (like
wal segment size etc). Some of the info isn't even available from the
control file but needs a connection to the running server.

The problem exists for both the pg_upgrade and dump-restore upgrade
methods, so an "initdb --like /path/to/other/cluster" mode would be
handy to have. Perhaps including a "initdb --like --print-command"
mode.

The pg_{backup,restore}cluster scripts have the same problem. It's
unnecessarily complex to recreate an existing setup.

Maybe it would be enough if the initdb options used to create a
cluster would be stored in some file in the data dir. (Perhaps in a
new line in PG_VERSION? As part of the control file?)

Christoph



Re: Enable data checksums by default

От
Peter Eisentraut
Дата:
On 05.06.25 12:37, Christoph Berg wrote:
> Maybe it would be enough if the initdb options used to create a
> cluster would be stored in some file in the data dir.

That probably wouldn't help if the default behavior changed, as in the 
current case.




Re: Enable data checksums by default

От
Bruce Momjian
Дата:
On Fri, Jun  6, 2025 at 01:40:40PM +0200, Peter Eisentraut wrote:
> On 05.06.25 12:37, Christoph Berg wrote:
> > Maybe it would be enough if the initdb options used to create a
> > cluster would be stored in some file in the data dir.
> 
> That probably wouldn't help if the default behavior changed, as in the
> current case.

Well, technically we would know the default from the old version, so we
could infer the desired behavior.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



Re: Enable data checksums by default

От
Tomas Vondra
Дата:
Hi!

So, what should we do with the PG18 open item? We (the RMT team) would
like to know if we shall keep the checksums enabled by default, and if
there's something that still needs to be done for PG18.

We don't have a strong opinion either way, but there doesn't seem to be
much happening on this thread. So if someone feels we should rethink and
disable the checksums by default for PG18, he/she needs to articulate
that proposal. The sooner the better.

The commit that flipped the default (04bec894a04c) is from Peter
Eisentraut, and while that doesn't make him solely responsible for the
change, we think it'd be good to know his opinion on this. But it seems
more like a change requiring a wider consensus, and I'm aware e.g.
Andres expressed some doubts about enabling checksums [2].

In any case, the default is "on" at the moment, and unless someone
advocates for changing it soon, that's what we'll get in PG19.



My personal opinion is keeping checksums enabled by default seems OK.
It's not perfect, but it does seem like a reasonable trade off to me.
I'm aware of three issues people mention as arguments against checksums:


1) performance

It does have performance impact, and it'll always have performance
impact. Even if we make the checksums infinitely fast, we'll still have
to do more writes etc. There's no way around that. And I think that's
fine/acceptable.

I'm sure there are cases of checksums causing unnecessary regressions.
Commit e6eed40e44419 [3] is a good example of such case. I don't know
how many such cases exist, but I don't think it's very many. We need to
find and fix those cases - we just never did, because checksums were off
by default.

In any case, the cost seems reasonable to me.


2) upgrade user experience

I think this was the other concern in this thread - that it'll be hard
for users to do upgrades (using pg_upgrade). But I believe this should
have been sorted out now, am I right?


3) checksums user experience

I'm aware dealing with checksums can be rough. That is, we don't have
great tooling to verify checksums on-line, and once you get a checksum
error, you're mostly on your own. Or rather, the solution is to fail
over to a standby or restore from a backup. Which is fine, and I don't
have a great idea/proposal what else to do.

If we require improving this before enabling checksums by default,
someone probably needs to describe what improvements are needed to make
that happen.



regards


[1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=04bec894a04cb0d32533f1522ab81b7016141ff1

[2]
https://www.postgresql.org/message-id/brdaw5wke274lubirrl4v2k4qdacylvgwwqztifn7m27pkth3s%40rh7wie47pfcp

[3]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=e6eed40e44419e3268d01fe0d2daec08a7df68f7

-- 
Tomas Vondra




Re: Enable data checksums by default

От
Laurenz Albe
Дата:
On Tue, 2025-07-29 at 20:24 +0200, Tomas Vondra wrote:
> So, what should we do with the PG18 open item? We (the RMT team) would
> like to know if we shall keep the checksums enabled by default, and if
> there's something that still needs to be done for PG18.

I don't have a strong opinion, but I lean towards having them on
by default.  Yes, that has a performance impact (e.g. WAL logging hints),
and people with reliable storage may be better off disabling checksums.
But I think it is better to set the default values so that they are
good for those people who don't have a lot of PostgreSQL knowledge and
are running less demanding installations on mediocre hardware.

Yours,
Laurenz Albe



Re: Enable data checksums by default

От
Daniel Gustafsson
Дата:
> On 30 Jul 2025, at 11:58, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> 
> On Tue, 2025-07-29 at 20:24 +0200, Tomas Vondra wrote:
>> So, what should we do with the PG18 open item? We (the RMT team) would
>> like to know if we shall keep the checksums enabled by default, and if
>> there's something that still needs to be done for PG18.
> 
> I don't have a strong opinion, but I lean towards having them on
> by default.

I agree with that, while there might be a lot of cases where disabling
checksums is the right move it's still a sane default.

--
Daniel Gustafsson




Re: Enable data checksums by default

От
Greg Burd
Дата:

> On Jul 30, 2025, at 8:09 AM, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> On 30 Jul 2025, at 11:58, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>>
>> On Tue, 2025-07-29 at 20:24 +0200, Tomas Vondra wrote:
>>> So, what should we do with the PG18 open item? We (the RMT team) would
>>> like to know if we shall keep the checksums enabled by default, and if
>>> there's something that still needs to be done for PG18.
>>
>> I don't have a strong opinion, but I lean towards having them on
>> by default.
>
> I agree with that, while there might be a lot of cases where disabling
> checksums is the right move it's still a sane default.
>
> --
> Daniel Gustafsson

I realize I’m late to the conversation, I’ve been lurking...

I agree that enabling checksums by default is the sane default.  Databases
should always make a best effort for data integrity, checksums are a
positive step in that direction.

I recall a conversation at the last PGConf.dev (2025) with a representative
from Intel and Jeff Davis (CC’ed) that had to do with checksums and a vast
performance difference between Intel and AMD the latter winning by a mile.
I forget the details, maybe Jeff remembers more than I do.  I’m not
suggesting that we disable Intel by default or trying to derail this
conversation (which appears to be reaching consensus), just raising
awareness.

best,

-greg


Re: Enable data checksums by default

От
Tomas Vondra
Дата:

On 7/31/25 15:39, Greg Burd wrote:
> 
> 
>> On Jul 30, 2025, at 8:09 AM, Daniel Gustafsson <daniel@yesql.se> wrote:
>>
>>> On 30 Jul 2025, at 11:58, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>>>
>>> On Tue, 2025-07-29 at 20:24 +0200, Tomas Vondra wrote:
>>>> So, what should we do with the PG18 open item? We (the RMT team) would
>>>> like to know if we shall keep the checksums enabled by default, and if
>>>> there's something that still needs to be done for PG18.
>>>
>>> I don't have a strong opinion, but I lean towards having them on
>>> by default.
>>
>> I agree with that, while there might be a lot of cases where disabling
>> checksums is the right move it's still a sane default.
>>
>> --
>> Daniel Gustafsson
> 
> I realize I’m late to the conversation, I’ve been lurking...
> 
> I agree that enabling checksums by default is the sane default.  Databases
> should always make a best effort for data integrity, checksums are a
> positive step in that direction.
> 
> I recall a conversation at the last PGConf.dev (2025) with a representative
> from Intel and Jeff Davis (CC’ed) that had to do with checksums and a vast
> performance difference between Intel and AMD the latter winning by a mile.
> I forget the details, maybe Jeff remembers more than I do.  I’m not
> suggesting that we disable Intel by default or trying to derail this
> conversation (which appears to be reaching consensus), just raising
> awareness.
> 

I don't know the Intel vs. AMD situation exactly, but e.g. [1] does not
suggest AMD wins by a mile. In fact, it suggests Intel does much better
in this particular benchmark (with AVX-512 improvements). Of course,
this is a fairly recent *kernel* improvement, maybe it wouldn't work for
our data checksums that well.

However, I don't think the cost of the checksum calculation itself is
the main concern. It's probably negligible compared to all the other
costs, triggered by checksums - having to WAL-log hint bits, doing more
expensive checks (that's what the btree regression was about), etc.

[1] https://www.phoronix.com/news/Linux-CRC32C-VPCLMULQDQ

cheers

-- 
Tomas Vondra




Re: Enable data checksums by default

От
Laurenz Albe
Дата:
On Thu, 2025-07-31 at 09:39 -0400, Greg Burd wrote:
> I agree that enabling checksums by default is the sane default.  Databases
> should always make a best effort for data integrity, checksums are a
> positive step in that direction.

Having checksums on does not improve data integrity...

Yours,
Laurenz Albe



Re: Enable data checksums by default

От
Laurenz Albe
Дата:
On Thu, 2025-07-31 at 09:39 -0400, Greg Burd wrote:
> I agree that enabling checksums by default is the sane default.  Databases
> should always make a best effort for data integrity, checksums are a
> positive step in that direction.

Having checksums on does not improve data integrity...

Yours,
Laurenz Albe



Re: Enable data checksums by default

От
Laurenz Albe
Дата:
On Thu, 2025-07-31 at 09:39 -0400, Greg Burd wrote:
> I agree that enabling checksums by default is the sane default.  Databases
> should always make a best effort for data integrity, checksums are a
> positive step in that direction.

Having checksums on does not improve data integrity...

Yours,
Laurenz Albe



Re: Enable data checksums by default

От
Greg Sabino Mullane
Дата:
On Thu, Jul 31, 2025 at 3:21 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
Having checksums on does not improve data integrity...

Maybe not directly, but in the same way that a smoke detector does not directly prevent fire damage. Checksums do allow you to detect problems and fix things (esp. bad hardware) before they get worse.

Cheers,
Greg

--
Enterprise Postgres Software Products & Tech Support

Re: Enable data checksums by default

От
Jeff Davis
Дата:
On Thu, 2025-07-31 at 17:21 +0200, Tomas Vondra wrote:
> On 7/31/25 15:39, Greg Burd wrote:
> > I recall a conversation at the last PGConf.dev (2025) with a
> > representative
> > from Intel and Jeff Davis (CC’ed) that had to do with checksums and
> > a vast
> > performance difference between Intel and AMD the latter winning by
> > a mile.
>
> I don't know the Intel vs. AMD situation exactly, but e.g. [1] does
> not
> suggest AMD wins by a mile. In fact, it suggests Intel does much
> better
> in this particular benchmark (with AVX-512 improvements). Of course,
> this is a fairly recent *kernel* improvement, maybe it wouldn't work
> for
> our data checksums that well.
>
> However, I don't think the cost of the checksum calculation itself is
> the main concern. It's probably negligible compared to all the other
> costs, triggered by checksums - having to WAL-log hint bits, doing
> more
> expensive checks (that's what the btree regression was about), etc.

The issue Greg and I discussed, explained to me earlier by Andres, was
a memory bandwidth issue.

IIRC (Andres please correct me): The new IO infrastructure enables us
to bypass a memory copy (from userspace to kernel space) when writing
out a page. Unfortunately, checksums require reading the data to
calculate the checksum, which effectively defeats that optimization.

Those memory copies mostly happen in the bgwriter, where the page isn't
generally in the cache, which means that memory bandwidth can become
the bottleneck. Intel seems to have poor per-core memory bandwidth
compared with AMD:

https://sites.utexas.edu/jdm4372/2023/04/25/the-evolution-of-single-core-bandwidth-in-multicore-processors/

so it's more likely to become the bottleneck on Intel.

That lead to an interesting discussion about calculating the checksum
on a page in the backend eagerly when it dirties a page, while it's
still in cache. As you point out, that's quite cheap.

Regards,
    Jeff Davis




Re: Enable data checksums by default

От
"Burd, Greg"
Дата:
> On Jul 31, 2025, at 3:21 PM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Thu, 2025-07-31 at 09:39 -0400, Greg Burd wrote:
>> I agree that enabling checksums by default is the sane default.  Databases
>> should always make a best effort for data integrity, checksums are a
>> positive step in that direction.
>
> Having checksums on does not improve data integrity...
>
> Yours,
> Laurenz Albe

Hello, thanks for your reply.  I agree they don't improve integrity, but they
do improve the ability to detect loss of integrity (corruption), which is a
good thing for databases to do by default.  Apologies, my phrasing could have
been better.

best.

-greg


Re: Enable data checksums by default

От
"Burd, Greg"
Дата:

> On Jul 31, 2025, at 6:10 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Thu, 2025-07-31 at 17:21 +0200, Tomas Vondra wrote:
>> On 7/31/25 15:39, Greg Burd wrote:
>>> I recall a conversation at the last PGConf.dev (2025) with a
>>> representative
>>> from Intel and Jeff Davis (CC’ed) that had to do with checksums and
>>> a vast
>>> performance difference between Intel and AMD the latter winning by
>>> a mile.
>>
>> I don't know the Intel vs. AMD situation exactly, but e.g. [1] does
>> not
>> suggest AMD wins by a mile. In fact, it suggests Intel does much
>> better
>> in this particular benchmark (with AVX-512 improvements). Of course,
>> this is a fairly recent *kernel* improvement, maybe it wouldn't work
>> for
>> our data checksums that well.
>>
>> However, I don't think the cost of the checksum calculation itself is
>> the main concern. It's probably negligible compared to all the other
>> costs, triggered by checksums - having to WAL-log hint bits, doing
>> more
>> expensive checks (that's what the btree regression was about), etc.
>
> The issue Greg and I discussed, explained to me earlier by Andres, was
> a memory bandwidth issue.
>
> IIRC (Andres please correct me): The new IO infrastructure enables us
> to bypass a memory copy (from userspace to kernel space) when writing
> out a page. Unfortunately, checksums require reading the data to
> calculate the checksum, which effectively defeats that optimization.
>
> Those memory copies mostly happen in the bgwriter, where the page isn't
> generally in the cache, which means that memory bandwidth can become
> the bottleneck. Intel seems to have poor per-core memory bandwidth
> compared with AMD:
>
> https://sites.utexas.edu/jdm4372/2023/04/25/the-evolution-of-single-core-bandwidth-in-multicore-processors/
>
> so it's more likely to become the bottleneck on Intel.
>
> That lead to an interesting discussion about calculating the checksum
> on a page in the backend eagerly when it dirties a page, while it's
> still in cache. As you point out, that's quite cheap.
>
> Regards,
> Jeff Davis

Thanks Jeff for filling in the gaps in my memory. :)

-greg




Re: Enable data checksums by default

От
Ants Aasma
Дата:
On Thu, 31 Jul 2025 at 18:21, Tomas Vondra <tomas@vondra.me> wrote:
> I don't know the Intel vs. AMD situation exactly, but e.g. [1] does not
> suggest AMD wins by a mile. In fact, it suggests Intel does much better
> in this particular benchmark (with AVX-512 improvements). Of course,
> this is a fairly recent *kernel* improvement, maybe it wouldn't work for
> our data checksums that well.

Page checksums are not CRC, but a custom FNV inspired algorithm that
rearranges the calculation into 32 parallel ones to extract more
instruction level parallelism. With recent improvements in execution
capability this is still instruction latency bound - Zen 5 could
execute it 3x faster if we widened it 4x. It is especially bound on
Intel, as they decided soon after we implemented this algorithm to
increase the latency of vpmulld to 10, compared to 3 on AMD. This
requires compiling for a target that supports the wide instructions,
so it could really use runtime CPU detection to switch between
different SIMD width implementations.

Even if we made the checksum algorithm itself faster, the main issue
is actually memory bandwidth. Intel server CPUs have about half the
bandwidth of AMD ones. A checksum has to pull in the whole page in a
few hundred cycles. Without checksums only a part of the page might be
accessed and the accesses are spread over a longer time, making them
easier to hide by out-of-order execution.

But all the above still ends up at being a few hundred nanoseconds per
buffer read. Basically this ends up only mattering measurably for
in-RAM but out of shared buffers workloads. And the easy workaround is
to increase shared buffers. As you said, the main issue is the other
overheads that checksums pull in.

--
Ants Aasma



Re: Enable data checksums by default

От
Jim Nasby
Дата:


On Fri, Aug 1, 2025 at 6:37 AM Ants Aasma <ants.aasma@cybertec.at> wrote:
Even if we made the checksum algorithm itself faster, the main issue
is actually memory bandwidth. Intel server CPUs have about half the
bandwidth of AMD ones. A checksum has to pull in the whole page in a
few hundred cycles. Without checksums only a part of the page might be
accessed and the accesses are spread over a longer time, making them
easier to hide by out-of-order execution.

But all the above still ends up at being a few hundred nanoseconds per
buffer read. Basically this ends up only mattering measurably for
in-RAM but out of shared buffers workloads. And the easy workaround is
to increase shared buffers. As you said, the main issue is the other
overheads that checksums pull in.

I want to point out that at some point in time there might well be demand for checksumming pages living in shared_buffers. Modern storage systems assume that the durable media is going to have errors and already have robust ways to detect that. But they also assume that ECC memory is bulletproof (it's not), and that's the biggest benefit to Postgres checksums: they protect data in the filesystem cache[1]. You obviously lose that if you size shared_buffers to consume most of available memory.

Obviously trying to address that is way beyond the scope of what's being discussed here. I'm honestly unsure of how relevant it is, but I wanted to make sure folks were aware of it.

1: I can't go into details, but I have seen a case where Postgres checksums led to an investigation that ultimately revealed a memory-related issue. In other words, data was actually getting corrupted while in the filesystem cache. Obviously data could (and likely was) also get corrupted in shared buffers, but the corruption in the FS cache was what prompted the investigation that ultimately found the hardware issue. Fortunately shared_buffers was small enough to make it more likely that corruption would happen outside of Postgres, so it could be detected.

Re: Enable data checksums by default

От
Tomas Vondra
Дата:
On 7/29/25 20:24, Tomas Vondra wrote:
> Hi!
> 
> So, what should we do with the PG18 open item? We (the RMT team) would
> like to know if we shall keep the checksums enabled by default, and if
> there's something that still needs to be done for PG18.
> 
> We don't have a strong opinion either way, but there doesn't seem to be
> much happening on this thread. So if someone feels we should rethink and
> disable the checksums by default for PG18, he/she needs to articulate
> that proposal. The sooner the better.
> 
> The commit that flipped the default (04bec894a04c) is from Peter
> Eisentraut, and while that doesn't make him solely responsible for the
> change, we think it'd be good to know his opinion on this. But it seems
> more like a change requiring a wider consensus, and I'm aware e.g.
> Andres expressed some doubts about enabling checksums [2].
> 
> In any case, the default is "on" at the moment, and unless someone
> advocates for changing it soon, that's what we'll get in PG19.
> 

I've moved the item to the "resolved before rc1" section.

Given the recent inactivity on this thread, and lack of proposals to
change the default for PG18 (to disable checksums by default), I think
we're going to keep the current default.

We may rethink, but there's not much time to consider such proposal.


regards

-- 
Tomas Vondra