Обсуждение: Use PGAlignedBlock instead of "char buf[BLCKSZ]" in more places
Hi,
The commit 44cac934 replaced "char buf[BLCKSZ]" with PGAlignedBlock to avoid issues on alignment-picky hardware. While it replaced most of the instances, there are still some more left. How about we use PGAlignedBlock there too, something like the attached patch? A note [2] in the commit 44cac934 says that ensuring proper alignment makes kernel data transfers fasters and the left-over "char buf[BLCKSZ]" either do read() or write() system calls, so it might be worth to align them with PGAlignedBlock.
Thoughts?
PS: FWIW, I verified what difference actually char buf[BLCKSZ] and the union PGAlignedBlock does make with alignment with a sample code like [3] which gives a different alignment requirement, see below:
size of data 8192, alignment of data 1
size of data_aligned 8192, alignment of data_aligned 8
[1]
commit 44cac9346479d4b0cc9195b0267fd13eb4e7442c
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat Sep 1 15:27:12 2018 -0400
Avoid using potentially-under-aligned page buffers.
[2]
I used these types even for variables where there's no risk of a
misaligned access, since ensuring proper alignment should make
kernel data transfers faster. I also changed some places where
we had been palloc'ing short-lived buffers, for coding style
uniformity and to save palloc/pfree overhead.
[3]
#include <stdio.h>
#define BLCKSZ 8192
typedef union PGAlignedBlock
{
char data[BLCKSZ];
double force_align_d;
long long int force_align_i64;
} PGAlignedBlock;
int main(int argc, char **argv)
{
char data[BLCKSZ];
PGAlignedBlock data_aligned;
printf("size of data %ld, alignment of data %ld\n", sizeof(data), _Alignof(data));
printf("size of data_aligned %ld, alignment of data_aligned %ld\n", sizeof(data_aligned), _Alignof(data_aligned));
return 0;
}
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
The commit 44cac934 replaced "char buf[BLCKSZ]" with PGAlignedBlock to avoid issues on alignment-picky hardware. While it replaced most of the instances, there are still some more left. How about we use PGAlignedBlock there too, something like the attached patch? A note [2] in the commit 44cac934 says that ensuring proper alignment makes kernel data transfers fasters and the left-over "char buf[BLCKSZ]" either do read() or write() system calls, so it might be worth to align them with PGAlignedBlock.
Thoughts?
PS: FWIW, I verified what difference actually char buf[BLCKSZ] and the union PGAlignedBlock does make with alignment with a sample code like [3] which gives a different alignment requirement, see below:
size of data 8192, alignment of data 1
size of data_aligned 8192, alignment of data_aligned 8
[1]
commit 44cac9346479d4b0cc9195b0267fd13eb4e7442c
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat Sep 1 15:27:12 2018 -0400
Avoid using potentially-under-aligned page buffers.
[2]
I used these types even for variables where there's no risk of a
misaligned access, since ensuring proper alignment should make
kernel data transfers faster. I also changed some places where
we had been palloc'ing short-lived buffers, for coding style
uniformity and to save palloc/pfree overhead.
[3]
#include <stdio.h>
#define BLCKSZ 8192
typedef union PGAlignedBlock
{
char data[BLCKSZ];
double force_align_d;
long long int force_align_i64;
} PGAlignedBlock;
int main(int argc, char **argv)
{
char data[BLCKSZ];
PGAlignedBlock data_aligned;
printf("size of data %ld, alignment of data %ld\n", sizeof(data), _Alignof(data));
printf("size of data_aligned %ld, alignment of data_aligned %ld\n", sizeof(data_aligned), _Alignof(data_aligned));
return 0;
}
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Вложения
On Mon, Dec 04, 2023 at 06:59:13AM +0530, Bharath Rupireddy wrote: > The commit 44cac934 replaced "char buf[BLCKSZ]" with PGAlignedBlock to > avoid issues on alignment-picky hardware. While it replaced most of the > instances, there are still some more left. How about we use PGAlignedBlock > there too, something like the attached patch? A note [2] in the commit > 44cac934 says that ensuring proper alignment makes kernel data transfers > fasters and the left-over "char buf[BLCKSZ]" either do read() or write() > system calls, so it might be worth to align them with PGAlignedBlock. > > Thoughts? The buffers used to write the lock file and the TLI history file are not page buffers, and this could make code readers think that these are pages. So I am honestly not sure if there's a point in changing them because the current code is not incorrect, isn't it? It looks like 2042b3428d39 for the TLI history file and 52948169bcdd for the lock file began using BLCKSZ because that was just a handy thing to do, and because we know they would never get beyond that. -- Michael
Вложения
On 04.12.23 06:46, Michael Paquier wrote: > On Mon, Dec 04, 2023 at 06:59:13AM +0530, Bharath Rupireddy wrote: >> The commit 44cac934 replaced "char buf[BLCKSZ]" with PGAlignedBlock to >> avoid issues on alignment-picky hardware. While it replaced most of the >> instances, there are still some more left. How about we use PGAlignedBlock >> there too, something like the attached patch? A note [2] in the commit >> 44cac934 says that ensuring proper alignment makes kernel data transfers >> fasters and the left-over "char buf[BLCKSZ]" either do read() or write() >> system calls, so it might be worth to align them with PGAlignedBlock. >> >> Thoughts? > > The buffers used to write the lock file and the TLI history file are > not page buffers, and this could make code readers think that these > are pages. The type is called "aligned block", not "aligned buffer" or "aligned page", so I don't think it's incorrect to try to use it. So I am honestly not sure if there's a point in changing > them because the current code is not incorrect, isn't it? It looks > like 2042b3428d39 for the TLI history file and 52948169bcdd for the > lock file began using BLCKSZ because that was just a handy thing to > do, and because we know they would never get beyond that. Yeah, it's not clear why these need to be block-sized. We shouldn't perpetuate this without more clarity about this.
Hi, On 2023-12-04 15:53:44 +0100, Peter Eisentraut wrote: > On 04.12.23 06:46, Michael Paquier wrote: > > On Mon, Dec 04, 2023 at 06:59:13AM +0530, Bharath Rupireddy wrote: > > > The commit 44cac934 replaced "char buf[BLCKSZ]" with PGAlignedBlock to > > > avoid issues on alignment-picky hardware. While it replaced most of the > > > instances, there are still some more left. How about we use PGAlignedBlock > > > there too, something like the attached patch? A note [2] in the commit > > > 44cac934 says that ensuring proper alignment makes kernel data transfers > > > fasters and the left-over "char buf[BLCKSZ]" either do read() or write() > > > system calls, so it might be worth to align them with PGAlignedBlock. > > > > > > Thoughts? > > > > The buffers used to write the lock file and the TLI history file are > > not page buffers, and this could make code readers think that these > > are pages. > > The type is called "aligned block", not "aligned buffer" or "aligned page", > so I don't think it's incorrect to try to use it. Block is a type defined in bufmgr.h... > So I am honestly not sure if there's a point in changing > > them because the current code is not incorrect, isn't it? It looks > > like 2042b3428d39 for the TLI history file and 52948169bcdd for the > > lock file began using BLCKSZ because that was just a handy thing to > > do, and because we know they would never get beyond that. > > Yeah, it's not clear why these need to be block-sized. We shouldn't > perpetuate this without more clarity about this. If we change something, we should consider making buffers like these aligned to page sizes, rather than just MAXALIGNED. Greetings, Andres Freund
On Mon, Dec 04, 2023 at 09:47:24AM -0800, Andres Freund wrote: > If we change something, we should consider making buffers like these aligned > to page sizes, rather than just MAXALIGNED. You mean 4k kernel pages, right? That makes sense to me. -- Michael