Discussion: Memory buffer alignment
Hi,

When analyzing the kernel profile from osdl dbt benchmarks, I noticed that around 50% of the kernel time is spent in __copy_user_intel.

http://khack.osdl.org/stp/280060/profile/

This function is one of two functions that does the actual memory copy from/to kernel space to/from user space. Unfortunately it's the slower one: Intel cpus have a microcode fastpath for memcopies that are 8-byte aligned. This fastpath is around 50% faster than the manual copy that is used for "misaligned" (i.e. only 4-byte aligned) pointers.

I don't know enough about other cpus, but I'd expect that most cpus prefer well-aligned buffers.

How are the user space buffers allocated? So far I found buffile.c, but "struct BufFile.buffer" is at offset 32, i.e. aligned, although by chance. What is the alignment of the output of palloc? Is buffile.c the main code that reads/writes data to disk?

--
    Manfred
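The alignment question above can be checked mechanically. The following is only a minimal sketch: `struct demo_file`, `is_aligned` and `demo_buffer_is_8_aligned` are hypothetical names, and the struct is a stand-in that does not reproduce the real layout of `struct BufFile`. It shows that a buffer at offset 32 is 8-byte aligned only as long as the enclosing struct itself starts on an 8-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a struct with a header followed by an I/O buffer.
 * This is NOT the real layout of PostgreSQL's struct BufFile. */
struct demo_file {
    _Alignas(8) char header[32]; /* 32 bytes of bookkeeping, forces 8-byte struct alignment */
    char buffer[8192];           /* data buffer that is copied to/from the kernel */
};

/* Return 1 if p is a multiple of alignment, which must be a power of two. */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t) p & (alignment - 1)) == 0;
}

/* Return 1 if the buffer member of *f starts on an 8-byte boundary. */
static int demo_buffer_is_8_aligned(const struct demo_file *f)
{
    return is_aligned(f->buffer, 8);
}
```

If a field were later added in front of the buffer so that its offset stopped being a multiple of 8, the same check would start failing, which is the sense in which such alignment holds only by chance.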
Manfred Spraul <manfred@colorfullife.com> writes:
> Unfortunately it's the slower one: Intel cpus have a microcode fastpath
> for memcopies that are 8-byte aligned. This fastpath is around 50%
> faster than the manual copy that is used for "misaligned" (i.e. only
> 4-byte aligned) pointers.

Maybe it'd be worth setting MAXIMUM_ALIGNOF to 8 on such CPUs? Or at least hacking ShmemAlloc and friends to use 8-byte alignment. I assume the major issue here is that the shared buffers don't get 8-byte-aligned within the shared memory segment.

Are there any machines where it'd be worth forcing an even larger alignment for the buffers?

regards, tom lane
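To make the idea of allocating at 8-byte alignment inside a shared segment concrete, here is a minimal sketch of the usual round-up technique. `align_up` and `demo_segment_alloc` are hypothetical helpers written for illustration; they are not ShmemAlloc or any other PostgreSQL function, and the allocator tracks only offsets rather than real memory.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of alignment (must be a power of two). */
static size_t align_up(size_t n, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (n + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical bump allocator over a segment, tracking offsets only.
 * *next_free is the first unused offset. The request is placed at the next
 * aligned offset, *next_free is advanced past it, and the offset is returned. */
static size_t demo_segment_alloc(size_t *next_free, size_t size, size_t alignment)
{
    size_t start = align_up(*next_free, alignment);

    *next_free = start + size;
    return start;
}
```

With an alignment of 4, a request made when the first free offset is 108 would start at 108; with an alignment of 8 it starts at 112, at the cost of a few bytes of padding per allocation.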
I found this very interesting, and realize we have shared buffers aligned at 8 bytes in CVS. However, I know if I allocate an 8k block, it will usually be aligned on an 8k boundary, right? I know the i386 uses 4k memory pages, and it certainly seems like it would be a good idea to have the 8k buffers aligned on 4k offsets.

Can someone run some tests to find out if there is any value to doing 4k offsets for shared buffer pages? I am also interested to see if any speed improvement can be seen with MAXIMUM_ALIGNOF set to 8.

---------------------------------------------------------------------------

Tom Lane wrote:
> Manfred Spraul <manfred@colorfullife.com> writes:
> > Unfortunately it's the slower one: Intel cpus have a microcode fastpath
> > for memcopies that are 8-byte aligned. This fastpath is around 50%
> > faster than the manual copy that is used for "misaligned" (i.e. only
> > 4-byte aligned) pointers.
>
> Maybe it'd be worth setting MAXIMUM_ALIGNOF to 8 on such CPUs? Or at
> least hacking ShmemAlloc and friends to use 8-byte alignment. I assume
> the major issue here is that the shared buffers don't get 8-byte-aligned
> within the shared memory segment.
>
> Are there any machines where it'd be worth forcing an even larger
> alignment for the buffers?
>
> regards, tom lane

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
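On the question of whether an 8k block lands on an 8k boundary: the C standard only guarantees that `malloc` returns memory suitably aligned for any fundamental type, so larger alignment has to be requested explicitly. The sketch below uses C11 `aligned_alloc` to obtain an 8k block on a 4k boundary. The constants and the `demo_` function names are assumptions for illustration, and this is ordinary process memory, not a shared memory segment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_PAGE_SIZE  4096u   /* assumed i386 page size, as in the message */
#define DEMO_BLOCK_SIZE 8192u   /* assumed buffer size, as in the message */

/* C11 aligned_alloc expects the size to be a multiple of the alignment. */
static_assert(DEMO_BLOCK_SIZE % DEMO_PAGE_SIZE == 0,
              "block size must be a multiple of the page size");

/* Allocate one block that starts on a page boundary.
 * Returns NULL on failure; the caller releases it with free(). */
static void *demo_alloc_page_aligned_block(void)
{
    return aligned_alloc(DEMO_PAGE_SIZE, DEMO_BLOCK_SIZE);
}

/* Offset of p within its page; 0 means p is page aligned. */
static size_t demo_offset_in_page(const void *p)
{
    return (size_t) ((uintptr_t) p % DEMO_PAGE_SIZE);
}
```

Calling `demo_offset_in_page` on a pointer returned by plain `malloc(8192)` is a quick way to see what a particular allocator does in practice, since that result is implementation specific.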
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> it certainly seems like it would be a good idea to have the 8k buffers
> aligned on 4k offsets.

Why? What mechanism do you expect would find that more efficient?

regards, tom lane
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > it certainly seems like it would be a good idea to have the 8k buffers
> > aligned on 4k offsets.
>
> Why? What mechanism do you expect would find that more efficient?

There was the idea that some OS's can swap the pages in from kernel into the user space. I am not sure anyone does that, but it would be interesting to see. Also, a single shared buffer access would be a single virtual memory lookup, rather than two lookups. Not sure, but it would be interesting to see.

--
  Bruce Momjian
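The virtual memory argument can be put into numbers by counting how many pages a buffer touches. This is a minimal sketch with a hypothetical helper, `pages_spanned`, that works on plain integer addresses. With the sizes from this thread (8k buffers, 4k pages), a page-aligned buffer covers exactly two pages, while one that starts a few bytes past a page boundary covers three.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of pages touched by the byte range [addr, addr + len).
 * len and page_size must be non-zero. */
static size_t pages_spanned(uintptr_t addr, size_t len, size_t page_size)
{
    assert(len > 0 && page_size > 0);

    uintptr_t first = addr / page_size;              /* page holding the first byte */
    uintptr_t last = (addr + len - 1) / page_size;   /* page holding the last byte */

    return (size_t) (last - first + 1);
}
```

Whether the extra page translates into a measurable cost depends on the hardware and workload, which is what the requested tests would have to show.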