Re: [HACKERS] MIT benchmarks pgsql multicore (up to 48)performance
From | Ivan Voras |
---|---|
Subject | Re: [HACKERS] MIT benchmarks pgsql multicore (up to 48)performance |
Date | |
Msg-id | i8kfft$e5j$1@dough.gmane.org |
In response to | Re: [HACKERS] MIT benchmarks pgsql multicore (up to 48)performance (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-performance |
On 10/07/10 02:39, Robert Haas wrote:
> On Wed, Oct 6, 2010 at 6:31 PM, Ivan Voras <ivoras@freebsd.org> wrote:
>> On 10/04/10 20:49, Josh Berkus wrote:
>>
>>>> The other major bottleneck they ran into was a kernel one: reading from
>>>> the heap file requires a couple lseek operations, and Linux acquires a
>>>> mutex on the inode to do that. The proper place to fix this is
>>>> certainly in the kernel but it may be possible to work around in
>>>> Postgres.
>>>
>>> Or we could complain to Kernel.org. They've been fairly responsive in
>>> the past. Too bad this didn't get posted earlier; I just got back from
>>> LinuxCon.
>>>
>>> So you know someone who can speak technically to this issue? I can put
>>> them in touch with the Linux geeks in charge of that part of the kernel
>>> code.
>>
>> Hmmm... lseek? As in "lseek() then read() or write()" idiom? It AFAIK
>> cannot be fixed since you're modifying the global "stream position"
>> variable and something has got to lock that.
>
> Well, there are lock free algorithms using CAS, no?

Nothing is really "lock free" - in this case the algorithms simply push
the locking down to atomic operations on the CPU (and the memory bus).
Semantically, *something* has to lock the memory region for however
brief a period of time and then propagate that update to other CPUs'
caches (i.e. invalidate them).

>> OTOH, pread() / pwrite() don't have to do that.
>
> Hey, I didn't know about those. That sounds like it might be worth
> investigating, though I confess I lack a 48-core machine on which to
> measure the alleged benefit.

As Jon said, it will in any case reduce the number of these syscalls by
half, and they can be wrapped by a C macro for the platforms which don't
implement them.

http://man.freebsd.org/pread

(and just in case it's needed: pread() is a special case of preadv()).
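To make the two idioms from the message concrete, here is a minimal sketch in C. It is only an illustration, not PostgreSQL code: the names read_block_lseek, read_block_pread, block_read and compare_idioms are made up for this example, and HAVE_PREAD is an assumed configure-style macro standing in for whatever platform check a real build would use.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700       /* ask for the POSIX.1-2008 declarations */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Idiom 1: two syscalls per block. lseek() changes the file offset
 * attached to the open file, which is the state that has to be protected. */
static ssize_t read_block_lseek(int fd, void *buf, size_t len, off_t offset)
{
    assert(buf != NULL);
    if (lseek(fd, offset, SEEK_SET) == (off_t) -1)
        return -1;
    return read(fd, buf, len);
}

/* Idiom 2: one syscall per block. The offset is an argument, and the
 * file offset of the descriptor is left unchanged. */
static ssize_t read_block_pread(int fd, void *buf, size_t len, off_t offset)
{
    assert(buf != NULL);
    return pread(fd, buf, len, offset);
}

/* Hypothetical wrapper macro: fall back to lseek() + read() on platforms
 * where HAVE_PREAD (an assumed name) is not defined. */
#ifdef HAVE_PREAD
#define block_read(fd, buf, len, off) read_block_pread((fd), (buf), (len), (off))
#else
#define block_read(fd, buf, len, off) read_block_lseek((fd), (buf), (len), (off))
#endif

/* Self-check on a temporary file: returns 0 if both idioms read the same
 * bytes and pread() did not move the descriptor's file offset. */
static int compare_idioms(void)
{
    char path[] = "/tmp/pread_demoXXXXXX";
    char a[4], b[4];
    int ok;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);               /* file goes away once fd is closed */

    ok = write(fd, "0123456789", 10) == 10
        && read_block_lseek(fd, a, 4, 3) == 4   /* offset is now 7 */
        && read_block_pread(fd, b, 4, 3) == 4   /* offset is still 7 */
        && memcmp(a, b, 4) == 0
        && memcmp(a, "3456", 4) == 0
        && lseek(fd, 0, SEEK_CUR) == 7;

    close(fd);
    return ok ? 0 : -1;
}
```

Calling compare_idioms() from a main() should return 0 on a POSIX system. The sketch says nothing about how much the single-syscall variant helps at 48 cores; that would still have to be measured.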