Re: Removing unneeded self joins
From | Andres Freund |
---|---|
Subject | Re: Removing unneeded self joins |
Date | |
Msg-id | 20180517021934.m3kgn2lg3disruoi@alap3.anarazel.de |
In response to | Re: Removing unneeded self joins (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Removing unneeded self joins |
List | pgsql-hackers |
On 2018-05-16 22:11:22 -0400, Tom Lane wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > On 17 May 2018 at 11:00, Andres Freund <andres@anarazel.de> wrote:
> >> Wonder if we shouldn't just cache an estimated relation size in the
> >> relcache entry till then. For planning purposes we don't need to be
> >> accurate, and usually activity that drastically expands relation size
> >> will trigger relcache activity before long. Currently there's plenty
> >> workloads where the lseeks(SEEK_END) show up pretty prominently.
> >
> > While I'm in favour of speeding that up, I think we'd get complaints
> > if we used a stale value.
>
> Yeah, that scares me too. We'd then be in a situation where (arguably)
> any relation extension should force a relcache inval. Not good.
> I do not buy Andres' argument that the value is noncritical, either ---
> particularly during initial population of a table, where the size could
> go from zero to something-significant before autoanalyze gets around
> to noticing.

I don't think every extension needs to force a relcache inval. It'd
instead be perfectly reasonable to define a rule that an inval is
triggered whenever crossing a 10% relation size boundary. Which'll lead
to invalidations for the first few pages, but much less frequently
later.

> I'm a bit skeptical of the idea of maintaining an accurate relation
> size in shared memory, too. AIUI, a lot of the problem we see with
> lseek(SEEK_END) has to do with contention inside the kernel for access
> to the single-point-of-truth where the file's size is kept. Keeping
> our own copy would eliminate kernel-call overhead, which can't hurt,
> but it won't improve the contention angle.

A syscall is several hundred instructions. An unlocked read - which'll
be sufficient in many cases, given that the value can quickly be out of
date anyway - is a few cycles. Even with a barrier you're talking a few
dozen cycles. So I can't see how it'd not improve the contention.

But the main reason for keeping it in shmem is less the lseek avoidance
- although that's nice, context switches aren't great - but to make
relation extension need far less locking.

Greetings,

Andres Freund
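
Editorial illustration: a minimal sketch of the "10% relation size boundary" rule proposed in the message above. It is not PostgreSQL code. The function names `relsize_crossed_boundary` and `count_invalidations` are hypothetical, and the choice to compare against the size last announced to other backends (rather than some other reference point) is an assumption, since the message does not specify one.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not PostgreSQL code): returns true when the relation
 * size has moved at least 10% away from the size that was last announced
 * through an invalidation.
 */
static bool
relsize_crossed_boundary(uint32_t cached_blocks, uint32_t new_blocks)
{
    uint64_t diff;

    /* An empty relation has no meaningful percentage; any growth counts. */
    if (cached_blocks == 0)
        return new_blocks != 0;

    diff = (cached_blocks > new_blocks)
        ? (uint64_t) (cached_blocks - new_blocks)
        : (uint64_t) (new_blocks - cached_blocks);

    /* diff / cached_blocks >= 10%, written without floating point. */
    return diff * 10 >= cached_blocks;
}

/*
 * Simulate extending a relation one block at a time up to final_blocks and
 * count how many invalidations the rule above would issue.
 */
static uint32_t
count_invalidations(uint32_t final_blocks)
{
    uint32_t cached = 0;    /* size other backends currently believe */
    uint32_t invals = 0;

    for (uint32_t i = 0; i < final_blocks; i++)
    {
        uint32_t cur = i + 1;   /* relation size after this extension */

        if (relsize_crossed_boundary(cached, cur))
        {
            cached = cur;       /* an inval would publish the new size */
            invals++;
        }
    }
    return invals;
}
```

Under these assumptions, roughly the first ten single-block extensions each trigger an invalidation, and after that the gap between invalidations grows in proportion to the relation size, which matches the "first few pages, but much less frequently later" behaviour described in the message.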