Re: fallocate / posix_fallocate for new WAL file creation (etc...)
From | Merlin Moncure |
---|---|
Subject | Re: fallocate / posix_fallocate for new WAL file creation (etc...) |
Date | |
Msg-id | CAHyXU0zwic2=qA9GFv6S4upUGXVWvRLwZNRqJ_v6U+ydBBc_Tg@mail.gmail.com |
In response to | Re: fallocate / posix_fallocate for new WAL file creation (etc...) (Andres Freund <andres@2ndquadrant.com>) |
List | pgsql-hackers |
On Fri, May 17, 2013 at 4:18 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-05-17 15:48:38 -0500, Merlin Moncure wrote:
>> On Fri, May 17, 2013 at 8:29 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> > On Fri, May 17, 2013 at 4:47 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> >> On 2013-05-15 16:46:33 -0500, Jon Nelson wrote:
>> >>> > * Is wal file creation performance actually relevant? Is the performance
>> >>> > of a system running on fallocate()d wal files any different?
>> >>>
>> >>> In my limited testing, I noticed a drop of approx. 100ms per WAL file.
>> >>> I do not have a good idea for how to really stress the WAL-file
>> >>> creation area without calling pg_start_backup and pg_stop_backup over
>> >>> and over (with archiving enabled).
>> >>
>> >> My point is that wal file creation usually isn't all that performance
>> >> sensitive. Once the cluster has enough WAL files it will usually recycle
>> >> them and thus never allocate new ones. So for this to be really
>> >> beneficial it would be interesting to show different performance during
>> >> normal running. You could also check out of how many extents a wal file
>> >> is made out of with fallocate in comparison to the old style method
>> >> (filefrag will give you that for most filesystems).
>> >
>> > But why does it have to be *really* beneficial? We're already making
>> > optional posix_fxxx calls and fallocate seems to do exactly what we
>> > would want in this context. Even if the 100ms drop doesn't show up
>> > all that often, I'd still take it just for the defragmentation
>> > benefits and the patch is fairly tiny.
>
> Well, it needs to be tested et al. And its a fairly critical code
> path. I seem to remember that there were older glibc versions that
> didn't do such a great job at emulating fallocate for example.
>
>> Here is sample output of filefrag on a somewhat busy database from our
>> testing environment that exactly duplicates our production workloads..
>> It does a lot of batch processing at night and a mix of 80%oltp 20%
>> olap during the day. This is on ext3. Interestingly, on ext4 servers
>> I never saw more than 2 extents per file (but those servers are mostly
>> not as busy).
>
> Ok, that's pretty bad. 490 extents in one file? Really? I'd consider
> shutting down the cluster, copying the wal files in a moment where there
> is enough free space. Just don't forget to sync afterwards.
> EXT4 is notably better at allocating space in growing files than ext3
> due to delayed allocation (and other things), so it wouldn't surprise me
> similar differences in fragmentation even if the load were comparable.
>
> Ext3 doesn't have fallocate btw, so it wouldn't benefit from such a
> patch anyway.

yeah -- I see your point. The object lesson isn't so much 'improve
postgres' as it is to 'use a modern filesystem'.

merlin
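For readers following the thread, here is a rough sketch of the two file-creation strategies being compared: the old-style method of writing zero blocks, and preallocation via posix_fallocate. It is not the patch under discussion and not PostgreSQL's code. The function names, the 8 kB block size, and the fallback behaviour are assumptions made for illustration; 16 MB is PostgreSQL's default WAL segment size.

```python
import os

SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size (16 MB)
BLOCK_SIZE = 8192                # assumed write granularity for the sketch


def zero_fill(fd, size):
    """Old-style method: extend the file by writing zero blocks."""
    zeros = bytes(BLOCK_SIZE)
    written = 0
    while written < size:
        chunk = zeros[: min(BLOCK_SIZE, size - written)]
        written += os.write(fd, chunk)  # os.write may write fewer bytes


def create_segment(path, size=SEGMENT_SIZE, use_fallocate=True):
    """Create a new file of `size` bytes (hypothetical helper).

    With use_fallocate=True the space is reserved in one call where the
    platform offers posix_fallocate; otherwise, or if that call fails,
    the file is zero-filled block by block.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        if use_fallocate and hasattr(os, "posix_fallocate"):
            try:
                os.posix_fallocate(fd, 0, size)
            except OSError:
                # Filesystem or libc could not do it; fall back.
                zero_fill(fd, size)
        else:
            zero_fill(fd, size)
        os.fsync(fd)  # make sure the allocation reaches disk
    finally:
        os.close(fd)
```

Creating one file with each strategy on the same filesystem and running filefrag on both, as Andres suggests, would show whether the extent counts actually differ. On filesystems without native fallocate support (ext3, per the message above), glibc may emulate the call, so no difference should be expected there.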