Re: [HACKERS] Support for pg_receivexlog --format=plain|tar
От | Michael Paquier |
---|---|
Тема | Re: [HACKERS] Support for pg_receivexlog --format=plain|tar |
Дата | |
Msg-id | CAB7nPqThgajSwPkQVfd3Bgu8s21=cnwBWBXBjyZLupq-EBj-=w@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] Support for pg_receivexlog --format=plain|tar (Magnus Hagander <magnus@hagander.net>) |
Ответы |
Re: [HACKERS] Support for pg_receivexlog --format=plain|tar
|
Список | pgsql-hackers |
On Sat, Jan 7, 2017 at 8:19 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Sat, Jan 7, 2017 at 12:31 AM, Michael Paquier <michael.paquier@gmail.com> > wrote: >> There is something I forgot. With this patch, >> FindStreamingStart()@pg_receivexlog.c is actually broken. In short it >> forgets to consider files that have been compressed at the last run of >> pg_receivexlog and will try to stream changes from the beginning. I >> can see that gzip -l provides this information... But I have yet to >> find something in zlib that allows a cheap lookup as startup of >> streaming should be fast. Looking at how gzip -l does it may be faster >> than looking at the docs. > > Do we really care though? As in, does startup of streaming have to be *that* > fast? Even gunziping 16Mb (worst case) doesn't exactly take a long time. If > your pg_receivexlog is restarting so often that it becomes a problem, I > think you already have another and much bigger problem on your hands. Based on some analysis, it is enough to look at the last 4 bytes of the compressed file to get the size output data with a single call to lseek() and then read(). So as there is a simple way to do things and that's far cheaper than decompressing perhaps hundred of segments I'd rather do it this way. Attached is the implementation. This code is using 2 booleans for 4 states of the file names: full non-compressed, partial non-compressed, full compressed and partial compressed. This keeps the last check of FindStreamingStart() more simple, but that's quite fancy lately to have an enum for such things :D > I found another problem with it -- it is completely broken in sync mode. You > need to either forbid sync mode together with compression, or teach > dir_sync() about it. The later would sound better, but I wonder how much > that's going to kill compression due to the small blocks? Is it a reasonable > use-case? Hm. Looking at the docs I think that --compress defined with --synchronous maps to the use of Z_SYNC_FLUSH with gzflush(). FWIW I don't have a direct use case for it, but it is not a major effort to add it, so done. There is no actual reason to forbid this combinations of options either. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
В списке pgsql-hackers по дате отправления: