Re: Flushing large data immediately in pqcomm

Поиск
Список
Период
Сортировка
От Melih Mutlu
Тема Re: Flushing large data immediately in pqcomm
Дата
Msg-id CAGPVpCTfzhiOCWPwpRpvV6EZU0egJix4jNObp_OkhfZESdPbFQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Flushing large data immediately in pqcomm  (Heikki Linnakangas <hlinnaka@iki.fi>)
Список pgsql-hackers
Hi Heikki,

Heikki Linnakangas <hlinnaka@iki.fi>, 29 Oca 2024 Pzt, 19:12 tarihinde şunu yazdı:
> Proposed change modifies socket_putmessage to send any data larger than
> 8K immediately without copying it into the send buffer. Assuming that
> the send buffer would be flushed anyway due to reaching its limit, the
> patch just gets rid of the copy part which seems unnecessary and sends
> data without waiting.

If there's already some data in PqSendBuffer, I wonder if it would be
better to fill it up with data, flush it, and then send the rest of the
data directly. Instead of flushing the partial data first. I'm afraid
that you'll make a tiny call to secure_write(), followed by a large one,
then a tine one again, and so forth. Especially when socket_putmessage
itself writes the msgtype and len, which are tiny, before the payload.

I agree that I could do better there without flushing twice for both PqSendBuffer and input data. PqSendBuffer always has some data, even if it's tiny, since msgtype and len are added.
 
Perhaps we should invent a new pq_putmessage() function that would take
an input buffer with 5 bytes of space reserved before the payload.
pq_putmessage() could then fill in the msgtype and len bytes in the
input buffer and send that directly. (Not wedded to that particular API,
but something that would have the same effect)

I thought about doing this. The reason why I didn't was because I think that such a change would require adjusting all input buffers wherever pq_putmessage is called, and I did not want to touch that many different places. These places where we need pq_putmessage might not be that many though, I'm not sure. 
 

> This change affects places where pq_putmessage is used such as
> pg_basebackup, COPY TO, walsender etc.
>
> I did some experiments to see how the patch performs.
> Firstly, I loaded ~5GB data into a table [1], then ran "COPY test TO
> STDOUT". Here are perf results of both the patch and HEAD > ...
> The patch brings a ~5% gain in socket_putmessage.
>
> [1]
> CREATE TABLE test(id int, name text, time TIMESTAMP);
> INSERT INTO test (id, name, time) SELECT i AS id, repeat('dummy', 100)
> AS name, NOW() AS time FROM generate_series(1, 100000000) AS i;

I'm surprised by these results, because each row in that table is < 600
bytes. PqSendBufferSize is 8kB, so the optimization shouldn't kick in in
that test. Am I missing something?

You're absolutely right. I made a silly mistake there. I also think that the way I did perf analysis does not make much sense, even if one row of the table is greater than 8kB.
Here are some quick timing results after being sure that it triggers this patch's optimization. I need to think more on how to profile this with perf. I hope to share proper results soon.

I just added a bit more zeros [1] and ran [2] (hopefully measured the correct thing)

HEAD:
real    2m48,938s
user    0m9,226s
sys     1m35,342s

Patch:
real    2m40,690s
user    0m8,492s
sys     1m31,001s

[1]
 INSERT INTO test (id, name, time) SELECT i AS id, repeat('dummy', 10000) AS name, NOW() AS time FROM generate_series(1, 1000000) AS i;

[2]
 rm /tmp/dummy && echo 3 | sudo tee /proc/sys/vm/drop_caches && time psql -d postgres -c "COPY test TO STDOUT;" > /tmp/dummy

Thanks,
--
Melih Mutlu
Microsoft

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: Bytea PL/Perl transform
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Possibility to disable `ALTER SYSTEM`