Discussion: Idea to enhance pgbench by more modes to generate data (multi-TXNs, UNNEST, COPY BINARY)
Idea to enhance pgbench by more modes to generate data (multi-TXNs, UNNEST, COPY BINARY)
From
Boris Mironov
Date:
Hello hackers,
For some of my specific hardware tests I needed to generate big databases well beyond RAM size. Hence I turned to the pgbench tool and its two default modes for client- and server-side data generation for TPC-B tests. When I use a "scale" factor in the range of a few thousand (e.g., 3000 - 5000), the data generation phase takes quite some time. I saw it as an opportunity to prove/disprove a few hypotheses:
- will INSERT mode work faster if we commit once every "scale" and turn the single INSERT into a "for" loop with commits for the 3 tables at the end of each loop
- will "INSERT .. SELECT FROM unnest" be faster than "INSERT .. SELECT FROM generate_series"
- will BINARY mode work faster than TEXT even though we send much more data
- and so on
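For context, the existing client-side "g" step streams tab-separated COPY TEXT rows for pgbench_accounts. A minimal Python sketch of that row stream follows; the column layout (aid, bid, abalance, filler) and the 100,000-rows-per-scale constant are taken from pgbench's standard schema, while the generator function itself is purely illustrative, not code from the patch:

```python
def accounts_rows(scale, rows_per_scale=100_000):
    """Yield COPY TEXT lines for pgbench_accounts at the given scale.

    Sketch only: pgbench's real generator is C code; this just shows
    the shape of the data the TEXT mode has to push over the wire.
    """
    for aid in range(1, scale * rows_per_scale + 1):
        bid = (aid - 1) // rows_per_scale + 1  # one branch per scale unit
        # abalance starts at 0; the filler column is sent empty in TEXT mode
        yield f"{aid}\t{bid}\t0\t\n"
```

At scale 1 this yields 100,000 lines, starting with `1\t1\t0\t\n`; every such row costs a text-to-binary conversion on the server, which is part of what the BINARY mode avoids.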
As a result of my experiments I produced a significant patch for the pgbench utility and thought that it might be of interest not just to me. Therefore I'm sending a draft version of it in diff format against the current development tree on GitHub. As of November 11, 2025 it merges cleanly with the main branch of the project on GitHub.
Spoiler alert: "COPY FROM BINARY" is significantly faster than the current "COPY FROM TEXT"
Would be happy to polish it if there is interest in such a change.
Cheers
Attachments
Re: Idea to enhance pgbench by more modes to generate data (multi-TXNs, UNNEST, COPY BINARY)
From
Ashutosh Bapat
Date:
Hi Boris,

On Wed, Nov 12, 2025 at 3:25 AM Boris Mironov <boris_mironov@outlook.com> wrote:
>
> Hello hackers,
>
> For some of my specific hardware tests I needed to generate big databases well beyond RAM size. Hence I turned to the pgbench tool and its two default modes for client- and server-side data generation for TPC-B tests. When I use a "scale" factor in the range of a few thousand (e.g., 3000 - 5000), the data generation phase takes quite some time. I saw it as an opportunity to prove/disprove a few hypotheses:
>
> will INSERT mode work faster if we commit once every "scale" and turn the single INSERT into a "for" loop with commits for the 3 tables at the end of each loop
> will "INSERT .. SELECT FROM unnest" be faster than "INSERT .. SELECT FROM generate_series"
> will BINARY mode work faster than TEXT even though we send much more data
> and so on
>
> As a result of my experiments I produced a significant patch for the pgbench utility and thought that it might be of interest not just to me. Therefore I'm sending a draft version of it in diff format against the current development tree on GitHub. As of November 11, 2025 it merges cleanly with the main branch of the project on GitHub.
>
> Spoiler alert: "COPY FROM BINARY" is significantly faster than the current "COPY FROM TEXT"
>
> Would be happy to polish it if there is interest in such a change.

Making pgbench data initialization faster at a higher scale is desirable, and the community might be willing to accept such a change. Running very large benchmarks is becoming common these days.

However, it's not clear what you are proposing and what the performance improvement is. Answering the following questions may help: Does your patch implement all the above methods? Do all of them provide a performance improvement? If each of them performs better under certain conditions, what are those conditions? If there is one method which performs better than all others, which is it, and why not implement just that method? What performance numbers are we looking at?

Can the methods which use batch commits also run those batches in parallel?

--
Best Wishes,
Ashutosh Bapat
Re: Idea to enhance pgbench by more modes to generate data (multi-TXNs, UNNEST, COPY BINARY)
From
Boris Mironov
Date:
Hi Ashutosh,
> If there is one method that is better than all others, community will
> be more willing to accept implementation of that one method than
> multiple implementations so as to reduce maintenance burden.
Ok then. I'll keep only the "COPY FROM STDIN BINARY" implementation out of the three.
Would you prefer to replace the original COPY FROM STDIN TEXT with this
code, or add it as a new "init-step" (e.g., with code "c")?
I have also noted that the current code doesn't prevent a pgbench parameter
like "--init-steps=dtgG". It allows running the data generation step twice.
Each of these "g" and "G" will present its own timing in the status line. Is this
an oversight or intentional?
> The code in the patch does not have enough comments. It's hard to
> understand the methods just from the code. Each of the generateData*
> functions could use a prologue explaining the data generation method
> it uses.
Adding comments is not a problem at all. So far, it was just "code for myself"
and I was checking whether there is any interest in the community to include it.
>> I'm sure that much more testing is required to run this code under different
>> conditions and hardware to get a better picture. So far it looks very promising.
> Sure.
Cheers,
Boris
Re: Idea to enhance pgbench by more modes to generate data (multi-TXNs, UNNEST, COPY BINARY)
From
Ashutosh Bapat
Date:
On Fri, Nov 14, 2025 at 8:51 PM Boris Mironov <boris_mironov@outlook.com> wrote:
>
> Hi Ashutosh,
>
> > If there is one method that is better than all others, community will
> > be more willing to accept implementation of that one method than
> > multiple implementations so as to reduce maintenance burden.
>
> Ok then. I'll keep only the "COPY FROM STDIN BINARY" implementation out of the three.
> Would you prefer to replace the original COPY FROM STDIN TEXT with this
> code, or add it as a new "init-step" (e.g., with code "c")?
>
TEXT copy may be useful for cross-platform client-side data generation. BINARY might be useful for same-platform client-side generation or server-side generation. Just a thought: use TEXT or BINARY automatically based on whether it's a cross-platform or same-platform setup.

> I also have noted that the current code doesn't prevent a pgbench parameter
> like "--init-steps=dtgG". It allows running the data generation step twice.
> Each of these "g" and "G" will present its own timing in the status line. Is this
> an oversight or intentional?
>
I would review the commit a386942bd29b0ef0c9df061392659880d22cdf43 and the discussion thread https://postgr.es/m/alpine.DEB.2.21.1904061826420.3678@lancre mentioned in the commit message to find that out. At first glance it looks like an oversight, but I haven't reviewed the commit and thread myself. That thread might reveal why generate_series() was used instead of BINARY COPY for server-side data generation. If it needs to change, it's better to start a separate thread and a separate patch for that discussion.

--
Best Wishes,
Ashutosh Bapat
Re: Idea to enhance pgbench by more modes to generate data (multi-TXNs, UNNEST, COPY BINARY)
From
Boris Mironov
Date:
Hi Ashutosh,
> TEXT copy may be useful for cross platform client side data
> generation. BINARY might be useful for same platform client side
> generation or server side generation. Just a thought, use TEXT or
> BINARY automatically based on where it's cross-platform or same
> platform setup.
It is true that the BINARY format is not as flexible as TEXT. Postgres expects
data on the wire to arrive in "network byte order". AFAIK only Solaris can
send its data without byte reordering. I support such an exception via
#ifdef __sparc__
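To make the byte-order point concrete, here is a minimal Python sketch of the COPY BINARY framing a client has to produce, assuming the standard PGCOPY header and plain int4 columns. struct's '!' modifier forces network (big-endian) byte order, which is exactly the reordering step a big-endian host such as SPARC can skip. The function name and the reduced three-column row layout (filler omitted) are illustrative, not code from the patch:

```python
import struct

# Fixed 11-byte COPY BINARY signature, followed by int32 flags and
# int32 header-extension length (both zero in the common case).
PGCOPY_HEADER = b"PGCOPY\n\xff\r\n\x00"

def copy_binary_stream(rows):
    """Frame (aid, bid, abalance) int4 rows as a COPY BINARY byte stream."""
    out = [PGCOPY_HEADER,
           struct.pack("!ii", 0, 0)]             # flags, extension length
    for aid, bid, abalance in rows:
        out.append(struct.pack("!h", 3))         # per-tuple field count
        for v in (aid, bid, abalance):
            out.append(struct.pack("!ii", 4, v)) # field length, int4 value
    out.append(struct.pack("!h", -1))            # end-of-data marker
    return b"".join(out)
```

Each int4 costs 8 bytes on the wire (length word plus value) instead of a variable-length decimal string, but the server can copy the value straight into the tuple with no parsing.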
I don't see an easy way for pgbench to decide which COPY mode to use,
TEXT or BINARY, except by specifically asking for one via a command line
parameter. This is why I left flag "g" for TEXT mode and added BINARY as "C"
(for "Copy", upper case meaning faster). I guess we can add alias "c" for the
old client-side generation as "copy", but a slower version of it.
While we are on the topic of client- vs server-side generation: IMHO these are
quite misleading terms. Both of them are executed by the RDBMS on the server side,
but the "server" one gets quite a short query (and is quite slow in execution), while
the "client" one involves quite a big network transfer (and is quite fast in execution). The
reason is the difference in data path, which needs to be reflected in the documentation.
On top of that, server-side generation thrashes the DB cache, while client-side works via a ring buffer
that doesn't allocate more than 1/8 of shared_buffers.
> I would review the commit a386942bd29b0ef0c9df061392659880d22cdf43 and
> the discussion thread
> https://postgr.es/m/alpine.DEB.2.21.1904061826420.3678@lancre
> mentioned in the commit message to find that out. At first glance it
> looks like an oversight, but I haven't reviewed the commit and thread
> myself. That thread might reveal why generate_series() was used
> instead of BINARY COPY for server side data generation. If it needs to
> change it's better to start a separate thread and separate patch for
> that discussion.
Thank you for this hint. I went through the whole thread; there they discuss
how to reflect certain behavior of init-steps, and nothing about COPY BINARY.
The major point of the generate_series() introduction is to send a short query to the DB
and not to worry about network performance. It is quite true that COPY
sends tons of data over the network, and that might be an issue for a slow network.
They also touched on the topic of "one huge transaction" for the whole generated
dataset versus a few smaller transactions.
Allow me to repost my benchmarks here (they were lost for pgsql-hackers
because I used Reply instead of Reply-To-All):
Tests:
Test | Binary | Init mode | Query and details
-----|----------|-----------|-------------------------------------------------------
1 | original | G | INSERT FROM generate_series in single huge transaction
2 | enhanced | G | INSERT FROM generate_series in single huge transaction
3 | enhanced | i | INSERT FROM generate_series in one transaction per scale
4 | enhanced | I | INSERT FROM unnest in one transaction per scale
5 | original | g | COPY FROM STDIN TEXT in single transaction
6 | enhanced | g | COPY FROM STDIN TEXT in single transaction
7 | enhanced | C | COPY FROM STDIN BINARY in single transaction
Test | Scale and seconds to complete data generation
| 1 | 2 | 10 | 20 | 100 | 200 | 1000 | 2000
-----|------|------|------|------|-------|-------|--------|--------
1 | 0.19 | 0.37 | 2.01 | 4.34 | 22.58 | 46.64 | 245.98 | 525.99
2 | 0.30 | 0.47 | 2.18 | 4.37 | 25.38 | 56.66 | 240.89 | 482.63
3 | 0.18 | 0.39 | 2.14 | 4.19 | 23.78 | 47.63 | 240.91 | 483.19
4 | 0.18 | 0.38 | 2.17 | 4.39 | 23.68 | 47.93 | 242.63 | 487.33
5 | 0.11 | 0.22 | 1.46 | 2.95 | 15.69 | 32.86 | 154.16 | 311.00
6 | 0.11 | 0.22 | 1.43 | 2.89 | 16.01 | 29.41 | 158.10 | 307.54
7 | 0.14 | 0.12 | 0.56 | 1.16 | 6.22 | 12.70 | 64.70 | 135.58
"Original" binary is pgbench v17.6.
"Enhanced" binary is pgbench 19-devel with proposed patch.
Another point of discussion in the earlier pgsql-hackers thread mentioned
above was that multiple transactions for the init-step do NOT bring any
benefit. My numbers show some increase in performance from simply
INSERT-ing data in a loop with one COMMIT per "scale" at lower scales.
At higher scales the benefit disappears. My guess here is the quite active
WAL archiver process.
COPY TEXT is 36% faster than INSERT with multiple transactions.
COPY BINARY is ~72% faster than INSERT with multiple transactions.
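As a sanity check on those percentages, the scale-2000 column of the table above reproduces them, taking test 3 (INSERT with one transaction per scale) as the baseline; this small Python snippet is just arithmetic on the published numbers:

```python
# Timings from the scale-2000 column of the benchmark table.
multi_txn_insert = 483.19  # test 3: INSERT, one transaction per scale
copy_text = 307.54         # test 6: COPY FROM STDIN TEXT
copy_binary = 135.58       # test 7: COPY FROM STDIN BINARY

def speedup_pct(baseline, candidate):
    """Percentage reduction in elapsed time relative to the baseline."""
    return round(100 * (baseline - candidate) / baseline)

print(speedup_pct(multi_txn_insert, copy_text))    # -> 36
print(speedup_pct(multi_txn_insert, copy_binary))  # -> 72
```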
At this point I'm torn between keeping the old modes and logic for
backward compatibility and introducing new modes for INSERT & COPY,
versus simply replacing the old, less efficient logic with the new one.
Sorry for the quite long response.
Best regards,
Boris
From: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Sent: November 16, 2025 11:58 PM
To: Boris Mironov <boris_mironov@outlook.com>
Cc: pgsql-hackers@postgresql.org <pgsql-hackers@postgresql.org>
Subject: Re: Idea to enhance pgbench by more modes to generate data (multi-TXNs, UNNEST, COPY BINARY)
On Fri, Nov 14, 2025 at 8:51 PM Boris Mironov <boris_mironov@outlook.com> wrote:
>
> Hi Ashutosh,
>
> > If there is one method that is better than all others, community will
> > be more willing to accept implementation of that one method than
> > multiple implementations so as to reduce maintenance burden.
>
> Ok then. I'll leave "COPY FROM STDIN BINARY" implementation out of 3 only.
> Would you prefer to replace original COPY FROM STDIN TEXT by this
> code or add it as new "init-step" (e.g., with code "c")?
>
TEXT copy may be useful for cross platform client side data
generation. BINARY might be useful for same platform client side
generation or server side generation. Just a thought, use TEXT or
BINARY automatically based on where it's cross-platform or same
platform setup.
> I also have noted that current code doesn't prevent pgbench parameter
> like "--init-steps=dtgG". It allows to run data generation step twice.
> Each of these "g" and "G" will present own timing in status line. Is this
> an oversight or intentional?
>
I would review the commit a386942bd29b0ef0c9df061392659880d22cdf43 and
the discussion thread
https://postgr.es/m/alpine.DEB.2.21.1904061826420.3678@lancre
mentioned in the commit message to find that out. At first glance it
looks like an oversight, but I haven't reviewed the commit and thread
myself. That thread might reveal why generate_series() was used
instead of BINARY COPY for server side data generation. If it needs to
change it's better to start a separate thread and separate patch for
that discussion.
--
Best Wishes,
Ashutosh Bapat