Michael, Jeff, thanks for reviewing and testing.
> On 10 Dec 2015, at 02:16, Michael Paquier <michael.paquier@gmail.com> wrote:
>
> This has better be InvalidXLogRecPtr if unused.
Yes, that’s better. Changed.
> On 10 Dec 2015, at 02:16, Michael Paquier <michael.paquier@gmail.com> wrote:
> + if (gxact->prepare_lsn)
> + {
> + XlogReadTwoPhaseData(gxact->prepare_xlogptr, &buf, NULL);
> + }
> Perhaps you mean prepare_xlogptr here?
Yes, my bad. But funnily enough I made this mistake an even number of times: the code in CheckPointTwoPhase also uses prepare_lsn
instead of prepare_xlogptr, so overall it still worked correctly, which is why it survived my own tests and probably Jeff's tests too.
I think the variable naming was bad to begin with: lsn in pg_xlogdump points to the start of a record, but here the
start was called xlogptr and the end was called lsn.
So I changed both variables to prepare_start_lsn and prepare_end_lsn.
> On 10 Dec 2015, at 09:48, Jeff Janes <jeff.janes@gmail.com> wrote:
> I've tested this through my testing harness which forces the database
> to go through endless runs of crash recovery and checks for
> consistency, and so far it has survived perfectly.
Cool! I think the patch is most vulnerable to the following type of workload: prepare a transaction, do a lot of work in the
database to force checkpoints (or even recovery cycles), and then commit it.
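Roughly, in psql terms, the pattern is something like this (just an illustration against a pgbench-initialised database, with a made-up gid; max_prepared_transactions has to be > 0):

    BEGIN;
    UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = 1;
    PREPARE TRANSACTION 'demo_gid_1';
    -- generate enough WAL here to trigger one or more checkpoints,
    -- or restart the server so recovery replays the prepared transaction
    CHECKPOINT;
    COMMIT PREPARED 'demo_gid_1';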
> On 10 Dec 2015, at 09:48, Jeff Janes <jeff.janes@gmail.com> wrote:
> Can you give the full command line? -j, -c, etc.
pgbench -h testhost -i && pgbench -h testhost -f 2pc.pgb -T 300 -P 1 -c 64 -j 16 -r
where 2pc.pgb is the same script as in the previous message.
Also, all of this applies to hosts with uniform memory. I tried to run patched postgres on a NUMA machine with 60 physical cores and
the patch didn't change anything there. Perf top shows that the main bottleneck is access to gxact, but on an ordinary host with one or two
CPUs that access is not even among the top ten heaviest routines.
> On 10 Dec 2015, at 09:48, Jeff Janes <jeff.janes@gmail.com> wrote:
> Why are you incrementing :scale ?
That’s the funny part: overall 2pc speed depends on how you name your prepared transactions. Concretely, I tried to
use random numbers for the gids and it was slower than a constantly incrementing gid. That probably happens because of the
linear search by gid through the gxact array on commit. So I used :scale just as a counter, because it is initialised at pgbench
start and a line like “\set scale :scale+1” works well (maybe there is a different way to do it in pgbench); see the sketch below.
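For illustration, the counter trick looks roughly like this. This is not the exact 2pc.pgb from the previous message, just a minimal single-client sketch; with several clients the gid would also need a per-client component to stay unique:

    \set scale :scale+1
    BEGIN;
    UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = 1;
    PREPARE TRANSACTION 'tx_:scale';
    COMMIT PREPARED 'tx_:scale';

pgbench substitutes variables textually in the command, so the gid inside the quotes becomes tx_101, tx_102, and so on as :scale is incremented.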
> I very rapidly reach a point where most of the updates are against
> tuples that don't exist, and then get integer overflow problems.
Hmm, that’s strange. Probably you set the scale to a value big enough that 100000*:scale no longer fits into int4? But I thought that
pgbench changes the aid column to bigint when the scale is more than 20000.
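Just to put a number on “bigger than int4” (plain arithmetic, not something from the thread): int4 tops out at 2147483647, so 100000*:scale only overflows once :scale goes above 21474, e.g.

    SELECT 100000 * 21474;   -- 2147400000, still fits into int4
    SELECT 100000 * 21475;   -- ERROR:  integer out of range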
---
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company