Обсуждение: BufferUsage counters' values have changed

Поиск
Список
Период
Сортировка

BufferUsage counters' values have changed

От
Karina Litskevich
Дата:
Hi hackers,

I noticed that BufferUsage counters' values are strangely different for the
same queries on REL_15_STABLE and REL_16_STABLE. For example, I run

CREATE EXTENSION pg_stat_statements;
CREATE TEMP TABLE test(b int);
INSERT INTO test(b) SELECT generate_series(1,1000);
SELECT query, local_blks_hit, local_blks_read, local_blks_written,
       local_blks_dirtied, temp_blks_written FROM pg_stat_statements;


and output on REL_15_STABLE contains

query              | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit     | 1005
local_blks_read    | 2
local_blks_written | 5
local_blks_dirtied | 5
temp_blks_written  | 0


while output on REL_16_STABLE contains

query              | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit     | 1006
local_blks_read    | 0
local_blks_written | 0
local_blks_dirtied | 5
temp_blks_written  | 8



I found a bug that causes one of the differences. Wrong counter is incremented
in ExtendBufferedRelLocal(). The attached patch fixes it and should be applied
to REL_16_STABLE and master. With the patch applied output contains

query              | INSERT INTO test(b) SELECT generate_series($1,$2)
local_blks_hit     | 1006
local_blks_read    | 0
local_blks_written | 8
local_blks_dirtied | 5
temp_blks_written  | 0



I still wonder why local_blks_written is greater than it was on REL_15_STABLE,
and why local_blks_read became zero. These changes are caused by fcdda1e4b5.
This code is new to me, and I'm still trying to understand whether it's a bug
in computing the counters or just changes in how many blocks are read/written
during the query execution. If anyone can help me, I would be grateful.

Best regards,
Karina Litskevich
Postgres Professional: http://postgrespro.com/

Re: BufferUsage counters' values have changed

От
Karina Litskevich
Дата:

On Mon, Sep 11, 2023 at 9:08 AM Karina Litskevich <litskevichkarina@gmail.com> wrote:
I found a bug that causes one of the differences. Wrong counter is incremented
in ExtendBufferedRelLocal(). The attached patch fixes it and should be applied
to REL_16_STABLE and master.

 I've forgotten to attach the patch. Here it is.
Вложения

Re: BufferUsage counters' values have changed

От
Nazir Bilal Yavuz
Дата:
Hi,

On Mon, 11 Sept 2023 at 14:28, Karina Litskevich
<litskevichkarina@gmail.com> wrote:
>
> Hi hackers,
>
> I noticed that BufferUsage counters' values are strangely different for the
> same queries on REL_15_STABLE and REL_16_STABLE. For example, I run
>
> CREATE EXTENSION pg_stat_statements;
> CREATE TEMP TABLE test(b int);
> INSERT INTO test(b) SELECT generate_series(1,1000);
> SELECT query, local_blks_hit, local_blks_read, local_blks_written,
>        local_blks_dirtied, temp_blks_written FROM pg_stat_statements;
>
> and output on REL_15_STABLE contains
>
> query              | INSERT INTO test(b) SELECT generate_series($1,$2)
> local_blks_hit     | 1005
> local_blks_read    | 2
> local_blks_written | 5
> local_blks_dirtied | 5
> temp_blks_written  | 0
>
> while output on REL_16_STABLE contains
>
> query              | INSERT INTO test(b) SELECT generate_series($1,$2)
> local_blks_hit     | 1006
> local_blks_read    | 0
> local_blks_written | 0
> local_blks_dirtied | 5
> temp_blks_written  | 8
>
>
> I found a bug that causes one of the differences. Wrong counter is incremented
> in ExtendBufferedRelLocal(). The attached patch fixes it and should be applied
> to REL_16_STABLE and master. With the patch applied output contains

Nice finding! I agree, it should be changed.

> query              | INSERT INTO test(b) SELECT generate_series($1,$2)
> local_blks_hit     | 1006
> local_blks_read    | 0
> local_blks_written | 8
> local_blks_dirtied | 5
> temp_blks_written  | 0
>
>
> I still wonder why local_blks_written is greater than it was on REL_15_STABLE,
> and why local_blks_read became zero. These changes are caused by fcdda1e4b5.
> This code is new to me, and I'm still trying to understand whether it's a bug
> in computing the counters or just changes in how many blocks are read/written
> during the query execution. If anyone can help me, I would be grateful.

I spent some time on it:

local_blks_read became zero because:
1_ One more cache hit. It was supposed to be local_blks_read but it is
local_blks_hit now. This is an assumption, I didn't check this deeply.
2_ Before fcdda1e4b5, there was one local_blks_read coming from
buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
RBM_ZERO_ON_ERROR, NULL) in freespace.c -> ReadBuffer_common() ->
pgBufferUsage.local_blks_read++.
But buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
RBM_ZERO_ON_ERROR, NULL) is moved into the else case, so it didn't
called and local_blks_read isn't incremented.

local_blks_written is greater because of the combination of fcdda1e4b5
and 00d1e02be2.
In PG_15:
RelationGetBufferForTuple() -> ReadBufferBI(P_NEW, RBM_ZERO_AND_LOCK)
-> ReadBufferExtended() -> ReadBuffer_common() ->
pgBufferUsage.local_blks_written++; (called 5 times) [0]
In PG_16:
1_ 5 of the local_blks_written is coming from:
RelationGetBufferForTuple() -> RelationAddBlocks() ->
ExtendBufferedRelBy() -> ExtendBufferedRelCommon() ->
ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
extend_by; (extend_by is 1, this is called 5 times) [1]
2_ 3 of the local_blks_written is coming from:
RelationGetBufferForTuple() -> RecordAndGetPageWithFreeSpace() ->
fsm_set_and_search() -> fsm_readbuf() -> fsm_extend() ->
ExtendBufferedRelTo() -> ExtendBufferedRelCommon() ->
ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
extend_by; (extend_by is 3, this is called 1 time) [2]

I think [0] is the same path as [1] but [2] is new. 'fsm extends'
wasn't counted in local_blks_written in PG_15. Calling
ExtendBufferedRelTo() from fsm_extend() caused 'fsm extends' to be
counted in local_blks_written. I am not sure which one is correct.

I hope these help.

Regards,
Nazir Bilal Yavuz
Microsoft



Re: BufferUsage counters' values have changed

От
Andres Freund
Дата:
Hi,

On 2023-09-11 09:23:59 +0300, Karina Litskevich wrote:
> On Mon, Sep 11, 2023 at 9:08 AM Karina Litskevich <
> litskevichkarina@gmail.com> wrote:
> 
> > I found a bug that causes one of the differences. Wrong counter is
> > incremented
> > in ExtendBufferedRelLocal(). The attached patch fixes it and should be
> > applied
> > to REL_16_STABLE and master.
> >
> 
>  I've forgotten to attach the patch. Here it is.

> From 999a3d533a9b74c8568cc8a3d715c287de45dd2c Mon Sep 17 00:00:00 2001
> From: Karina Litskevich <litskevichkarina@gmail.com>
> Date: Thu, 7 Sep 2023 17:44:40 +0300
> Subject: [PATCH v1] Fix local_blks_written counter incrementation
> 
> ---
>  src/backend/storage/buffer/localbuf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
> index 1735ec7141..567b8d15ef 100644
> --- a/src/backend/storage/buffer/localbuf.c
> +++ b/src/backend/storage/buffer/localbuf.c
> @@ -431,7 +431,7 @@ ExtendBufferedRelLocal(BufferManagerRelation bmr,
>  
>      *extended_by = extend_by;
>  
> -    pgBufferUsage.temp_blks_written += extend_by;
> +    pgBufferUsage.local_blks_written += extend_by;
>  
>      return first_block;
>  }
> -- 
> 2.34.1
> 

Ugh, you're right.

The naming of local vs temp here is pretty unfortunate imo. I wonder if we
ought to at least dd a comment to BufferUsage clarifying the situation? Just
reading the comments therein one would be hard pressed to figure out which of
the variables temp table activity should be added to.

I don't think we currently can write a test for this in the core tests, as the
relevant data isn't visible anywhere, iirc. Thus I added a test to
pg_stat_statements. Afaict it should be stable?

Running the attached patch through CI, planning to push after that succeeds,
unless somebody has a comment?

Greetings,

Andres Freund

Вложения

Re: BufferUsage counters' values have changed

От
Andres Freund
Дата:
Hi,

On 2023-09-13 16:04:00 +0300, Nazir Bilal Yavuz wrote:
> On Mon, 11 Sept 2023 at 14:28, Karina Litskevich
> <litskevichkarina@gmail.com> wrote:
> >
> > Hi hackers,
> >
> > I noticed that BufferUsage counters' values are strangely different for the
> > same queries on REL_15_STABLE and REL_16_STABLE. For example, I run
> >
> > CREATE EXTENSION pg_stat_statements;
> > CREATE TEMP TABLE test(b int);
> > INSERT INTO test(b) SELECT generate_series(1,1000);
> > SELECT query, local_blks_hit, local_blks_read, local_blks_written,
> >        local_blks_dirtied, temp_blks_written FROM pg_stat_statements;
> >
> > and output on REL_15_STABLE contains
> >
> > query              | INSERT INTO test(b) SELECT generate_series($1,$2)
> > local_blks_hit     | 1005
> > local_blks_read    | 2
> > local_blks_written | 5
> > local_blks_dirtied | 5
> > temp_blks_written  | 0
> >
> > while output on REL_16_STABLE contains
> >
> > query              | INSERT INTO test(b) SELECT generate_series($1,$2)
> > local_blks_hit     | 1006
> > local_blks_read    | 0
> > local_blks_written | 0
> > local_blks_dirtied | 5
> > temp_blks_written  | 8
> >
> >
> > I found a bug that causes one of the differences. Wrong counter is incremented
> > in ExtendBufferedRelLocal(). The attached patch fixes it and should be applied
> > to REL_16_STABLE and master. With the patch applied output contains
>
> Nice finding! I agree, it should be changed.
>
> > query              | INSERT INTO test(b) SELECT generate_series($1,$2)
> > local_blks_hit     | 1006
> > local_blks_read    | 0
> > local_blks_written | 8
> > local_blks_dirtied | 5
> > temp_blks_written  | 0
> >
> >
> > I still wonder why local_blks_written is greater than it was on REL_15_STABLE,
> > and why local_blks_read became zero. These changes are caused by fcdda1e4b5.
> > This code is new to me, and I'm still trying to understand whether it's a bug
> > in computing the counters or just changes in how many blocks are read/written
> > during the query execution. If anyone can help me, I would be grateful.
>
> I spent some time on it:
>
> local_blks_read became zero because:
> 1_ One more cache hit. It was supposed to be local_blks_read but it is
> local_blks_hit now. This is an assumption, I didn't check this deeply.
> 2_ Before fcdda1e4b5, there was one local_blks_read coming from
> buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
> RBM_ZERO_ON_ERROR, NULL) in freespace.c -> ReadBuffer_common() ->
> pgBufferUsage.local_blks_read++.
> But buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
> RBM_ZERO_ON_ERROR, NULL) is moved into the else case, so it didn't
> called and local_blks_read isn't incremented.

That imo is a legitimate difference / improvement. The read we previously did
here was unnecessary.


> local_blks_written is greater because of the combination of fcdda1e4b5
> and 00d1e02be2.
> In PG_15:
> RelationGetBufferForTuple() -> ReadBufferBI(P_NEW, RBM_ZERO_AND_LOCK)
> -> ReadBufferExtended() -> ReadBuffer_common() ->
> pgBufferUsage.local_blks_written++; (called 5 times) [0]
> In PG_16:
> 1_ 5 of the local_blks_written is coming from:
> RelationGetBufferForTuple() -> RelationAddBlocks() ->
> ExtendBufferedRelBy() -> ExtendBufferedRelCommon() ->
> ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
> extend_by; (extend_by is 1, this is called 5 times) [1]
> 2_ 3 of the local_blks_written is coming from:
> RelationGetBufferForTuple() -> RecordAndGetPageWithFreeSpace() ->
> fsm_set_and_search() -> fsm_readbuf() -> fsm_extend() ->
> ExtendBufferedRelTo() -> ExtendBufferedRelCommon() ->
> ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
> extend_by; (extend_by is 3, this is called 1 time) [2]
>
> I think [0] is the same path as [1] but [2] is new. 'fsm extends'
> wasn't counted in local_blks_written in PG_15. Calling
> ExtendBufferedRelTo() from fsm_extend() caused 'fsm extends' to be
> counted in local_blks_written. I am not sure which one is correct.

I think it's correct to count the fsm writes here. The pg_stat_statement
columns aren't specific to the main relation for or such... If anything it was
a bug to not count them before.

Thanks for looking into this.

Greetings,

Andres Freund



Re: BufferUsage counters' values have changed

От
Andres Freund
Дата:
Hi,

On 2023-09-13 11:59:39 -0700, Andres Freund wrote:
> Running the attached patch through CI, planning to push after that succeeds,
> unless somebody has a comment?

And pushed.

Thanks Karina for the report and fix!

Greetings,

Andres Freund



Re: BufferUsage counters' values have changed

От
Karina Litskevich
Дата:
Nazir, Andres, thank you both for help!

On Wed, Sep 13, 2023 at 10:10 PM Andres Freund <andres@anarazel.de> wrote:
On 2023-09-13 16:04:00 +0300, Nazir Bilal Yavuz wrote:
> local_blks_read became zero because:
> 1_ One more cache hit. It was supposed to be local_blks_read but it is
> local_blks_hit now. This is an assumption, I didn't check this deeply.
> 2_ Before fcdda1e4b5, there was one local_blks_read coming from
> buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
> RBM_ZERO_ON_ERROR, NULL) in freespace.c -> ReadBuffer_common() ->
> pgBufferUsage.local_blks_read++.
> But buf = ReadBufferExtended(rel, VISIBILITYMAP_FORKNUM, blkno,
> RBM_ZERO_ON_ERROR, NULL) is moved into the else case, so it didn't
> called and local_blks_read isn't incremented.

That imo is a legitimate difference / improvement. The read we previously did
here was unnecessary.


> local_blks_written is greater because of the combination of fcdda1e4b5
> and 00d1e02be2.
> In PG_15:
> RelationGetBufferForTuple() -> ReadBufferBI(P_NEW, RBM_ZERO_AND_LOCK)
> -> ReadBufferExtended() -> ReadBuffer_common() ->
> pgBufferUsage.local_blks_written++; (called 5 times) [0]
> In PG_16:
> 1_ 5 of the local_blks_written is coming from:
> RelationGetBufferForTuple() -> RelationAddBlocks() ->
> ExtendBufferedRelBy() -> ExtendBufferedRelCommon() ->
> ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
> extend_by; (extend_by is 1, this is called 5 times) [1]
> 2_ 3 of the local_blks_written is coming from:
> RelationGetBufferForTuple() -> RecordAndGetPageWithFreeSpace() ->
> fsm_set_and_search() -> fsm_readbuf() -> fsm_extend() ->
> ExtendBufferedRelTo() -> ExtendBufferedRelCommon() ->
> ExtendBufferedRelLocal() -> pgBufferUsage.local_blks_written +=
> extend_by; (extend_by is 3, this is called 1 time) [2]
>
> I think [0] is the same path as [1] but [2] is new. 'fsm extends'
> wasn't counted in local_blks_written in PG_15. Calling
> ExtendBufferedRelTo() from fsm_extend() caused 'fsm extends' to be
> counted in local_blks_written. I am not sure which one is correct.

I think it's correct to count the fsm writes here. The pg_stat_statement
columns aren't specific to the main relation for or such... If anything it was
a bug to not count them before.


Best regards,
Karina Litskevich
Postgres Professional: http://postgrespro.com/