Обсуждение: Some shared memory chunks are allocated even if related processes won't start

Поиск
Список
Период
Сортировка

Some shared memory chunks are allocated even if related processes won't start

От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear hackers,

While reading codes, I found that ApplyLauncherShmemInit() and AutoVacuumShmemInit()
are always called even if they would not be launched.
It may be able to reduce the start time to avoid the unnecessary allocation.
However, I know this improvement would be quite small because the allocated chunks are
quite small.

Anyway, there are several ways to fix:

1)
Skip calling ShmemInitStruct() if the related process would not be launched.
I think this approach is the easiest way. E.g.,

```
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -962,6 +962,9 @@ ApplyLauncherShmemInit(void)
 {
     bool        found;

+    if (max_logical_replication_workers == 0 || IsBinaryUpgrade)
+        return;
+
```

2)
Dynamically allocate the shared memory. This was allowed by recent commit [1].
I made a small PoC only for logical launcher to show what I meant. PSA diff file.
Since some processes (backend, apply worker, parallel apply worker, and tablesync worker)
refers the chunk, codes for attachment must be added on the several places.

If you agree it should be fixed, I will create a patch. Thought?

[1]: https://github.com/postgres/postgres/commit/8b2bcf3f287c79eaebf724cba57e5ff664b01e06

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/


Вложения
"Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> writes:
> While reading codes, I found that ApplyLauncherShmemInit() and AutoVacuumShmemInit()
> are always called even if they would not be launched.
> It may be able to reduce the start time to avoid the unnecessary allocation.

Why would this be a good idea?  It would require preventing the
decision not to launch them from being changed later, except via
postmaster restart.  We've generally been trying to move away
from unchangeable-without-restart decisions.  In your example,

> +    if (max_logical_replication_workers == 0 || IsBinaryUpgrade)
> +        return;

max_logical_replication_workers is already PGC_POSTMASTER so there's
not any immediate loss of flexibility, but I don't think it's a great
idea to introduce another reason why it has to be PGC_POSTMASTER.

            regards, tom lane



RE: Some shared memory chunks are allocated even if related processes won't start

От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Tom,

Thanks for replying!

> "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> writes:
> > While reading codes, I found that ApplyLauncherShmemInit() and
> AutoVacuumShmemInit()
> > are always called even if they would not be launched.
> > It may be able to reduce the start time to avoid the unnecessary allocation.
>
> Why would this be a good idea?  It would require preventing the
> decision not to launch them from being changed later, except via
> postmaster restart.

Right. It is important to relax their GucContext.

> We've generally been trying to move away
> from unchangeable-without-restart decisions.  In your example,
>
> > +    if (max_logical_replication_workers == 0 || IsBinaryUpgrade)
> > +        return;
>
> max_logical_replication_workers is already PGC_POSTMASTER so there's
> not any immediate loss of flexibility, but I don't think it's a great
> idea to introduce another reason why it has to be PGC_POSTMASTER.

You are right. The first example implied the max_logical_replication_workers
won't be changed. So it is not appropriate.
So ... what about second one? The approach allows to allocate a memory after
startup, which means that users may able to change the parameter from 0 to
natural number in future. (Of course, such an operation is prohibit for now).
Can it be an initial step to ease the condition?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/




Re: Some shared memory chunks are allocated even if related processes won't start

От
Alvaro Herrera
Дата:
On 2024-Mar-04, Hayato Kuroda (Fujitsu) wrote:

> Dear hackers,
> 
> While reading codes, I found that ApplyLauncherShmemInit() and
> AutoVacuumShmemInit() are always called even if they would not be
> launched.

Note that there are situations where the autovacuum launcher is started
even though autovacuum is nominally turned off, and I suspect your
proposal would break that.  IIRC this occurs when the Xid or multixact
counters cross the max_freeze_age threshold.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Porque Kim no hacía nada, pero, eso sí,
con extraordinario éxito" ("Kim", Kipling)



RE: Some shared memory chunks are allocated even if related processes won't start

От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Alvaro,

Thanks for giving comments!

> > While reading codes, I found that ApplyLauncherShmemInit() and
> > AutoVacuumShmemInit() are always called even if they would not be
> > launched.
> 
> Note that there are situations where the autovacuum launcher is started
> even though autovacuum is nominally turned off, and I suspect your
> proposal would break that.  IIRC this occurs when the Xid or multixact
> counters cross the max_freeze_age threshold.

Right. In GetNewTransactionId(), SetTransactionIdLimit() and some other places,
PMSIGNAL_START_AUTOVAC_LAUNCHER is sent to postmaster when the xid exceeds
autovacuum_freeze_max_age. This work has already been written in the doc [1]:

```
To ensure that this does not happen, autovacuum is invoked on any table that
might contain unfrozen rows with XIDs older than the age specified by the
configuration parameter autovacuum_freeze_max_age. (This will happen even
if autovacuum is disabled.)
```

This means that my first idea won't work well. Even if the postmaster does not
initially allocate shared memory, backends may request to start auto vacuum and
use the region. However, the second idea is still valid, which allows the allocation
of shared memory dynamically. This is a bit efficient for the system which tuples
won't be frozen. Thought?

[1]: https://www.postgresql.org/docs/devel/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Re: Some shared memory chunks are allocated even if related processes won't start

От
'Alvaro Herrera'
Дата:
On 2024-Mar-04, Hayato Kuroda (Fujitsu) wrote:

> However, the second idea is still valid, which allows the allocation
> of shared memory dynamically. This is a bit efficient for the system
> which tuples won't be frozen. Thought?

I think it would be worth allocating AutoVacuumShmem->av_workItems using
dynamic shmem allocation, particularly to prevent workitems from being
discarded just because the array is full¹; but other than that, the
struct is just 64 bytes long so I doubt it's useful to allocate it
dynamically.

¹ I mean, if the array is full, just allocate another array, point to it
from the original one, and keep going.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"The problem with the facetime model is not just that it's demoralizing, but
that the people pretending to work interrupt the ones actually working."
                  -- Paul Graham, http://www.paulgraham.com/opensource.html



RE: Some shared memory chunks are allocated even if related processes won't start

От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Alvaro,

Thanks for discussing!

> 
> I think it would be worth allocating AutoVacuumShmem->av_workItems using
> dynamic shmem allocation, particularly to prevent workitems from being
> discarded just because the array is full¹; but other than that, the
> struct is just 64 bytes long so I doubt it's useful to allocate it
> dynamically.
> 
> ¹ I mean, if the array is full, just allocate another array, point to it
> from the original one, and keep going.

OK, I understood that my initial proposal is not so valuable, so I can withdraw it.

About the suggetion, you imagined AutoVacuumRequestWork() and brininsert(),
right? I agreed it sounds good, but I don't think it can be implemented by current
interface. An interface for dynamically allocating memory is GetNamedDSMSegment(),
and it returns the same shared memory region if input names are the same.
Therefore, there is no way to re-alloc the shared memory.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Re: Some shared memory chunks are allocated even if related processes won't start

От
'Alvaro Herrera'
Дата:
Hello Hayato,

On 2024-Mar-04, Hayato Kuroda (Fujitsu) wrote:

> OK, I understood that my initial proposal is not so valuable, so I can
> withdraw it.

Yeah, that's what it seems to me.

> About the suggetion, you imagined AutoVacuumRequestWork() and
> brininsert(), right?

Correct.

> I agreed it sounds good, but I don't think it can be implemented by
> current interface. An interface for dynamically allocating memory is
> GetNamedDSMSegment(), and it returns the same shared memory region if
> input names are the same.  Therefore, there is no way to re-alloc the
> shared memory.

Yeah, I was imagining something like this: the workitem-array becomes a
struct, which has a name and a "next" pointer and a variable number of
workitem slots; the AutoVacuumShmem struct has a pointer to the first
workitem-struct and the last one; when a workitem is requested by
brininsert, we initially allocate via GetNamedDSMSegment("workitem-0") a
workitem-struct with a smallish number of elements; if we request
another workitem and the array is full, we allocate another array via
GetNamedDSMSegment("workitem-1") and store a pointer to it in workitem-0
(so that the list can be followed by an autovacuum worker that's
processing the database), and it's also set as the tail of the list in
AutoVacuumShmem (so that we know where to store further work items).
When all items in a workitem-struct are processed, we can free it
(I guess via dsm_unpin_segment), and make AutoVacuumShmem->av_workitems
point to the next one in the list.

This way, the "array" can grow arbitrarily.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



RE: Some shared memory chunks are allocated even if related processes won't start

От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Alvaro,

Thanks for giving comments!

> > I agreed it sounds good, but I don't think it can be implemented by
> > current interface. An interface for dynamically allocating memory is
> > GetNamedDSMSegment(), and it returns the same shared memory region if
> > input names are the same.  Therefore, there is no way to re-alloc the
> > shared memory.
> 
> Yeah, I was imagining something like this: the workitem-array becomes a
> struct, which has a name and a "next" pointer and a variable number of
> workitem slots; the AutoVacuumShmem struct has a pointer to the first
> workitem-struct and the last one; when a workitem is requested by
> brininsert, we initially allocate via GetNamedDSMSegment("workitem-0") a
> workitem-struct with a smallish number of elements; if we request
> another workitem and the array is full, we allocate another array via
> GetNamedDSMSegment("workitem-1") and store a pointer to it in workitem-0
> (so that the list can be followed by an autovacuum worker that's
> processing the database), and it's also set as the tail of the list in
> AutoVacuumShmem (so that we know where to store further work items).
> When all items in a workitem-struct are processed, we can free it
> (I guess via dsm_unpin_segment), and make AutoVacuumShmem->av_workitems
> point to the next one in the list.
> 
> This way, the "array" can grow arbitrarily.
>

Basically sounds good. My concerns are:

* GetNamedDSMSegment() does not returns a raw pointer to dsm_segment. This means
  that it may be difficult to do dsm_unpin_segment on the caller side.
* dynamic shared memory is recorded in dhash (dsm_registry_table) and the entry
  won't be deleted. The reference for the chunk might be remained.

Overall, it may be needed that dsm_registry may be also extended. I do not start
working yet, so will share results after trying them.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Re: Some shared memory chunks are allocated even if related processes won't start

От
'Alvaro Herrera'
Дата:
On 2024-Mar-05, Hayato Kuroda (Fujitsu) wrote:

> Basically sounds good. My concerns are:
> 
> * GetNamedDSMSegment() does not returns a raw pointer to dsm_segment. This means
>   that it may be difficult to do dsm_unpin_segment on the caller side.

Maybe we don't need a "named" DSM segment at all, and instead just use
bare dsm segments (dsm_create and friends) or a DSA -- not sure.  But
see commit 31ae1638ce35, which removed use of a DSA in autovacuum/BRIN.
Maybe fixing this is just a matter of reverting that commit.  At the
time, there was a belief that DSA wasn't supported everywhere so we
couldn't use it for autovacuum workitem stuff, but I think our reliance
on DSA is now past the critical point.

BTW, we should turn BRIN autosummarization to be on by default.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Las mujeres son como hondas:  mientras más resistencia tienen,
 más lejos puedes llegar con ellas"  (Jonas Nightingale, Leap of Faith)