Обсуждение: Re: Increase of maintenance_work_mem limit in 64-bit Windows

Поиск
Список
Период
Сортировка

Re: Increase of maintenance_work_mem limit in 64-bit Windows

От
David Rowley
Дата:
On Fri, 20 Sept 2024 at 01:55, Пополитов Владлен
<v.popolitov@postgrespro.ru> wrote:
> Currently PostgreSQL built on 64-bit Windows has 2Gb limit for
> GUC variables due to sizeof(long)==4 used by Windows compilers.
> Technically 64-bit addressing for maintenance_work_mem is possible,
> but code base historically uses variables and constants of type "long",
> when process maintenance_work_mem value.

I agree. Ideally, we shouldn't use longs for anything ever. We should
likely adopt trying to remove the usages of them when possible.

I'd like to suggest you go about this patch slightly differently with
the end goal of removing the limitation from maintenance_work_mem,
work_mem, autovacuum_work_mem and logical_decoding_work_mem.

Patch 0001: Add a macro named something like WORK_MEM_KB_TO_BYTES()
and adjust all places where we do <work_mem_var> * 1024L to use this
new macro. Make the macro do the * 1024L as is done today so that this
patch is a simple refactor.
Patch 0002: Convert all places that use long and use Size instead.
Adjust WORK_MEM_KB_TO_BYTES to use a Size type rather than 1024L.

It might be wise to break 0002 down into individual GUCs as the patch
might become large.

I suspect we might have quite a large number of subtle bugs in our
code today due to using longs. 7340d9362 is an example of one that was
fixed recently.

David



Re: Increase of maintenance_work_mem limit in 64-bit Windows

От
Vladlen Popolitov
Дата:
David Rowley писал(а) 2024-09-23 04:28:
> On Fri, 20 Sept 2024 at 01:55, Пополитов Владлен
> <v.popolitov@postgrespro.ru> wrote:
>> Currently PostgreSQL built on 64-bit Windows has 2Gb limit for
>> GUC variables due to sizeof(long)==4 used by Windows compilers.
>> Technically 64-bit addressing for maintenance_work_mem is possible,
>> but code base historically uses variables and constants of type 
>> "long",
>> when process maintenance_work_mem value.
> 
> I agree. Ideally, we shouldn't use longs for anything ever. We should
> likely adopt trying to remove the usages of them when possible.
> 
> I'd like to suggest you go about this patch slightly differently with
> the end goal of removing the limitation from maintenance_work_mem,
> work_mem, autovacuum_work_mem and logical_decoding_work_mem.
> 
> Patch 0001: Add a macro named something like WORK_MEM_KB_TO_BYTES()
> and adjust all places where we do <work_mem_var> * 1024L to use this
> new macro. Make the macro do the * 1024L as is done today so that this
> patch is a simple refactor.
> Patch 0002: Convert all places that use long and use Size instead.
> Adjust WORK_MEM_KB_TO_BYTES to use a Size type rather than 1024L.
> 
> It might be wise to break 0002 down into individual GUCs as the patch
> might become large.
> 
> I suspect we might have quite a large number of subtle bugs in our
> code today due to using longs. 7340d9362 is an example of one that was
> fixed recently.
> 
> David

Hi David,
Thank you for proposal, I looked at the patch and source code from this
point of view. In this approach we need to change all <work_mem_var>.
I counted the appearences of these vars in the code:
maintenance_work_mem appears 63 times in 20 files
work_mem appears 113 times in 48 files
logical_decoding_work_mem appears 10 times in 2 files
max_stack_depth appears 11 times in 3 files
wal_keep_size_mb appears 5 times in 3 files
min_wal_size_mb appears 5 times in 2 files
max_wal_size_mb appears 10 times in 2 files
wal_skip_threshold appears 5 times in 2 files
max_slot_wal_keep_size_mb appears 6 times in 3 files
wal_sender_timeout appears 23 times in 3 files
autovacuum_work_mem appears 11 times in 4 files
gin_pending_list_limit appears 8 times in 5 files
pendingListCleanupSize appears 2 times in 2 files
GinGetPendingListCleanupSize appears 2 times in 2 files

maintenance_work_mem appears 63 times and had only 4 cases, where "long"
is used (I fix it in patch). I also found, that this patch also fixed
autovacuum_work_mem , that has only 1 case - the same place in code as
maintenance_work_mem.

Now <work_mem_vars> in the code are processed based on the context: they 
are
assigned to Size, uint64, int64, double, long, int variables (last 2 
cases
need to fix) or multiplied by (uint64)1024, (Size)1024, 1024L (last case
needs to fix). Also signed value is used for max_stack_depth (-1 used as
error value). I am not sure, that we can solve all this cases by one
macro WORK_MEM_KB_TO_BYTES(). The code needs case by case check.

If I check the rest of the variables, the patch does not need
MAX_SIZE_T_KILOBYTES constant (I introduced it for variables, that are
already checked and fixed), it will contain only fixes in the types of
the variables and the constants.
It requires a lot of time to check all appearances and neighbour
code, but final patch will not be large, I do not expect a lot of
"long" in the rest of the code (only 4 case out of 63 needed to fix
for maintenance_work_mem).
What do you think about this approach?

-- 
Best regards,

Vladlen Popolitov.



Re: Increase of maintenance_work_mem limit in 64-bit Windows

От
David Rowley
Дата:
On Mon, 23 Sept 2024 at 21:01, Vladlen Popolitov
<v.popolitov@postgrespro.ru> wrote:
> Thank you for proposal, I looked at the patch and source code from this
> point of view. In this approach we need to change all <work_mem_var>.
> I counted the appearences of these vars in the code:
> maintenance_work_mem appears 63 times in 20 files
> work_mem appears 113 times in 48 files
> logical_decoding_work_mem appears 10 times in 2 files
> max_stack_depth appears 11 times in 3 files
> wal_keep_size_mb appears 5 times in 3 files
> min_wal_size_mb appears 5 times in 2 files
> max_wal_size_mb appears 10 times in 2 files
> wal_skip_threshold appears 5 times in 2 files
> max_slot_wal_keep_size_mb appears 6 times in 3 files
> wal_sender_timeout appears 23 times in 3 files
> autovacuum_work_mem appears 11 times in 4 files
> gin_pending_list_limit appears 8 times in 5 files
> pendingListCleanupSize appears 2 times in 2 files
> GinGetPendingListCleanupSize appears 2 times in 2 files

Why do you think all of these appearances matter? I imagined all you
care about are when the values are multiplied by 1024.

> If I check the rest of the variables, the patch does not need
> MAX_SIZE_T_KILOBYTES constant (I introduced it for variables, that are
> already checked and fixed), it will contain only fixes in the types of
> the variables and the constants.
> It requires a lot of time to check all appearances and neighbour
> code, but final patch will not be large, I do not expect a lot of
> "long" in the rest of the code (only 4 case out of 63 needed to fix
> for maintenance_work_mem).
> What do you think about this approach?

I don't think you can do maintenance_work_mem without fixing work_mem
too. I don't think the hacks you've put into RI_Initial_Check() to
ensure you don't try to set work_mem beyond its allowed range are very
good. It effectively means that maintenance_work_mem does not do what
it's meant to for the initial validation of referential integrity
checks. If you're not planning on fixing work_mem too, would you just
propose to leave those hacks in there forever?

David



Re: Increase of maintenance_work_mem limit in 64-bit Windows

От
Vladlen Popolitov
Дата:
David Rowley писал(а) 2024-09-23 15:35:
> On Mon, 23 Sept 2024 at 21:01, Vladlen Popolitov
> <v.popolitov@postgrespro.ru> wrote:
>> Thank you for proposal, I looked at the patch and source code from 
>> this
>> point of view. In this approach we need to change all <work_mem_var>.
>> I counted the appearences of these vars in the code:
>> maintenance_work_mem appears 63 times in 20 files
>> work_mem appears 113 times in 48 files
>> logical_decoding_work_mem appears 10 times in 2 files
>> max_stack_depth appears 11 times in 3 files
>> wal_keep_size_mb appears 5 times in 3 files
>> min_wal_size_mb appears 5 times in 2 files
>> max_wal_size_mb appears 10 times in 2 files
>> wal_skip_threshold appears 5 times in 2 files
>> max_slot_wal_keep_size_mb appears 6 times in 3 files
>> wal_sender_timeout appears 23 times in 3 files
>> autovacuum_work_mem appears 11 times in 4 files
>> gin_pending_list_limit appears 8 times in 5 files
>> pendingListCleanupSize appears 2 times in 2 files
>> GinGetPendingListCleanupSize appears 2 times in 2 files
> 
> Why do you think all of these appearances matter? I imagined all you
> care about are when the values are multiplied by 1024.
Common pattern in code - assign <work_mem_var> to local variable and 
send
local variable as parameter to function, then to nested function, and
somewhere deep multiply function parameter by 1024. It is why I needed 
to
check all appearances, most of them are correct.
>> If I check the rest of the variables, the patch does not need
>> MAX_SIZE_T_KILOBYTES constant (I introduced it for variables, that are
>> already checked and fixed), it will contain only fixes in the types of
>> the variables and the constants.
>> It requires a lot of time to check all appearances and neighbour
>> code, but final patch will not be large, I do not expect a lot of
>> "long" in the rest of the code (only 4 case out of 63 needed to fix
>> for maintenance_work_mem).
>> What do you think about this approach?
> 
> I don't think you can do maintenance_work_mem without fixing work_mem
> too. I don't think the hacks you've put into RI_Initial_Check() to
> ensure you don't try to set work_mem beyond its allowed range are very
> good. It effectively means that maintenance_work_mem does not do what
> it's meant to for the initial validation of referential integrity
> checks. If you're not planning on fixing work_mem too, would you just
> propose to leave those hacks in there forever?
I agree, it is better to fix all them together. I also do not like this
hack, it will be removed from the patch, if I check and change
all <work_mem_vars> at once.
I think, it will take about 1 week to fix and test all changes. I will
estimate the total volume of the changes and think, how to group them
in the patch ( I hope, it will be only one patch)

-- 
Best regards,

Vladlen Popolitov.



Re: Increase of maintenance_work_mem limit in 64-bit Windows

От
David Rowley
Дата:
On Tue, 24 Sept 2024 at 02:47, Vladlen Popolitov
<v.popolitov@postgrespro.ru> wrote:
> I agree, it is better to fix all them together. I also do not like this
> hack, it will be removed from the patch, if I check and change
> all <work_mem_vars> at once.
> I think, it will take about 1 week to fix and test all changes. I will
> estimate the total volume of the changes and think, how to group them
> in the patch ( I hope, it will be only one patch)

There's a few places that do this:

Size maxBlockSize = ALLOCSET_DEFAULT_MAXSIZE;

/* choose the maxBlockSize to be no larger than 1/16 of work_mem */
while (16 * maxBlockSize > work_mem * 1024L)

I think since maxBlockSize is a Size variable, that the above should
probably be:

while (16 * maxBlockSize > (Size) work_mem * 1024)

Maybe there can be a precursor patch to fix all those to get rid of
the 'L' and cast to the type we're comparing to or assigning to rather
than trying to keep the result of the multiplication as a long.

David



Re: Increase of maintenance_work_mem limit in 64-bit Windows

От
Vladlen Popolitov
Дата:
David Rowley писал(а) 2024-09-24 01:07:
> On Tue, 24 Sept 2024 at 02:47, Vladlen Popolitov
> <v.popolitov@postgrespro.ru> wrote:
>> I agree, it is better to fix all them together. I also do not like 
>> this
>> hack, it will be removed from the patch, if I check and change
>> all <work_mem_vars> at once.
>> I think, it will take about 1 week to fix and test all changes. I will
>> estimate the total volume of the changes and think, how to group them
>> in the patch ( I hope, it will be only one patch)
> 
> There's a few places that do this:
> 
> Size maxBlockSize = ALLOCSET_DEFAULT_MAXSIZE;
> 
> /* choose the maxBlockSize to be no larger than 1/16 of work_mem */
> while (16 * maxBlockSize > work_mem * 1024L)
> 
> I think since maxBlockSize is a Size variable, that the above should
> probably be:
> 
> while (16 * maxBlockSize > (Size) work_mem * 1024)
> 
> Maybe there can be a precursor patch to fix all those to get rid of
> the 'L' and cast to the type we're comparing to or assigning to rather
> than trying to keep the result of the multiplication as a long.
Yes. It is what I mean, when I wrote about the context - in this case
variable is used in "Size" context and the cast to Size type should be
used. It is why I need to check all places in code. I am going to do it
during this week.

-- 
Best regards,

Vladlen Popolitov.



Re: Increase of maintenance_work_mem limit in 64-bit Windows

От
Vladlen Popolitov
Дата:
v.popolitov@postgrespro.ru писал(а) 2024-10-01 00:30:
> David Rowley писал(а) 2024-09-24 01:07:
>> On Tue, 24 Sept 2024 at 02:47, Vladlen Popolitov
>> <v.popolitov@postgrespro.ru> wrote:
>>> I agree, it is better to fix all them together. I also do not like 
>>> this
>>> hack, it will be removed from the patch, if I check and change
>>> all <work_mem_vars> at once.
>>> I think, it will take about 1 week to fix and test all changes. I 
>>> will
>>> estimate the total volume of the changes and think, how to group them
>>> in the patch ( I hope, it will be only one patch)
>> 
>> There's a few places that do this:
>> 
>> Size maxBlockSize = ALLOCSET_DEFAULT_MAXSIZE;
>> 
>> /* choose the maxBlockSize to be no larger than 1/16 of work_mem */
>> while (16 * maxBlockSize > work_mem * 1024L)
>> 
>> I think since maxBlockSize is a Size variable, that the above should
>> probably be:
>> 
>> while (16 * maxBlockSize > (Size) work_mem * 1024)
>> 
>> Maybe there can be a precursor patch to fix all those to get rid of
>> the 'L' and cast to the type we're comparing to or assigning to rather
>> than trying to keep the result of the multiplication as a long.
> 
> Hi
> 
> I rechecked all <work_mem_vars>, that depend on MAX_KILOBYTES limit and 
> fixed
> all casts that are affected by 4-bytes long type in Windows 64-bit. Now
> next variables are limited by 2TB in all 64-bit systems:
> maintenance_work_mem
> work_mem
> logical_decoding_work_mem
> max_stack_depth
> autovacuum_work_mem
> gin_pending_list_limit
> wal_skip_threshold
> Also wal_keep_size_mb, min_wal_size_mb, max_wal_size_mb,
> max_slot_wal_keep_size_mb are not affected by "long" cast.

Hi everyone.

The patch added to Commitfest:
https://commitfest.postgresql.org/50/5343/
-- 
Best regards,

Vladlen Popolitov.



Re: Increase of maintenance_work_mem limit in 64-bit Windows

От
Tom Lane
Дата:
v.popolitov@postgrespro.ru writes:
> [ v2-0001-work_mem_vars-limit-increased-in-64bit-Windows.patch ]

I took a brief look at this.  I think it's generally going in the
right direction, but you seem to be all over the place on how
you are doing the casts:

+    if (metadata->nPendingPages * GIN_PAGE_FREESIZE > cleanupSize * (Size)1024L)

+    scanEntry->matchBitmap = tbm_create(work_mem * INT64CONST(1024), NULL);

+    dead_items_info->max_bytes = vac_work_mem * (size_t)1024;

Is there a reason not to do them all the same way?  I'd suggest
"(Size) 1024", losing the "L" which has no purpose any more.
(And note project style is with a space after the cast.)
I don't like use of INT64CONST here because that forces an
unnecessarily expensive calculation in 32-bit machines.

I wonder if it'd be worth creating a macro rather than repeating
"* (Size) 1024" everywhere.  KILOBYTES_TO_BYTES(work_mem) seems
too wordy, but maybe we can think of a shorter name.

-    long        sort_threshold;
+    uint64        sort_threshold;

(1) Are you sure the related code doesn't need this to be signed?
(2) Even if it can be unsigned, why not Size or size_t?

-tbm_create(long maxbytes, dsa_area *dsa)
+tbm_create(double maxbytes, dsa_area *dsa)

Why "double"??

Also, do we need to widen the result of tbm_calculate_entries?
I see the clamp to INT_MAX-1, but should we try to get rid of
that?  (Or if not, should we reduce its result to "int"?)

-    long        sort_mem_bytes = sort_mem * 1024L;
+    int64        sort_mem_bytes = sort_mem * INT64CONST(1024);

Wrong type surely, and even more so here:

-    long        work_mem_bytes = work_mem * 1024L;
+    double        work_mem_bytes = work_mem * INT64CONST(1024);

If there's actually a reason for this scattershot approach to
new data types, you need to explain what the plan is.  I'd
have expected a push to replace "long" with "Size", or maybe
use size_t (or ssize_t when we need a signed type).
 
 /* upper limit for GUC variables measured in kilobytes of memory */
 /* note that various places assume the byte size fits in a "long" variable */
-#if SIZEOF_SIZE_T > 4 && SIZEOF_LONG > 4
+#if SIZEOF_SIZE_T > 4
 #define MAX_KILOBYTES    INT_MAX
 #else
 #define MAX_KILOBYTES    (INT_MAX / 1024)

This is pretty much the crux of the whole thing, and you didn't
fix/remove the comment you falsified.

            regards, tom lane



Re: Increase of maintenance_work_mem limit in 64-bit Windows

От
Tom Lane
Дата:
Vladlen Popolitov <v.popolitov@postgrespro.ru> writes:
> Only one exceptions in src/backend/commands/vacuumparallel.c:378
> shared->dead_items_info.max_bytes = vac_work_mem * (size_t)1024;
> max_bytes declared as size_t, I did cast to the same type (I cannot
> guarranty, that size_t and Size are equivalent in all systems)

I direct your attention to c.h:

/*
 * Size
 *        Size of any memory resident object, as returned by sizeof.
 */
typedef size_t Size;

It used to be possible for them to be different types, but not
anymore.  But I take your point that using "size_t" instead
makes sense if that's how the target variable is declared.
(And I feel no need to replace size_t with Size or vice versa
in code that's already correct.)

>> I wonder if it'd be worth creating a macro rather than repeating
>> "* (Size) 1024" everywhere.  KILOBYTES_TO_BYTES(work_mem) seems
>> too wordy, but maybe we can think of a shorter name.

I agree that this wasn't such a great idea, mainly because we did
not end up using exactly "Size" everywhere.

> 1) sort_mem_bytes is used as int64 variable later, passed to
> function with int64 parameter. "Size" type is not used in
> this case.
> 2) work_mem_bytes is compared only with double values later, it
> does not need additional casts in this case. "Size" type is not
> used in this case.

Agreed on these.

> Agree. It was "double" due to usage of this variable as parameter
> in tbm_calculate_entries(double maxbytes), but really
> tbm_calculate_entries() does not need this type, it again
> converted to "long" local variable. I changed parameter,
> local variable and return value of tbm_calculate_entries()
> to Size.

Agreed on the parameter being Size, but I made the return type
be int to emphasize that we're restricting the result to
integer range.  (It looks like additional work would be needed
to let it exceed INT_MAX; if anyone ever feels like doing that
work, they can change the result type again.)

> Also tbm_calculate_entries() is used in assignment to "long"
> variable maxentries.

I concluded there that we should make maxentries "double",
on the same reasoning as you have above: it's only used in
calculations with other double variables.

Pushed with some minor additional cleanups.  Many thanks for
working on this!  It was a TODO for ages.

            regards, tom lane