[MASSMAIL]enhance the efficiency of migrating particularly large tables

From: David Zhang
Hi Postgres hackers,

I'm reaching out to gather some comments on enhancing the efficiency of 
migrating particularly large tables with significant data volumes in 
PostgreSQL.

When migrating a particularly large table with a significant amount of
data, users sometimes split the table into multiple segments and use
multiple sessions to process data from different segments in parallel,
aiming to enhance efficiency. Segmenting a large table is challenging
if it lacks fields suitable for segmentation or if the data
distribution is uneven. Since the data volume in each block should be
relatively balanced when vacuum is enabled, the ctid can be used to
segment a large table. I am thinking the entire process can be
outlined as follows:
1) determine the minimum and maximum ctid;
2) calculate the number of data blocks from the maximum and minimum ctid;
3) generate multiple SQL queries, such as SELECT * FROM tbl WHERE ctid
 >= '(xx,1)' AND ctid < '(xxx,1)' (sketched below).
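
As an illustration of step 3, here's a minimal SQL sketch that
generates the per-segment queries (the table name t, the 10,000-block
total, and the 1,000-block segment size are assumptions for
illustration):

SELECT format(
         'SELECT * FROM t WHERE ctid >= ''(%s,1)'' AND ctid < ''(%s,1)'';',
         blk, blk + 1000) AS segment_query
FROM generate_series(0, 9000, 1000) AS blk;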

However, when executing SELECT min(ctid) and max(ctid), it performs a 
Seq Scan, which can be slow for a large table. Is there a way to 
retrieve the minimum and maximum ctid other than using the system 
functions min() and max()?
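
For illustration, the plan makes the full scan visible (t is a
placeholder table name):

EXPLAIN SELECT min(ctid), max(ctid) FROM t;
-- expected: an Aggregate node on top of a Seq Scan of the whole table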

Since ctids are stored in block order, retrieving the minimum ctid
should, in theory, start searching from the first block and stop as
soon as the first visible tuple is found. Similarly, retrieving the
maximum ctid should search in reverse order from the last block and
stop upon the first occurrence. Here's a code snippet:

         /*
          * Scan the relation for the minimum or maximum ctid.  Heap
          * tuples are stored in block order, so the first visible tuple
          * returned in the chosen direction gives the answer.
          */
         if (find_max_ctid)
             dir = BackwardScanDirection;   /* start from the last block */
         else
             dir = ForwardScanDirection;    /* start from the first block */

         while ((tuple = heap_getnext(scan, dir)) != NULL)
         ...

Attached is a simple POC written by referring to the pgstattuple
extension. Any feedback, suggestions, or alternative solutions from
the community would be greatly appreciated.

Thank you,

David


Attachments

Re: enhance the efficiency of migrating particularly large tables

From: David Rowley
On Tue, 9 Apr 2024 at 09:52, David Zhang <david.zhang@highgo.ca> wrote:
> However, when executing SELECT min(ctid) and max(ctid), it performs a
> Seq Scan, which can be slow for a large table. Is there a way to
> retrieve the minimum and maximum ctid other than using the system
> functions min() and max()?

Finding the exact ctid seems overkill for what you need.  You could
just find the maximum block with:

N = pg_relation_size('name_of_your_table'::regclass) /
current_setting('block_size')::int;

and use WHERE ctid < '(N,1)';
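
For example, in psql this could look like the following (the table
name t is an assumption; \gset stores the computed block count in a
psql variable):

SELECT pg_relation_size('t'::regclass)
     / current_setting('block_size')::int AS n \gset
SELECT * FROM t WHERE ctid < format('(%s,1)', :n)::tid;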

If we wanted to optimise this in PostgreSQL, the way to do it would
be, around set_plain_rel_pathlist(), to check whether the relation's
ctid is a required PathKey by the same means as create_index_paths()
does; if so, create another seqscan path without synchronize_seqscans *
and tag it with the ctid PathKey, setting the scan direction
according to the PathKey direction. nulls_first does not matter since
ctid cannot be NULL.

A min(ctid) query should be able to make use of this, as the planner
should rewrite such aggregates into subqueries with an ORDER BY ctid
LIMIT 1, as sketched below.
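
Conceptually (t is a placeholder table name), that rewrite turns

SELECT min(ctid) FROM t;

into the ordered form that can stop at the first tuple:

SELECT ctid FROM t ORDER BY ctid ASC LIMIT 1;

with max(ctid) becoming the same query with ORDER BY ctid DESC, i.e. a
backward scan.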

* We'd need to invent an actual Path type for SeqScanPath as I see
create_seqscan_path() just uses the base struct Path.
synchronize_seqscans would have to become a property of that new Path
type and it would need to be carried forward into the plan and looked
at in the executor so that we always start a scan at the first or last
block.

Unsure if such a feature is worthwhile. I think maybe not for just
min(ctid)/max(ctid). However, there could be other reasons, such as
the OR-to-UNION transform stuff that Tom worked on a few years ago.
That needed to eliminate duplicate rows that matched both OR branches,
and that was done using ctid; roughly as illustrated below.
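
For illustration (hypothetical table t with columns a and b), the
transform rewrites

SELECT * FROM t WHERE a = 1 OR b = 2;

into something with roughly the shape of

SELECT * FROM t WHERE a = 1
UNION
SELECT * FROM t WHERE b = 2;

except that internally the duplicate elimination is done on ctid
rather than on the whole row, which is why a row identifier concept is
needed.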

David



Re: enhance the efficiency of migrating particularly large tables

From: Tom Lane
David Rowley <dgrowleyml@gmail.com> writes:
> Unsure if such a feature is worthwhile. I think maybe not for just
> min(ctid)/max(ctid). However, there could be other reasons, such as
> the transform OR to UNION stuff that Tom worked on a few years ago.
> That needed to eliminate duplicate rows that matched both OR branches
> and that was done using ctid.

I'm kind of allergic to adding features that fundamentally depend on
ctid, seeing that there's so much activity around non-heap table AMs
that may not have any such concept, or may have a row ID that looks
totally different.  (That's one reason why I stopped working on that
OR-to-UNION patch.)

            regards, tom lane



Re: enhance the efficiency of migrating particularly large tables

From: David Rowley
On Tue, 9 Apr 2024 at 11:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > Unsure if such a feature is worthwhile. I think maybe not for just
> > min(ctid)/max(ctid). However, there could be other reasons, such as
> > the transform OR to UNION stuff that Tom worked on a few years ago.
> > That needed to eliminate duplicate rows that matched both OR branches
> > and that was done using ctid.
>
> I'm kind of allergic to adding features that fundamentally depend on
> ctid, seeing that there's so much activity around non-heap table AMs
> that may not have any such concept, or may have a row ID that looks
> totally different.  (That's one reason why I stopped working on that
> OR-to-UNION patch.)

I understand that point of view; however, if we were to maintain that
as a policy, I think we'd likely miss out on various optimisations
that future AMs could provide.

When I pushed TID Range Scans a few years ago, I added "amflags", and
we have AMFLAG_HAS_TID_RANGE so the planner can check that the AM
supports TID range scans before adding the Path; see the example below.
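
For example (t is a placeholder table name), a ctid range qual can
already use that path on heap tables:

EXPLAIN SELECT * FROM t WHERE ctid >= '(0,1)' AND ctid < '(1000,1)';
-- on a heap table this can produce a Tid Range Scan instead of a Seq Scan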

Anyway, I'm not saying let's do the non-sync scan SeqScanPath thing;
I'm just saying that blocking optimisations because some future AM
might not support them could mean missing out on some great speedups.

David



Re: enhance the efficiency of migrating particularly large tables

From: David Zhang

Thanks a lot, David Rowley, for your detailed suggestions.

On 2024-04-08 3:23 p.m., David Rowley wrote:
> Finding the exact ctid seems overkill for what you need.  You could
> just find the maximum block with:
>
> N = pg_relation_size('name_of_your_table'::regclass) /
> current_setting('block_size')::int;
>
> and use WHERE ctid < '(N,1)';
We experimented with this approach using pg_relation_size and compared
the performance. Below are some simple timing results for a table with
100 million records:

Using the system function max():
SELECT max(ctid) from t;
Time: 2126.680 ms (00:02.127)

Using pg_relation_size and a WHERE condition:
SELECT pg_relation_size('t'::regclass) / current_setting('block_size')::int;
Time: 0.561 ms

Using the experimental function introduced in the previous patch:
SELECT ctid from get_ctid('t', 1);
Time: 0.452 ms


After deleting about 1/3 of the records from the end of the table:
SELECT max(ctid) from t;
Time: 1552.975 ms (00:01.553)

SELECT pg_relation_size('t'::regclass) / current_setting('block_size')::int;
Time: 0.533 ms
However, before vacuum, pg_relation_size still returns the same value
as before the delete, so the block count derived from it may not be
accurate.

SELECT ctid from get_ctid('t', 1);
Time: 251.105 ms

After vacuum:
SELECT ctid from get_ctid('t', 1);
Time: 0.478 ms


Below is a comparison between the system function min() and the experimental function:

SELECT min(ctid) from t;
Time: 1932.554 ms (00:01.933)

SELECT ctid from get_ctid('t', 0);
Time: 0.478 ms

After deleting about 1/3 of the records from the beginning of the table:
SELECT min(ctid) from t;
Time: 1305.799 ms (00:01.306)

SELECT ctid from get_ctid('t', 0);
Time: 244.336 ms

After vacuum:
SELECT ctid from get_ctid('t', 0);
Time: 0.468 ms

> If we wanted to optimise this in PostgreSQL, the way to do it would
> be, around set_plain_rel_pathlist(), to check whether the relation's
> ctid is a required PathKey by the same means as create_index_paths()
> does; if so, create another seqscan path without synchronize_seqscans
> and tag it with the ctid PathKey, setting the scan direction
> according to the PathKey direction. nulls_first does not matter since
> ctid cannot be NULL.
>
> A min(ctid) query should be able to make use of this, as the planner
> should rewrite such aggregates into subqueries with an ORDER BY ctid
> LIMIT 1.

Is there a simple way to get the minimum ctid faster than using min(), similar to approximating the maximum ctid with pg_relation_size?


Thank you,

David Zhang

Re: enhance the efficiency of migrating particularly large tables

From: David Rowley
On Fri, 3 May 2024 at 09:33, David Zhang <david.zhang@highgo.ca> wrote:
> Is there a simple way to get the minimum ctid faster than using min(),
> similar to approximating the maximum ctid with pg_relation_size?

The equivalent approximation is always '(0,1)'.
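
Combining the two approximations from this thread into one sketch (t
is a placeholder table name):

SELECT '(0,1)'::tid AS min_bound,
       format('(%s,1)',
              pg_relation_size('t'::regclass)
              / current_setting('block_size')::int)::tid AS max_bound;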

David