Discussion: enhance the efficiency of migrating particularly large tables
Hi Postgres hackers,

I'm reaching out to gather comments on enhancing the efficiency of migrating particularly large tables with significant data volumes in PostgreSQL. When migrating such a table, users sometimes split it into multiple segments and use multiple sessions to process different segments in parallel, aiming to improve throughput. Segmenting a large table is challenging, however, if the table lacks a field suitable for segmentation or if the data distribution is uneven. Since the data volume per block should be relatively balanced when vacuum runs regularly, the ctid can be used to segment a large table. The entire process can be outlined as follows:

1) Determine the minimum and maximum ctid.
2) Calculate the number of data blocks from the maximum and minimum ctid.
3) Generate multiple SQL queries, such as:

   SELECT * FROM tbl WHERE ctid >= '(xx,1)' AND ctid < '(xxx,1)';

However, SELECT min(ctid) and SELECT max(ctid) each perform a Seq Scan, which can be slow for a large table. Is there a way to retrieve the minimum and maximum ctid other than using the aggregate functions min() and max()?

Since ctids are ordered, retrieving the minimum ctid should, in theory, be able to start scanning from the first block and stop as soon as it finds the first live tuple; similarly, retrieving the maximum ctid should scan backward from the last block and stop upon finding the first live tuple. Here's a code snippet:

    /* scan the relation for the minimum or maximum ctid */
    if (find_max_ctid)
        dir = BackwardScanDirection;
    else
        dir = ForwardScanDirection;

    while ((tuple = heap_getnext(scan, dir)) != NULL)
        ...

Attached is a simple POC modeled on the pgstattuple extension. Any feedback, suggestions, or alternative solutions from the community would be greatly appreciated.
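For illustration only, here is a minimal client-side sketch of step 3 above: generating one query per segment from a block range. The function name is hypothetical, and the min/max block numbers are assumed to come from steps 1 and 2:

```python
# Sketch of step 3: split [min_block, max_block] into roughly equal
# ctid ranges and emit one SELECT per segment.  The helper name is
# hypothetical; real block numbers would come from steps 1 and 2.

def segment_queries(table, min_block, max_block, segments):
    """Return per-segment queries covering blocks min_block..max_block."""
    total = max_block - min_block + 1
    step = max(1, total // segments)
    queries = []
    lo = min_block
    while lo <= max_block:
        hi = min(lo + step, max_block + 1)
        queries.append(
            f"SELECT * FROM {table} "
            f"WHERE ctid >= '({lo},1)' AND ctid < '({hi},1)'"
        )
        lo = hi
    return queries

for q in segment_queries("tbl", 0, 99, 4):
    print(q)
```

Each session would then run one of the generated queries against its own connection.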
Thank you, David
Attachments
On Tue, 9 Apr 2024 at 09:52, David Zhang <david.zhang@highgo.ca> wrote:
> However, when executing SELECT min(ctid) and max(ctid), it performs a
> Seq Scan, which can be slow for a large table. Is there a way to
> retrieve the minimum and maximum ctid other than using the system
> functions min() and max()?

Finding the exact ctid seems overkill for what you need. You could just find the maximum block with:

N = pg_relation_size('name_of_your_table'::regclass) / current_setting('block_size')::int;

and do WHERE ctid < '(N,1)';

If we wanted to optimise this in PostgreSQL, the way to do it would be, around set_plain_rel_pathlist(), to check if the relation's ctid is a required PathKey by the same means as create_index_paths() does; then, if found, create another seqscan path without synchronize_seqscans * and tag that with the ctid PathKey, setting the scan direction according to the PathKey direction. nulls_first does not matter since ctid cannot be NULL. A min(ctid) query should be able to make use of this, as the planner should rewrite those to subqueries with ORDER BY ctid LIMIT 1.

* We'd need to invent an actual Path type for SeqScanPath, as I see create_seqscan_path() just uses the base struct Path. synchronize_seqscans would have to become a property of that new Path type, and it would need to be carried forward into the plan and looked at in the executor so that we always start a scan at the first or last block.

Unsure if such a feature is worthwhile. I think maybe not for just min(ctid)/max(ctid). However, there could be other reasons, such as the transform-OR-to-UNION stuff that Tom worked on a few years ago. That needed to eliminate duplicate rows that matched both OR branches, and that was done using ctid.

David
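For what it's worth, the block-count approximation above is trivial to compute client-side. A minimal sketch (the relation size and block size are sample values; in practice both would come from pg_relation_size() and current_setting('block_size')):

```python
# Mirror pg_relation_size(...) / current_setting('block_size')::int on
# the client.  81920 and 8192 are sample values standing in for the
# results of the two server-side calls.

def max_block(relation_size_bytes, block_size=8192):
    """Number of blocks in the relation; every live ctid is below (N,1)."""
    return relation_size_bytes // block_size

n = max_block(81920)                # a hypothetical 10-block table
predicate = f"ctid < '({n},1)'"     # WHERE clause covering the whole table
print(predicate)
```

Unlike max(ctid), this requires no scan at all, only the relation's size from the catalog.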
David Rowley <dgrowleyml@gmail.com> writes:
> Unsure if such a feature is worthwhile. I think maybe not for just
> min(ctid)/max(ctid). However, there could be other reasons, such as
> the transform OR to UNION stuff that Tom worked on a few years ago.
> That needed to eliminate duplicate rows that matched both OR branches
> and that was done using ctid.

I'm kind of allergic to adding features that fundamentally depend on ctid, seeing that there's so much activity around non-heap table AMs that may not have any such concept, or may have a row ID that looks totally different. (That's one reason why I stopped working on that OR-to-UNION patch.)

regards, tom lane
On Tue, 9 Apr 2024 at 11:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <dgrowleyml@gmail.com> writes:
> > Unsure if such a feature is worthwhile. I think maybe not for just
> > min(ctid)/max(ctid). However, there could be other reasons, such as
> > the transform OR to UNION stuff that Tom worked on a few years ago.
> > That needed to eliminate duplicate rows that matched both OR branches
> > and that was done using ctid.
>
> I'm kind of allergic to adding features that fundamentally depend on
> ctid, seeing that there's so much activity around non-heap table AMs
> that may not have any such concept, or may have a row ID that looks
> totally different. (That's one reason why I stopped working on that
> OR-to-UNION patch.)

I understand that point of view; however, I think if we maintained that as a policy, we'd likely miss out on various optimisations that future AMs could provide. When I pushed TID Range Scans a few years ago, I added "amflags", and we have AMFLAG_HAS_TID_RANGE so the planner can check that the AM supports it before adding the Path.

Anyway, I'm not saying let's do the non-sync-scan SeqScanPath thing; I'm just saying that blocking optimisations because some future AM might not support them could mean missing out on some great speedups.

David
Thanks a lot, David Rowley, for your detailed suggestion.
On 2024-04-08 3:23 p.m., David Rowley wrote:
> Finding the exact ctid seems overkill for what you need. You could
> just find the maximum block with:
>
> N = pg_relation_size('name_of_your_table'::regclass) / current_setting('block_size')::int;
>
> and do WHERE ctid < '(N,1)';

We experimented with this approach using pg_relation_size and compared the performance. Below are some simple timing results for a table with 100 million records:
Using system function max():
SELECT max(ctid) from t;
Time: 2126.680 ms (00:02.127)
Using pg_relation_size and where condition:
SELECT pg_relation_size('t'::regclass) / current_setting('block_size')::int;
Time: 0.561 ms
Using the experimental function introduced in the previous patch:
SELECT ctid from get_ctid('t', 1);
Time: 0.452 ms
After deleting about 1/3 of the records from the end of the table:
SELECT max(ctid) from t;
Time: 1552.975 ms (00:01.553)
SELECT pg_relation_size('t'::regclass) / current_setting('block_size')::int;
Time: 0.533 ms
But before vacuum, pg_relation_size still returns the same value as before, so the block count derived from it may not be accurate.
SELECT ctid from get_ctid('t', 1);
Time: 251.105 ms
After vacuum:
SELECT ctid from get_ctid('t', 1);
Time: 0.478 ms
Below is a comparison between the system function min() and the experimental function:
SELECT min(ctid) from t;
Time: 1932.554 ms (00:01.933)
SELECT ctid from get_ctid('t', 0);
Time: 0.478 ms
After deleting about 1/3 of the records from the beginning of the table:
SELECT min(ctid) from t;
Time: 1305.799 ms (00:01.306)
SELECT ctid from get_ctid('t', 0);
Time: 244.336 ms
After vacuum:
SELECT ctid from get_ctid('t', 0);
Time: 0.468 ms
> If we wanted to optimise this in PostgreSQL, the way to do it would
> be, around set_plain_rel_pathlist(), to check if the relation's ctid
> is a required PathKey by the same means as create_index_paths() does;
> then, if found, create another seqscan path without
> synchronize_seqscans * and tag that with the ctid PathKey, setting
> the scan direction according to the PathKey direction. nulls_first
> does not matter since ctid cannot be NULL. A min(ctid) query should
> be able to make use of this, as the planner should rewrite those to
> subqueries with ORDER BY ctid LIMIT 1.
Is there a simple way to get the min ctid faster than using min(), similar to how the max ctid can be approximated using pg_relation_size?
Thank you,
David Zhang
On Fri, 3 May 2024 at 09:33, David Zhang <david.zhang@highgo.ca> wrote:
> Is there a simple way to get the min ctid faster than using min(),
> similar to how the max ctid can be approximated using
> pg_relation_size?

The equivalent approximation is always '(0,1)'.

David
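Combining the two approximations discussed in this thread, the constant '(0,1)' for the minimum and the relation's block count for the maximum, gives a whole-table ctid range with no scan at all. A minimal sketch (the helper name is hypothetical):

```python
# In the heap AM no tuple can have a ctid below (0,1), so the minimum
# needs no lookup; only the block count N is queried from the server.
MIN_CTID = "(0,1)"

def full_range_predicate(n_blocks):
    """Predicate covering every possible ctid in an n_blocks-long heap."""
    return f"ctid >= '{MIN_CTID}' AND ctid < '({n_blocks},1)'"

print(full_range_predicate(10))
```

This range can then be split into segments for parallel sessions exactly as outlined at the start of the thread.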