Re: PostgreSQL 17 Beta 1 release announcement draft
From | Jonathan S. Katz |
---|---|
Subject | Re: PostgreSQL 17 Beta 1 release announcement draft |
Date | |
Msg-id | 07de231a-bd4f-478b-a4af-501cefda2dae@postgresql.org |
In reply to | Re: PostgreSQL 17 Beta 1 release announcement draft (John Naylor <johncnaylorls@gmail.com>) |
List | pgsql-hackers |
On 5/21/24 6:40 AM, John Naylor wrote:
> On Mon, May 20, 2024 at 8:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> On Mon, May 20, 2024 at 8:47 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:
>>>
>>> On 5/20/24 2:58 AM, John Naylor wrote:
>>>> Hi Jon,
>>>>
>>>> Regarding vacuum "has shown up to a 6x improvement in overall time to
>>>> complete its work" -- I believe I've seen reported numbers close to
>>>> that only 1) when measuring the index phase in isolation or maybe 2)
>>>> the entire vacuum of unlogged tables with one, perfectly-correlated
>>>> index (testing has less variance with WAL out of the picture). I
>>>> believe tables with many indexes would show a lot of improvement, but
>>>> I'm not aware of testing that case specifically. Can you clarify where
>>>> 6x came from?
>>>
>>> Sawada-san showed me the original context, but I can't rapidly find it
>>> in the thread. Sawada-san, can you please share the numbers behind this?
>>>
>>
>> I referenced the numbers that I measured during the development[1]
>> (test scripts are here[2]). IIRC I used unlogged tables and indexes,
>> and these numbers were the entire vacuum execution time including heap
>> scanning, index vacuuming and heap vacuuming.
>
> Thanks for confirming.
>
> The wording "has a new internal data structure that reduces memory
> usage and has shown up to a 6x improvement in overall time to complete
> its work" is specific for runtime, and the memory use is less
> specific. Unlogged tables are not the norm, so I'd be cautious of
> reporting numbers specifically designed (for testing) to isolate the
> thing that changed.
>
> I'm wondering if it might be both more impressive-sounding and more
> realistic for the average user experience to reverse that: specific on
> memory, and less specific on speed. The best-case memory reduction
> occurs for table update patterns that are highly localized, such as
> the most recently inserted records, and I'd say those are a lot more
> common than the use of unlogged tables.
>
> Maybe something like "has a new internal data structure that reduces
> overall time to complete its work and can use up to 20x less memory."
>
> Now, it is true that when dead tuples are sparse and evenly spaced
> (e.g. 1 every 100 pages), vacuum can now actually use more memory than
> v16. However, the nature of that scenario also means that the number
> of TIDs just can't get very big to begin with. In contrast, while the
> runtime improvement for normal (logged) tables is likely not
> earth-shattering, I believe it will always be at least somewhat
> faster, and never slower.

Thanks for the feedback. I flipped it around, per your suggestion:

"has a new internal data structure that has shown up to a 20x memory
reduction for vacuum, along with improvements in overall time to
complete its work."

Thanks,

Jonathan