Re: Detoasting optionally to make Explain-Analyze less misleading

Поиск
Список
Период
Сортировка
От Matthias van de Meent
Тема Re: Detoasting optionally to make Explain-Analyze less misleading
Дата
Msg-id CAEze2Wj=w+qDtPqM7qt-ZRXbG7vmvONa-ocfQZ1inGuE58zrRg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Detoasting optionally to make Explain-Analyze less misleading  (stepan rutz <stepan.rutz@gmx.de>)
Ответы Re: Detoasting optionally to make Explain-Analyze less misleading  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Mon, 26 Feb 2024 at 21:54, stepan rutz <stepan.rutz@gmx.de> wrote:
>
> Hi Matthias, thanks for picking it up. I still believe this is valuable
> to a lot of people out there. Thanks for dealing with my proposal.
> Matthias, Tom, Tomas everyone.
>
> Two (more or less) controversial remarks from side.
>
> 1. Actually serialization should be the default for "analyze" in
> explain, as current analyze doesn't detoast and thus distorts the result
> in extreme (but common) cases easily by many order of magnitude (see my
> original video on that one). So current "explain analyze" only works for
> some queries and since detoasting is really transparent, it is quite
> something to leave detoasting out of explain analyze. This surprises
> people all the time, since explain analyze suggests it "runs" the query
> more realistically.

I'm not sure about this, but it could easily be a mid-beta decision
(if this is introduced before the feature freeze of 17, whenever that
is).

> 2. The bandwidth I computed in one of the previous versions of the patch
> was certainly cluttering up the explain output and it is misleading yes,
> but then again it performs a calculation people will now do in their
> head. The "bandwidth" here is how much data your query gets out of
> backend by means of the query and the deserialization. So of course if
> you do id-lookups you get a single row and such querries do have a lower
> data-retrieval bandwidth compared to bulk querries.

I think that's a job for post-processing the EXPLAIN output by the
user. If we don't give users the raw data, they won't be able to do
their own cross-referenced processing: "5MB/sec" doesn't tell you how
much time or data was actually spent.

> However having some
> measure of how fast data is delivered from the backend especially on
> larger joins is still a good indicator of one aspect of a query.

I'm not sure about that. Network speed is a big limiting factor that
we can't measure here, and the size on disk is often going to be
smaller than the data size when transfered across the network.

> Sorry for the remarks. Both are not really important, just restating my
> points here. I understand the objections and reasons that speak against
> both points and believe the current scope is just right.

No problem. Remarks from users (when built on solid arguments) are
always welcome, even if we may not always agree on the specifics.

------>8------

Attached is v9, which is rebased on master to handle 24eebc65's
changed signature of pq_sendcountedtext.
It now also includes autocompletion, and a second patch which adds
documentation to give users insights into this new addition to
EXPLAIN.


Kind regards,

Matthias van de Meent

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: collect_corrupt_items_vacuum.patch
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Introduce XID age and inactive timeout based replication slot invalidation