Re: compute_query_id and pg_stat_statements

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: compute_query_id and pg_stat_statements
Дата
Msg-id 20210512.183724.859740020819002744.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на compute_query_id and pg_stat_statements  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Список pgsql-hackers
At Wed, 12 May 2021 14:05:16 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in 
> On Wed, May 12, 2021 at 02:33:35PM +0900, Kyotaro Horiguchi wrote:
> > At Wed, 12 May 2021 10:42:01 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in 
> > > 
> > > I don't think that this approach would cope well for people who want a queryid
> > > without pg_stat_statements or such.  Since the queryid can now be found in
> > > pg_stat_activity, EXPLAIN output or the logs I think it's entirely reasonable
> > > to allow users to benefit from that even if they don't install additional
> > > module.
> > 
> > Ah, I missed that case.  And we are wanting to use pg_stat_statements
> > with (almost) zero-config?  How about the following behavior?
> > 
> > Setting query_id_provider to 'none' means we don't calculate query-id
> > by default. However, if queryIdWante() is called, the default provider
> > is set up and starts calculating query id.
> 
> Having "none" meant "not unless someone asks for it" looks like a POLA
> violation.

Sorry for confusion. A different behavior for "none" is proposed later
in the mail. It is just an intermediate discussion.

> > Setting query_id_provider to something else means the user wants
> > query-id calcualted using the provider. Setting 'default' is
> > equivalent to setting compute_query_id to 'on'.
> > 
> > There might be a case where a user sets query_id_provider to
> > non-'none' but don't want have query-id calculated, but it can be said
> > a kind of mis-configuration?
> 
> So if I'm understanding correctly, you're arguing for an approach different to
> what Michael stated as the general consensus in [1].  I'm not saying that I
> think it's a bad idea (and I actually suggested it before), but we have to
> chose an approach and stick with it.

I'm not sure how much room for change of the direction is left. So it
was just a proposal. So if the majority still thinks that it is the
way to stick to controling on/off/(auto) the in-core implement and
separately allow another module to be hooked, I don't further object
to that decision.

> > > I think this would be a mistake to do that, as it would mean that we don't
> > > officially support alternative queryid provider.
> > 
> > Ok, if we want to support alternative providers from the first, we
> > need to actually write the loader code for query-id providers.  It
> > would not be so hard?, but it might not be suitable to this stage so I
> > proposed that to get rid of needing such complexity for now.
> 
> I did write a POC extension [2] to demonstrate that moving pg_stat_statement's
> queryid calculation in core doesn't mean that we're imposing it to everyone.
> And yes this is critical and a must have in the initial implementation.

Ok, understood.

> > (Anyway I prefer to load query-id provider as a dynamically loadable
> >  module rather than hook-function.)
> 
> I agree that having a specific API (I'm fine with a hook or a dynamically
> loaded function) for that would be better, but it doesn't appear to be the
> opinion of the majority.

Ugg. Ok.

> > > The GUC itself may not change, but third-party queryid provider would probably
> > > need changes as the new entry point will be dedicated to compute a queryid
> > > only, while third-party plugins may do more than that in their
> > > post_parse_analyze_hook.  And also users will have to change their
> > 
> > I don't think it is not that a problem.
> 
> Did you mean "I don't think that it's a problem"?  Otherwise I don't get it.

Yes, you're right.  Sorry for the typo.

> > Even if any third-party
> > extension is having query-id generator by itself, in most cases it
> > would be a copy of JumbleQuery in case of pg_stat_statement is not
> > loaded and now it is moved in-core as 'default' provider. What the
> > exntension needs to be done is just ripping out the copied generator
> > code.  I guess...
> 
> I don't fully understand, but it seems that you're arguing that the only use
> case is to have something similar to pg_stat_statements (say e.g.
> pg_store_plans), that always have the same queryid implementation as
> pg_stat_statements.  That's not the case, as there already are "clones" of
> pg_stat_statements, and the main difference is an alternative queryid
> implementation.  So in that case what author would do is to drop everything
> *except* the queryid implementation.
> 
> And if I'm not mistaken, pg_store_plans also wants a different queryid
> implementation, but has to handle a secondary queryid on top of it
> (https://github.com/ossc-db/pg_store_plans/blob/master/pg_store_plans.c#L843-L855).

Yeah, the extension intended to be used joining with the
pg_stat_statements view. And the reason for the second query-id dates
back to the era when query id was not available in the
pg_stat_statements view. Now it is mere a fall-back query id when
pg_stat_statments is not active.  Now that the in-core query-id is
available, I think there's no reason to keep that implement.

> So here again what the extension want is to get rid of pg_stat_statements (and
> now core) queryid implementation.

So the extension might be a good reason for the discussion^^;

> > > configuration to use that new interface, and additionally the module may now
> > > have to be removed from shared_preload_libraries.  Overall, it doesn't seem to
> > > me that it would make users' life easier.
> > 
> > Why the third-party module need to be removed from
> > shared_preload_libraries?  The module can stay as a preloaded shared
> > library but just no longer need to have its own query-id provider
> > since it is provided in-core.  If the extension required a specific
> > provider, the developer need to make it a loadable module and users
> > need to specify the provider module explicitly.
> 
> It's the same misunderstanding here.  Basically people want to benefit from the
> whole ecosystem based on a queryid (pg_stat_statements, now
> pg_stat_activity.query_id and such) but with another definition of what a
> queryid is.  So those people will now only need to implement something like
> [2], rather than forking every single extension they want to use.

Hmm. I'm not sure the [2] gives sufficient reason for leaving the
current interface.  But will follow if it is sitll the consensus. (And
it seems like true.)

> [1]: https://www.postgresql.org/message-id/YJoeXcrwe1EDmqKT@paquier.xyz
> [2]: https://github.com/rjuju/pg_queryid

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Smith
Дата:
Сообщение: Re: GetSubscriptionRelations declares too many scan keys
Следующее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: compute_query_id and pg_stat_statements