Re: Avoiding bad prepared-statement plans.

Поиск
Список
Период
Сортировка
От Mark Mielke
Тема Re: Avoiding bad prepared-statement plans.
Дата
Msg-id 4B882E5D.9080209@mark.mielke.cc
обсуждение исходный текст
Ответ на Re: Avoiding bad prepared-statement plans.  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Avoiding bad prepared-statement plans.  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On 02/26/2010 02:57 PM, Tom Lane wrote:
> Mark Mielke<mark@mark.mielke.cc>  writes:
>    
>> There must be some way to lift the cost of planning out of the plan
>> enumeration and selection phase, such that only plan enumeration and
>> selection is run at execute time. In most cases, plan enumeration and
>> selection, provided that all data required to make these decisions is
>> all cached in data structures ready to go, should be very fast? Right?
>>      
> Huh?  What exactly do you think the cost of planning is, if not
> enumeration and selection?  There isn't very much that's cacheable,
> at least not in any sanely-sized cache.
>    

I think most operations, including this one, can be broken into a fixed 
portion and a dynamic portion. The PREPARE should concern itself only 
with the fixed portion, and should leave the dynamic portion to EXECUTE. 
At present, the "planning process" is one big blob.

Here are parts that can be done "fixed":

1) Statement parsing and error checking.
2) Identification of tables and columns involved in the query.
3) Query the column statistics for involved columns, to be used in plan 
cost estimation now and later.
4) Determine plan constraints under which elements of the plan must be 
executed a certain way (something like constant folding for a compiler), 
or for which parameter substitution would not impact the outcome.
5) Identify the elements of the plan that still require plan enumeration 
and plan selection, to be used in a later part of the pipeline.

At a minimum, I am suggesting that 1), 2), and 3) should take a chunk 
out of the planning process. I think 4) and 5) are more complex but 
still valuable in terms of extracting the fixed portion out of the 
planning process.

I think an assumption is being made that the planning process is an 
atomic unit that cannot be turned into a pipeline or assembly line. I 
think this assumption was what originally tied PREPARE = PLAN, and 
EXECUTE = RUN. I think this assumption is leading to the conclusion that 
EXECUTE should re-plan. I also expect that this assumption is tightly 
woven into the current implementation and changing it would require some 
amount of re-architecture. :-)

>> By "not worth it", do you mean development effort or run time?
>>      
> Run time.  The development cost of what you are proposing is negligible:
> just rip out the plan cache altogether.  I don't believe it would be a
> performance win though.
>    

That's not my proposal, though. I'm suspecting you didn't read it. :-)

I'm fine with you saying "too hard and not worth my development effort" 
after you read it. I agree it would be a lot of work.

But if the conclusion is that the current architecture is the best that 
can be had, and the decision is only about when to do a custom re-plan 
or when to use the generic plan, I am putting my opinion out there that 
the generic plan has always been a compromise, and it will always be a 
compromise, and that this discussion exists primarily because the 
compromise is not adequate in many real world scenarios.

And that all said, I think I am challenging the status quo and ticking 
people off. So while my intent is to challenge the status quo, it is not 
to tick people off. So, please let me know if you would like me to 
continue, or if you have already written this off. :-)

Cheers,
mark



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Aidan Van Dyk
Дата:
Сообщение: Re: Re: Hot Standby query cancellation and Streaming Replication integration
Следующее
От: Yeb Havinga
Дата:
Сообщение: C libpq frontend library fetchsize