Re: cost_sort() improvements

From Teodor Sigaev
Subject Re: cost_sort() improvements
Date
Msg-id ce8eff53-52f2-e7e6-0059-8527c3f2892d@sigaev.ru
In reply to Re: cost_sort() improvements  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
> OK, so Fi is pretty much whatever CREATE FUNCTION ... COST says, right?
exactly

> Hmm, makes sense. But doesn't that mean it's mostly a fixed per-tuple
> cost, not directly related to the comparison? For example, why should it
> be multiplied by C0? That is, if I create a very expensive comparator
> (say, with cost 100), why should it increase the cost for transferring
> the tuple to CPU cache, unpacking it, etc.?
> 
> I'd say those costs are rather independent of the function cost, and
> remain rather fixed, no matter what the function cost is.
> 
> Perhaps you haven't noticed that, because the default funcCost is 1?
Maybe, but see my email
https://www.postgresql.org/message-id/ee14392b-d753-10ce-f5ed-7b2f7e277512%40sigaev.ru
about an additional term proportional to N.

> The number of new magic constants introduced by this patch is somewhat
> annoying. 2.0, 1.5, 0.125, ... :-(
2.0 was removed in the last patch; 1.5 is left and could be removed once I
understand your letter about group size estimation :)
0.125 should be checked, and I suppose we can't remove it at all because it is
an "average over whole word" constant.

> 
>>   - Final cost is cpu_operator_cost * N * sum(per column costs described above).
>>     Note, for single column with width <= sizeof(datum) and F1 = 1 this formula
>>     gives exactly the same result as current one.
>>   - for Top-N sort empiric is close to old one: use 2.0 multiplier as constant
>>     under log2, and use log2(Min(NGi, output_tuples)) for second and following
>>     columns.
>>
> 
> I think compute_cpu_sort_cost is somewhat confused whether
> per_tuple_cost is directly a cost, or a coefficient that will be
> multiplied with cpu_operator_cost to get the actual cost.
> 
> At the beginning it does this:
> 
>      per_tuple_cost = comparison_cost;
> 
> so it inherits the value passed to cost_sort(), which is supposed to be
> cost. But then it does the work, which includes things like this:
> 
>      per_tuple_cost += 2.0 * funcCost * LOG2(tuples);
> 
> where funcCost is pretty much pg_proc.procost. AFAIK that's meant to be
> a value in units of cpu_operator_cost. And at the end it does this
> 
>      per_tuple_cost *= cpu_operator_cost;
> 
> I.e. it gets multiplied with another cost. That doesn't seem right.

Huh, you are right, will fix in v8.
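
To make the unit mismatch concrete, here is a minimal sketch, not the patch
code and not necessarily what v8 will do. The function names, the ncompare
parameter (standing in for LOG2(tuples)) and the local cpu_operator_cost
constant (set to the PostgreSQL default of 0.0025) are assumptions for
illustration only. The first function mirrors the three quoted statements; the
second keeps the procost-based coefficient separate and converts it to a cost
before adding comparison_cost.

```c
#include <assert.h>

/* Hypothetical stand-in for the planner GUC (PostgreSQL's default value). */
static const double cpu_operator_cost = 0.0025;

/*
 * Mirrors the quoted statements: comparison_cost (already a cost) is mixed
 * with a coefficient in cpu_operator_cost units, then everything is scaled.
 */
static double
per_tuple_cost_mixed(double comparison_cost, double func_cost, double ncompare)
{
    double per_tuple_cost = comparison_cost;        /* cost units */

    assert(ncompare >= 0.0);
    per_tuple_cost += 2.0 * func_cost * ncompare;   /* operator-cost multiples */
    per_tuple_cost *= cpu_operator_cost;            /* also scales comparison_cost */
    return per_tuple_cost;
}

/*
 * One possible separation: accumulate the coefficient on its own and
 * convert it to a cost exactly once.
 */
static double
per_tuple_cost_separated(double comparison_cost, double func_cost, double ncompare)
{
    double coeff;

    assert(ncompare >= 0.0);
    coeff = 2.0 * func_cost * ncompare;             /* operator-cost multiples */
    return comparison_cost + coeff * cpu_operator_cost;
}
```

With comparison_cost = 0 both variants agree, which may be why the problem is
easy to miss; they diverge only when a caller passes a non-zero
comparison_cost.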


> Also, why do we need this?
> 
>      if (sortop != InvalidOid)
>      {
>          Oid funcOid = get_opcode(sortop);
> 
>          funcCost = get_func_cost(funcOid);
>      }
Safety first :). Will remove.
-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

