Re: Bunch o' dead code in GEQO

Поиск

Список

Период

Сортировка

От	Tom Lane
Тема	Re: Bunch o' dead code in GEQO
Дата	22 января 2004 г. 12:52:29
Msg-id	12735.1074790315@sss.pgh.pa.us обсуждение исходный текст
Ответ на	Re: Bunch o' dead code in GEQO ("scott.marlowe" <scott.marlowe@ihs.com>)
Список	pgsql-hackers

Дерево обсуждения

"scott.marlowe" <scott.marlowe@ihs.com> writes:
> On Thu, 22 Jan 2004, Tom Lane wrote:
>> I'm assuming that the original author of the GEQO code already did that
>> testing ...

> Hmmm.  I was figuring he wasn't sure so he left them in for other people 
> to test.  Is this a part of the code that eats up much time, or something 
> simple and fast that isn't part of the "GEQO takes 8 seconds to plan" 
> problem?

Well, the basic plan of the GEQO code is

Step 1: generate a bunch of possible join paths at random.

Step 2: randomly select a pair of paths from the current population,
generate a new path that is some combination of these, and push it back
into the population, dropping the worst path from the population.
Repeat for a bunch of generations.

Step 3: take the best path in the final population.

The different recombination algorithms simply are different ways of
generating a "child" path given two "parent" paths in step 2.  Changing
them wouldn't affect the runtime noticeably at all --- the primary cost
is in evaluating each generated path, which is why the runtime is 
approximately the sum of the population size (step 1) and the number
of generations (step 2).  Possibly a different recombiner would give a
better chance of finding a good plan, but I'm unconvinced.  Arguably the
recombiners that are there are all wrong anyway, since they were all
invented to solve Traveling Salesman problems, which this is not quite.

The only way we can do much about the runtime is to reduce the default
population size.  With the current default parameters, a large query
will have population size 1024 and 400 generations, so about two-thirds
of the runtime is in generating the initial random population; if we
can't make a dent in that then we're not going to gain much.  The
question is what will this do to the average quality of the selected
plans.  One thing I'm thinking about is trying to improve the quality of
the initial population by immediately discarding any really bad plans
(for instance, use a heuristic that pays attention to which relations
are linked by WHERE clauses).
        regards, tom lane

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Bunch o' dead code in GEQO