Re: query patterns for multipass aggregating

Поиск
Список
Период
Сортировка
От Andrew Gierth
Тема Re: query patterns for multipass aggregating
Дата
Msg-id 87d0rbwp4a.fsf@news-spur.riddles.org.uk
обсуждение исходный текст
Ответ на query patterns for multipass aggregating  (Rob Nikander <rob.nikander@gmail.com>)
Ответы Re: query patterns for multipass aggregating  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Список pgsql-general
>>>>> "Rob" == Rob Nikander <rob.nikander@gmail.com> writes:

 Rob> I want a query to list items, with their colors and images. Each
 Rob> result row is an item, so the colors and images must be aggregated
 Rob> into arrays or json.

 Rob> If there were only one other table, it’s a simple join and group…

 Rob>     select items.*, array_agg(color_name)
 Rob>     from items join colors on items.id = colors.item_id
 Rob>     group by items.id

Method 1:

select items.*, c.colors, i.images
  from items
       left join (select item_id, array_agg(color_name) as colors
                    from colors
                   group by item_id) c
         on c.item_id=items.id
       left join (select item_id, array_agg(image) as images
                    from images
                   group by item_id) i
         on i.item_id=items.id;

Method 2:

select items.*, c.colors, i.images
  from items
       left join lateral (select array_agg(c0.color_name) as colors
                            from colors c0
                           where c0.item_id=items.id) c
         on true
       left join lateral (select array_agg(i0.image) as images
                            from images i0
                           where i0.item_id=items.id) i
         on true;

Unfortunately, the planner isn't smart enough yet to know that these two
are equivalent, so they generate different sets of possible query plans.
Method 1 gets plans that work well if the entire items table is being
selected, since it will read the whole of the images and colors tables
in one go, and it will also get plans that work well for reading a
_single_ item selected by WHERE items.id=? because equivalence-class
processing will propagate a copy of that condition down to below the
grouping clauses. It will _not_ get a good plan for reading any other
small subset of items (e.g. selected by other criteria); for this you
need method 2, which in turn doesn't get very good plans when you fetch
the whole items table.

Don't be tempted to use CTEs for the subqueries in either plan; that'll
only make it much worse.

--
Andrew (irc:RhodiumToad)


В списке pgsql-general по дате отправления:

Предыдущее
От: "Peter J. Holzer"
Дата:
Сообщение: Re: Fwd: Log file
Следующее
От: Andrew Gierth
Дата:
Сообщение: Re: query patterns for multipass aggregating