Re: pg9.6 segfault using simple query (related to use fk for join estimates)
From | Tomas Vondra |
---|---|
Subject | Re: pg9.6 segfault using simple query (related to use fk for join estimates) |
Date | |
Msg-id | 48919afc-1993-8ca8-5b42-3949e9166e92@2ndquadrant.com |
In reply to | Re: pg9.6 segfault using simple query (related to use fk for join estimates) (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Hi,

On 05/04/2016 11:02 PM, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, May 4, 2016 at 2:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> My other design-level complaint is that basing this on foreign keys is
>>> fundamentally the wrong thing. What actually matters is the unique index
>>> underlying the FK; that is, if we have "a.x = b.y" and there's a
>>> compatible unique index on b.y, we can conclude that no A row will match
>>> more than one B row, whether or not an explicit FK relationship has been
>>> declared. So we should drive this off unique indexes instead of FKs,
>>> first because we will find more cases, and second because the planner
>>> already examines indexes and doesn't need any additional catalog lookups
>>> to get the required data. (IOW, the relcache additions that were made in
>>> this patch series should go away too.)
>
>> Without prejudice to anything else in this useful and detailed review,
>> I have a question about this. A unique index proves that no A row
>> will match more than one B row, and I agree that deriving that from
>> unique indexes is sensible. However, ISTM that an FK provides
>> additional information: we know that, modulo filter conditions on B,
>> every A row will match *exactly* one row B row, which can prevent us
>> from *underestimating* the size of the join product. A unique index
>> can't do that.
>
> Very good point, but unless I'm missing something, that is not what the
> current patch does. I'm not sure offhand whether that's an important
> estimation failure mode currently, or if it is whether it would be
> sensible to try to implement that rule entirely separately from the "at
> most one" aspect, or if it isn't sensible, whether that's a sufficiently
> strong reason to confine the "at most one" logic to working only with FKs
> and not with bare unique indexes.

FWIW it's a real-world problem with multi-column FKs. As David pointed out
upthread, a nice example of this issue is Q9 in the TPC-H bench, where the
underestimate leads to HashAggregate and then OOM failure.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
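
To make the underestimate discussed above concrete, here is a minimal sketch of the arithmetic. It is a simplified model, not the planner's actual code. It assumes each equality clause of a two-column join gets selectivity 1/ndistinct and that the clauses are multiplied as if independent. The function names are hypothetical, and the row counts are rounded, illustrative figures loosely modeled on a lineitem/partsupp join at TPC-H scale factor 1.

```python
import math


def independent_estimate(rows_a, rows_b, ndistinct_per_clause):
    """Join size when every equality clause is treated as independent.

    Each clause contributes a selectivity of 1/ndistinct, and the
    selectivities are multiplied together (simplified model).
    """
    return rows_a * rows_b / math.prod(ndistinct_per_clause)


def fk_estimate(rows_a, rows_b):
    """Join size when the whole multi-column FK is treated as one clause.

    Every A row matches exactly one B row, so the combined selectivity
    is 1/rows_b and the join returns rows_a rows.
    """
    return rows_a * rows_b / rows_b


# Illustrative, rounded numbers (assumptions, not measurements):
# A references B through a two-column FK.
rows_a = 6_000_000            # referencing table
rows_b = 800_000              # referenced table
ndistinct = [200_000, 10_000]  # distinct values of the two key columns

print(independent_estimate(rows_a, rows_b, ndistinct))  # 2400.0
print(fk_estimate(rows_a, rows_b))                      # 6000000.0
```

Under these assumptions the independence-based estimate is 2,500 times smaller than the number of rows the FK guarantees, which is the kind of gap that can make a hash-based aggregate look much cheaper than it is.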