Обсуждение: MinMaxAggPath vs. parallel-safety
On Sun, Jun 26, 2016 at 10:35 PM, Noah Misch <noah@leadboat.com> wrote: >> I looked into this and found that the costs are considered fuzzily the >> same, and then add_path prefers the slightly-worse path on the grounds >> that it is marked parallel_safe while the MinMaxAgg path is not. It seems >> to me that there is some fuzzy thinking going on there. On exactly what >> grounds is a path to be preferred merely because it is parallel safe, and >> not actually parallelized? Or perhaps the question to ask is whether a >> MinMaxAgg path can be marked parallel-safe. > > [Action required within 72 hours. This is a generic notification.] > > The above-described topic is currently a PostgreSQL 9.6 open item ("consider > whether MinMaxAggPath might fail to be parallel-safe"). Robert, since you > committed the patch believed to have created it, you own this open item. If > some other commit is more relevant or if this does not belong as a 9.6 open > item, please let us know. Otherwise, please observe the policy on open item > ownership[1] and send a status update within 72 hours of this message. > Include a date for your subsequent status update. Testers may discover new > open items at any time, and I want to plan to get them all fixed well in > advance of shipping 9.6rc1. Consequently, I will appreciate your efforts > toward speedy resolution. Thanks. It turns out that this open item is phased incorrectly. I'll update the phrasing. /* A MinMaxAggPath implies use of subplans, so cannot be parallel-safe */ pathnode->path.parallel_safe = false; Currently, MinMaxAggPath is never parallel-safe; the question is whether we could allow it to be parallel-safe (not, as the current phrasing implies, whether it might ever need to be other than parallel-safe). It appears to me that the answer is "no", because a MinMaxAggPath contains a list of MinMaxAggInfo objects, and there we have this: Param *param; /* param for subplan's output */ Since subplans aren't passed down to parallel workers, no MinMaxAggPath can be parallel-safe. Therefore, I think there's nothing to do here right now. Comments? See also https://www.postgresql.org/message-id/CA+TgmoZ7wvMPMTSNtk+dfDUNWmc8kK5pUtLDnzOLvJ9DVeAF_A@mail.gmail.com (Official status update: I'll remove this open item in 3 days unless the above analysis is shown to be incorrect.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, Jun 26, 2016 at 10:35 PM, Noah Misch <noah@leadboat.com> wrote: >> The above-described topic is currently a PostgreSQL 9.6 open item ("consider >> whether MinMaxAggPath might fail to be parallel-safe"). > Currently, MinMaxAggPath is never parallel-safe; the question is > whether we could allow it to be parallel-safe (not, as the current > phrasing implies, whether it might ever need to be other than > parallel-safe). Check. > It appears to me that the answer is "no", because a > MinMaxAggPath contains a list of MinMaxAggInfo objects, and there we > have this: > Param *param; /* param for subplan's output */ > Since subplans aren't passed down to parallel workers, no > MinMaxAggPath can be parallel-safe. Therefore, I think there's > nothing to do here right now. Comments? Hm. In principle, this could be made to work, since I don't think it would be necessary for the Param's value to pass across process boundaries. (It could be locally generated within a worker, and then also consumed within the worker, if the worker's plan looked like a Result with a subplan attached.) However, if we don't even pass down the plan trees for subplans, then I agree that it can't work at the moment. In any case, this is an optimization opportunity not a bug. If you want to kick this can down the road until parallel query is generally smarter about subplans, that's OK with me. regards, tom lane
On Mon, Jun 27, 2016 at 3:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sun, Jun 26, 2016 at 10:35 PM, Noah Misch <noah@leadboat.com> wrote: >>> The above-described topic is currently a PostgreSQL 9.6 open item ("consider >>> whether MinMaxAggPath might fail to be parallel-safe"). > >> Currently, MinMaxAggPath is never parallel-safe; the question is >> whether we could allow it to be parallel-safe (not, as the current >> phrasing implies, whether it might ever need to be other than >> parallel-safe). > > Check. > >> It appears to me that the answer is "no", because a >> MinMaxAggPath contains a list of MinMaxAggInfo objects, and there we >> have this: >> Param *param; /* param for subplan's output */ >> Since subplans aren't passed down to parallel workers, no >> MinMaxAggPath can be parallel-safe. Therefore, I think there's >> nothing to do here right now. Comments? > > Hm. In principle, this could be made to work, since I don't think it > would be necessary for the Param's value to pass across process > boundaries. (It could be locally generated within a worker, and then also > consumed within the worker, if the worker's plan looked like a Result with > a subplan attached.) However, if we don't even pass down the plan trees > for subplans, then I agree that it can't work at the moment. We don't. See ExecSerializePlan(). > In any case, this is an optimization opportunity not a bug. If you want > to kick this can down the road until parallel query is generally smarter > about subplans, that's OK with me. I don't really see another option. I don't think it would be a lot of work to pass subplans to workers along with the main plan, but finding all of the places that can then benefit as a result of that change and figuring out which cases are allowable sounds to me like development work, not stabilization. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company