Re: Cost overestimation of foreign JOIN
| От | Ashutosh Bapat |
|---|---|
| Тема | Re: Cost overestimation of foreign JOIN |
| Дата | |
| Msg-id | CAExHW5sQwaaUdesHK1p4VU_jzygbPDT0h=6C1xcEnOFX+HJLow@mail.gmail.com обсуждение исходный текст |
| Ответ на | Re: Cost overestimation of foreign JOIN (Andrey Lepikhov <a.lepikhov@postgrespro.ru>) |
| Ответы |
Re: Cost overestimation of foreign JOIN
|
| Список | pgsql-hackers |
On Mon, Nov 30, 2020 at 11:56 PM Andrey Lepikhov <a.lepikhov@postgrespro.ru> wrote: > > On 30.11.2020 22:38, Tom Lane wrote: > > Andrey Lepikhov <a.lepikhov@postgrespro.ru> writes: > >> Maybe it is needed to swap lines 2908 and 2909 (see attachment)? > > > > No; as explained in the comment immediately above here, we're assuming > > that the join conditions will be applied on the cross product of the > > input relations. > > Thank you. Now it is clear to me. > > > > Now admittedly, that's a worst-case assumption, since it amounts to > > expecting that the remote server will do the join in the dumbest > > possible nested-loop way. If the remote can use a merge or hash > > join, for example, the cost is likely to be a lot less. > > My goal is scaling Postgres on a set of the same servers with same > postgres instances. If one server uses for the join a hash-join node, i > think it is most likely that the other server will also use for this > join a hash-join node (Maybe you remember, I also use the statistics > copying technique to provide up-to-date statistics on partitions). Tests > show good results with such an approach. But maybe this is my special case. > > > But it is > > not the job of this code path to outguess the remote planner. It's > > certainly not appropriate to invent an unprincipled cost estimate > > as a substitute for trying to guess that. > > Agreed. > > > > If you're unhappy with the planning results you get for this, > > why don't you have use_remote_estimate turned on? > > I have a mixed load model. Large queries are suitable for additional > estimate queries. But for many simple SELECT's that touch a small > portion of the data, the latency has increased significantly. And I > don't know how to switch the use_remote_estimate setting in such case. You may disable use_remote_estimates for given table or a server. So if tables participating in short queries are different from those in the large queries, you could set use_remote_estimate at table level to turn it off for the first set. Otherwise, we need a FDW level GUC which can be turned on/off for a given session or a query. Generally use_remote_estimate isn't scalable and there have been discussions about eliminating the need of it. But no concrete proposal has come yet. -- Best Wishes, Ashutosh Bapat
В списке pgsql-hackers по дате отправления: