Re: Wrong cost estimation for foreign tables join with use_remote_estimatedisabled
От | Etsuro Fujita |
---|---|
Тема | Re: Wrong cost estimation for foreign tables join with use_remote_estimatedisabled |
Дата | |
Msg-id | 5B2CBE49.8070006@lab.ntt.co.jp обсуждение исходный текст |
Ответ на | Re: Wrong cost estimation for foreign tables join withuse_remote_estimate disabled (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>) |
Ответы |
Re: Wrong cost estimation for foreign tables join withuse_remote_estimate disabled
|
Список | pgsql-hackers |
Hi Konstantin, (2018/06/22 15:26), Konstantin Knizhnik wrote: > On 21.06.2018 20:08, Tom Lane wrote: >> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>> The following very simple test reduce the problem with wrong cost >>> estimation: >>> create foreign table t1_fdw(x integer, y integer) server pg_fdw options >>> (table_name 't1', use_remote_estimate 'false'); >>> create foreign table t2_fdw(x integer) server pg_fdw options (table_name >>> 't2', use_remote_estimate 'false'); >>> It is possible to force Postgres to use correct plan by setting >>> "fdw_startup_cost" to some very large value (100000000 for example). >>> ... >>> Also correct plan is used when use_remote_estimate is true. >> If you are unhappy about the results with use_remote_estimate off, don't >> run it that way. The optimizer does not have a crystal ball. > > As I wrote, use_remote_estimate can not be used because in this case > query compilation time is unacceptable (10 seconds, while time of query > execution itself is ~200msec). > So the problem can be addressed in two ways: > > 1. Try to reduce time of remote estimation. I wonder why postgres_fdw > sends so much queries to remote server. For join of two tables there are > 7 queries. > I suspect that for ~20 joined tables in the original query number of > calls is more than hundred, so on wonder that it takes so much time. > 2. Try to make optimizer make better estimation of join cost based on > local statistic (please notice that ANALYZE is explicitly called for all > foreign tables and number of rows in the result was correctly calculated). To make local estimates more accurate, I think we need other information on remote tables such as remote indexes. > What do you think: which of this two direction is more perspective? Or > it is better to address both of them? I'd vote for #2. One idea for that is to introduce CREATE FOREIGN INDEX to have information on remote indexes on the local side, which I proposed before. I have been putting it on hold since then, though. Best regards, Etsuro Fujita
В списке pgsql-hackers по дате отправления: