Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect
От | Tom Lane |
---|---|
Тема | Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect |
Дата | |
Msg-id | 2943611.1602375376@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect
|
Список | pgsql-committers |
Fujii Masao <masao.fujii@oss.nttdata.com> writes: >>> Therefore, the easy fix is to make libpq mark the connection as >>> CONNECTION_BAD even in ECONNABORTED, like we do in ECONNRESET. So in the wake of commit fe27009cb, this is what lorikeet is doing: @@ -9028,9 +9028,7 @@ CALL terminate_backend_and_wait('fdw_retry_check'); SAVEPOINT s; SELECT 1 FROM ft1 LIMIT 1; -- should fail -ERROR: server closed the connection unexpectedly - This probably means the server terminated abnormally - before or while processing the request. +ERROR: could not receive data from server: Software caused connection abort CONTEXT: remote SQL command: SAVEPOINT s2 COMMIT; -- Clean up which is better --- the connection recovery is working --- but obviously still not a "pass". The only way to make this test pass as-is is to hack libpq so that it deems ECONNABORTED to be a server crash, which frankly I do not think is a good change. I don't see any principled reason to think that it means that rather than a network connectivity failure. What I've done instead as a stopgap is to adjust the test case to be insensitive to the exact error message text. Maybe there's a better way, but I can't think of anything besides having two (or more?) expected-output files. That would be quite unpalatable as things stand, though perhaps we could make it tolerable by splitting this test off into a second test script. Meanwhile, now that I've looked at commit 32a9c0bdf, I'm not very happy with it: * The control flow seems rather forced. I think it was designed on the assumption that reindenting the existing code is forbidden. Why not use an actual loop, instead of a goto? I also think that it's far from obvious that the loop isn't an infinite loop; it took me quite a while to glom onto the fact that the test inside the PG_CATCH avoids that by checking retry_conn. Maybe a comment would be enough to improve that, but I feel the control logic could stand a rethink. * The CATCH case is completely failing to clean up after itself. At the minimum, there has to be a FlushErrorState() there. I don't think it'd be terribly hard to drive this code to an error-stack-overflow PANIC. * As is so often true of proposed patches in which PG_CATCH is thought to be an adequate way to catch an error, this is just unbelievably fragile. It will break, and badly so, if it catches an error that is anything other than what it is expecting ... and it's not even particularly trying to verify that the error is what it's expecting. It might be okay, or at least a little bit harder to break, if it verified that the error's SQLSTATE is ERRCODE_CONNECTION_FAILURE. regards, tom lane
В списке pgsql-committers по дате отправления: