PG13 fails to startup even though the current transaction is equal to the target transaction

From: Sean Jezewski
Subject: PG13 fails to startup even though the current transaction is equal to the target transaction
Msg-id: CAMMYWxM1_6VNuGSO2ahfsrU28Jc5SMuPLNZUnpDDk9BSe0wL7w@mail.gmail.com
Replies: Re: PG13 fails to startup even though the current transaction is equal to the target transaction (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List: pgsql-hackers
We've noticed what may be a regression / bug in PG13.

I work at Heroku on the Data team, where we manage a fleet of Postgres services. This change has broken the UX we offer customers for managing their PG services. In particular, ‘forks’ and ‘point-in-time restores’ appear to be broken on PG13.

I believe it is related to this patch (https://www.postgresql.org/message-id/993736dd3f1713ec1f63fc3b653839f5%40lako.no)

For PG12, we expect:

-- We create a new Postgres service from archive and provide a recovery_target_xid
-- PG replays the archive until the end of the archive is reached, and the current transaction == recovery_target_xid
-- We measure the current transaction via the query SELECT pg_catalog.txid_snapshot_xmax(pg_catalog.txid_current_snapshot())
-- Since the current transaction is exactly equal to the target transaction, we perform the promotion
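
Concretely, that last check is just a comparison between the query result and the configured target; a minimal sketch (the target value 12345 is a placeholder for illustration):

    -- On the server that is replaying the archive:
    SELECT pg_catalog.txid_snapshot_xmax(pg_catalog.txid_current_snapshot());
    -- Suppose this returns 12345 and we configured recovery_target_xid = '12345':
    -- the two are equal, so our tooling triggers the promotion.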

For PG12, what we get:

-- This process completes smoothly, and the new postgres service is up and running

For PG13, we expect:

-- We create a new Postgres service from archive and provide a recovery_target_xid
-- PG replays the archive until the end of the archive is reached, and the current transaction == recovery_target_xid
-- We measure the current transaction via the query SELECT pg_catalog.txid_snapshot_xmax(pg_catalog.txid_current_snapshot())
-- Since the current transaction is exactly equal to the target transaction, we perform the promotion

For PG13, what we get:

-- On promotion we see the postgres process exit with the following log lines:

Mar 17 14:47:49 ip-10-0-146-54 25a9551c_65ec_4870_99e9_df69151984a0[7]: [18-1] sql_error_code = 00000 LOG: promote trigger file found: /etc/postgresql/wal-e.d/pull-env/STANDBY_OFF
Mar 17 14:47:49 ip-10-0-146-54 25a9551c_65ec_4870_99e9_df69151984a0[7]: [19-1] sql_error_code = 00000 LOG: redo done at 0/60527E0
Mar 17 14:47:49 ip-10-0-146-54 25a9551c_65ec_4870_99e9_df69151984a0[7]: [20-1] sql_error_code = 00000 LOG: last completed transaction was at log time 2021-03-17 14:42:44.901448+00
Mar 17 14:47:49 ip-10-0-146-54 25a9551c_65ec_4870_99e9_df69151984a0[7]: [21-1] sql_error_code = XX000 FATAL: recovery ended before configured recovery target was reached
Mar 17 14:47:49 ip-10-0-146-54 25a9551c_65ec_4870_99e9_df69151984a0[5]: [8-1] sql_error_code = 00000 LOG: startup process (PID 7) exited with exit code 1

This happens even though the transaction IDs are identical. It seems the end of the archive was reached (in fact the last transaction was replayed), and although we arrived at the correct transaction ID, PG somehow decided it wasn't done replaying. Perhaps the timestamps don't line up somehow? (As far as I can tell, we do not set recovery_target_time, just recovery_target_xid.)

We have `recovery_target_inclusive` set to true, which is the default. The intent of that setting seems to be that if the current transaction ID equals the target, recovery should be marked as complete. However, we're seeing the opposite: even though the current txn ID == the target transaction ID, the server exits. This is surprising and doesn't line up with our expected behavior.
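
For reference, the relevant recovery settings look roughly like this (the xid value is a placeholder for illustration; the promote trigger path is the one from the log above):

    # postgresql.conf fragment on the restored service (standby.signal present,
    # restore_command pointing at our WAL archive)
    recovery_target_xid = '12345'              # placeholder target
    recovery_target_inclusive = on             # the default
    promote_trigger_file = '/etc/postgresql/wal-e.d/pull-env/STANDBY_OFF'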

We have a workaround.

Right before promotion, we increment the transaction ID on the leader database (the original PG service we're forking from) by running `SELECT pg_catalog.txid_current()`, wait 120 seconds (double our archive timeout, to allow the WAL segment to be written / uploaded / read), and then wait until the current transaction is strictly greater than the target transaction. With that in place, promotion works fine every time on PG13. But this seems like an off-by-one error?
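
Roughly, the workaround is (the SQL is the same as above; the waiting and polling around it is driven by our tooling):

    -- On the leader (the original service we are forking from): advance the xid counter
    SELECT pg_catalog.txid_current();

    -- Wait ~120 seconds (2x our archive timeout) for the WAL segment to be
    -- written, uploaded, and replayed, then, on the restored service, poll this
    -- until the result is strictly greater than recovery_target_xid before
    -- triggering the promotion:
    SELECT pg_catalog.txid_snapshot_xmax(pg_catalog.txid_current_snapshot());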

What do you think? Is this a bug? Is this expected? Is this user error on our end?

Thanks!

Sean
