BUG #11264: Auto vacuum wraparound job blocking everything

Поиск
Список
Период
Сортировка
От dbhandary@switchfly.com
Тема BUG #11264: Auto vacuum wraparound job blocking everything
Дата
Msg-id 20140825225029.2536.5315@wrigleys.postgresql.org
обсуждение исходный текст
Ответы Re: BUG #11264: Auto vacuum wraparound job blocking everything  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: BUG #11264: Auto vacuum wraparound job blocking everything  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-bugs
The following bug has been logged on the website:

Bug reference:      11264
Logged by:          Dines Bhandary
Email address:      dbhandary@switchfly.com
PostgreSQL version: 9.3.5
Operating system:   Centos 6.5
Description:

We recently upgraded to postgressql 9.3.5  from postgres 9.1 using
pg_upgrade. We are running streaming replication. DB was functioning without
any issues till today where we discovered an auto vacuum wraparound job on
one of our tables which was blocking everything. There were no waiting jobs
when we queried pg_stat_activity, but incoming request would just hang, so
database was unusable.

We tried to disable auto vacuum for a table in question and tried kill auto
vacuum job, but that did not work. Also we had pg_dump running for several
hours for that particular table where auto vacuum was running. We tried
killing pg_dump job, but it did not succeed.

We eventually had to force shutdown the database and disable auto vacuum and
increase vacuum_freeze parameter so that auto vacuum wraparound job won't
trigger until we figure out what the issue is.

I am including debug trace here. It seems like auto vacuum is running in an
infinite loop. Will be happy to provide more info if necessary.

(gdb) bt
#0  0x00007fb9740eb5b3 in __select_nocancel () from /lib64/libc.so.6
#1  0x000000000075ad7a in pg_usleep ()
#2  0x00000000004ae2d4 in GetMultiXactIdMembers ()
#3  0x000000000047f6cc in heap_prepare_freeze_tuple ()
#4  0x000000000057d036 in lazy_vacuum_rel ()
#5  0x000000000057b346 in ?? ()
#6  0x000000000057b67c in vacuum ()
#7  0x000000000060cf8a in ?? ()
#8  0x000000000060d426 in ?? ()
#9  0x000000000060d4f6 in StartAutoVacWorker ()
#10 0x0000000000617d42 in ?? ()
#11 <signal handler called>
#12 0x00007fb9740eb5b3 in __select_nocancel () from /lib64/libc.so.6
#13 0x00000000006191fd in PostmasterMain ()
#14 0x00000000005b5e60 in main ()
(gdb)

 16423 | jetblue | 21972 |       10 | postgres |                  |
   |                 |             | 2014-08-24 18:39:38.134514+00 |
2014-08-24 18:39:38.161612+00 | 2014-08-24 18:39:38.161612+00 | 2014-08-24
18:39:38.161614+00 | f       | active | autovacuum: VACUUM
settings.room_contract_service_code (to prevent wraparound)

 COPY:

jetblue=# select * from pg_stat_activity where query ilike '%copy%';
 datid | datname |  pid  | usesysid | usename | application_name |
client_addr | client_hostname | client_port |         backend_start
|          xact_start           |          query_start          |
state_change          | waiting | state  |
                                               query


-------+---------+-------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+---------+--------+--------------------------------------------------------------------------------------------------------
------------------------------------------------------------------
 16423 | jetblue | 30246 |    20350 | m84     | pg_dump          |
  |                 |          -1 | 2014-08-25 03:48:59.807689+00 |
2014-08-25 03:48:59.810871+00 | 2014-08-25 04:26:14.95753+00  | 2014-08-25
04:26:14.957534+00 | f       | active | COPY
settings.room_contract_service_code (room_contract_service_code,
room_contract_service_description
, room_contract_service_code_id, supplierid, inactive) TO stdout;
 16423 | jetblue | 31999 |    20350 | m84     |                  |
10.33.11.11 |                 |       40735 | 2014-08-25 10:38:02.944083+00
| 2014-08-25 10:38:02.964562+00 | 2014-08-25 10:38:22.805859+00 | 2014-08-25
10:38:22.805861+00 | f       | active | COPY
settings.room_contract_service_code (room_contract_service_code,
room_contract_service_description
, room_contract_service_code_id, supplierid, inactive) TO stdout;

 (gdb) bt
#0  0x00007fb9740f5187 in semop () from /lib64/libc.so.6
#1  0x0000000000609397 in PGSemaphoreLock ()
#2  0x000000000064e871 in LWLockAcquire ()
#3  0x000000000047e1a9 in ?? ()
#4  0x000000000047e66a in ?? ()
#5  0x000000000047f0a6 in heap_getnext ()
#6  0x00000000005343e6 in ?? ()
#7  0x000000000053521a in DoCopy ()
#8  0x000000000066112d in standard_ProcessUtility ()
#9  0x00007fb96c4df261 in ?? () from
/usr/pgsql-9.3/lib/pg_stat_statements.so
#10 0x000000000065db97 in ?? ()
#11 0x000000000065eb2d in ?? ()
#12 0x000000000065f223 in PortalRun ()
#13 0x000000000065b91e in ?? ()
#14 0x000000000065d048 in PostgresMain ()
#15 0x0000000000619c36 in PostmasterMain ()
#16 0x00000000005b5e60 in main ()


SELECTS:

datid            | 16423
datname          | jetblue
pid              | 15599
usesysid         | 20354
usename          | developers
application_name |
client_addr      | 10.33.11.11
client_hostname  |
client_port      | 60741
backend_start    | 2014-08-25 20:20:38.985486+00
xact_start       | 2014-08-25 20:39:13.426937+00
query_start      | 2014-08-25 20:39:13.497362+00
state_change     | 2014-08-25 20:39:13.497365+00
waiting          | f
state            | active
query            |
                 |         SELECT room_contract_service_code,
room_contract_service_description, room_contract_service_code_id
                 |         FROM room_contract_service_code
                 |         WHERE room_contract_service_code.supplierid = $1
                 |                 AND NOT inactive
                 |         ORDER by room_contract_service_description
(gdb) bt
#0  0x00007fb9740f5187 in semop () from /lib64/libc.so.6
#1  0x0000000000609397 in PGSemaphoreLock ()
#2  0x000000000064e871 in LWLockAcquire ()
#3  0x000000000048d05d in index_fetch_heap ()
#4  0x000000000048d22e in index_getnext ()
#5  0x0000000000595e96 in ?? ()
#6  0x000000000058aabe in ExecScan ()
#7  0x0000000000583c08 in ExecProcNode ()
#8  0x000000000059c4e9 in ExecSort ()
#9  0x0000000000583b28 in ExecProcNode ()
#10 0x0000000000582a42 in standard_ExecutorRun ()
#11 0x00007fb96c4df4db in ?? () from
/usr/pgsql-9.3/lib/pg_stat_statements.so
#12 0x000000000065df67 in ?? ()
#13 0x000000000065f191 in PortalRun ()
#14 0x000000000065d3ed in PostgresMain ()
#15 0x0000000000619c36 in PostmasterMain ()
#16 0x00000000005b5e60 in main ()
(gdb)

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Matheus de Oliveira
Дата:
Сообщение: Re: Postgresql concern of effect of invalid index
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: BUG #11264: Auto vacuum wraparound job blocking everything