Re: 12.3 replicas falling over during WAL redo
From | Ben Chobot |
---|---|
Subject | Re: 12.3 replicas falling over during WAL redo |
Date | |
Msg-id | 9ba2cbd8-fa0b-cdd9-3eea-26b5418f20ce@silentmedia.com |
In reply to | Re: 12.3 replicas falling over during WAL redo (Alvaro Herrera <alvherre@2ndquadrant.com>) |
Responses | Re: 12.3 replicas falling over during WAL redo |
List | pgsql-general |
Alvaro Herrera wrote on 8/3/20 2:34 PM:
> On 2020-Aug-03, Ben Chobot wrote:
>> Alvaro Herrera wrote on 8/3/20 12:34 PM:
>>> On 2020-Aug-03, Ben Chobot wrote:
>>>> Yep. Looking at the ones in block 6501,
>>>>
>>>> rmgr: Btree len (rec/tot): 72/ 72, tx: 76393394, lsn: A0A/AB2C43D0, prev A0A/AB2C4378, desc: INSERT_LEAF off 41, blkref #0: rel 16605/16613/60529051 blk 6501
>>>> rmgr: Btree len (rec/tot): 72/ 72, tx: 76396065, lsn: A0A/AC4204A0, prev A0A/AC420450, desc: INSERT_LEAF off 48, blkref #0: rel 16605/16613/60529051 blk 6501
>>>
>>> My question was whether the block has received the update that added the item in offset 41; that is, is the LSN in the crashed copy of the page equal to A0A/AB2C43D0? If it's an older value, then the write above was lost for some reason.
>>
>> How do I tell?
>
> You can use pageinspect's page_header() function to obtain the page's LSN. You can use dd to obtain the page from the file,
>
> dd if=16605/16613/60529051 bs=8192 count=1 seek=6501 of=/tmp/page.6501

If I use skip instead of seek....
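
A note on the skip-versus-seek point: in dd, skip= offsets the input file while seek= offsets the output file, so only skip=6501 pulls block 6501 out of the relation file. The same read can be sketched in Python. This is only an illustration, assuming the 8192-byte block size used in the dd command and a block that lives in the relation's first segment file; `read_block` and `dump_block` are hypothetical helper names, not part of any PostgreSQL tool.

```python
BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size, same as bs=8192


def read_block(path: str, blkno: int, block_size: int = BLOCK_SIZE) -> bytes:
    """Return block `blkno` of a relation segment file.

    Equivalent to `dd if=path bs=block_size skip=blkno count=1`:
    the offset is applied to the input, not the output.
    """
    with open(path, "rb") as f:
        f.seek(blkno * block_size)
        page = f.read(block_size)
    if len(page) != block_size:
        raise ValueError(f"block {blkno} is not fully present in {path}")
    return page


def dump_block(src: str, blkno: int, dst: str) -> None:
    """Write a single block from `src` to a new file `dst`."""
    with open(dst, "wb") as out:
        out.write(read_block(src, blkno))
```

With the paths from the thread, the call would be `dump_block("16605/16613/60529051", 6501, "/tmp/page.6501")`, run from the data directory that contains the relation file.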
> then put that binary file in a bytea column, perhaps like
>
> create table page (raw bytea);
> insert into page select pg_read_binary_file('/tmp/page');
>
> and with that you can run page_header:
>
> create extension pageinspect;
> select h.* from page, page_header(raw) h;

     lsn      | checksum | flags | lower | upper | special | pagesize | version | prune_xid
--------------+----------+-------+-------+-------+---------+----------+---------+-----------
 A0A/99BA11F8 |     -215 |     0 |   180 |  7240 |    8176 |     8192 |       4 |         0
As I understand what we're looking at, this means the WAL stream was assuming this page was last touched by A0A/AB2C43D0, but the page itself thinks it was last touched by A0A/99BA11F8, which means at least one write to the page is missing?
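
The comparison being made here can be sketched in a few lines of Python. An LSN printed as `X/Y` is a 64-bit position whose high and low 32-bit halves are shown in hex, and the page header begins with that value (`pd_lsn`) stored as two unsigned 32-bit integers. The helper names below are hypothetical, and the little-endian byte order is an assumption about the host (true on x86); this is a rough cross-check, not a replacement for pageinspect.

```python
import struct


def parse_lsn(text: str) -> int:
    """Convert an LSN such as 'A0A/99BA11F8' into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(lsn: int) -> str:
    """Format a 64-bit LSN the way PostgreSQL prints it (%X/%X)."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


def page_lsn(page: bytes, byteorder: str = "<") -> int:
    """Read pd_lsn from the first 8 bytes of a raw page.

    Assumption: the page was written on a little-endian host; pass
    byteorder=">" for a big-endian one.
    """
    hi, lo = struct.unpack_from(byteorder + "II", page, 0)
    return (hi << 32) | lo


# Values taken from the thread: the page LSN and the two WAL records
# that reference block 6501.
on_page = parse_lsn("A0A/99BA11F8")
for record in ("A0A/AB2C43D0", "A0A/AC4204A0"):
    if on_page < parse_lsn(record):
        print(f"page at {format_lsn(on_page)} predates record {record}")
    else:
        print(f"page at {format_lsn(on_page)} already includes record {record}")
```

Both records come out newer than the page, which matches the reading above: the page on disk carries an LSN older than the INSERT_LEAF record at A0A/AB2C43D0.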