Discussion: TOAST corruption in standby database

TOAST corruption in standby database

From:
Alex Adriaanse
Date:
We have primary and hot standby databases running Postgres 11.3 inside Docker, with their data directories bind-mounted
to a reflink-enabled XFS filesystem. The VM is running Debian's 4.19.16-1~bpo9+1 kernel inside an AWS EC2 instance.

A few months ago I saw TOAST corruption in one of the standby databases in a ~44GB table, so I wiped the database
and rebuilt it using pg_basebackup, which eliminated the corruption. This week I've seen corruption pop up again in the
same table in one of the standby databases. The other standby database experienced no corruption.

The corrupted table has four columns of types integer, text, text, and jsonb. The corruption happens inside the jsonb
column.

The corruption manifests itself as follows in the standby database:

SELECT length(json::text) FROM <table> WHERE identity = '...';
ERROR:  missing chunk number 0 for toast value 64265646 in pg_toast_16103925

SELECT ctid, chunk_id, chunk_seq, md5(chunk_data) FROM pg_toast.pg_toast_16103925 WHERE chunk_id = 64265646;
 ctid | chunk_id | chunk_seq | md5
------+----------+-----------+-----
(0 rows)

SELECT count(1) FROM pg_toast.pg_toast_16103925 WHERE chunk_id = 64265646;
 count
-------
     2
(1 row)
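The mismatch between the indexed lookup (0 rows) and the count over the same predicate (2 rows) already suggests the
TOAST index and the heap disagree. One way to confirm that, sketched here as a follow-up diagnostic rather than
something from the session above, is to repeat the lookup with index scans disabled:

BEGIN;
-- Force a sequential scan so the result no longer depends on the TOAST index.
-- If this returns the two chunks while the indexed lookup returns none, the
-- index rather than the heap is the damaged side.
SET LOCAL enable_indexscan = off;
SET LOCAL enable_indexonlyscan = off;
SET LOCAL enable_bitmapscan = off;
SELECT ctid, chunk_id, chunk_seq, md5(chunk_data)
FROM pg_toast.pg_toast_16103925
WHERE chunk_id = 64265646;
COMMIT;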

Looking up the TOAST block that is supposed to contain this value, you can see that the TOAST tuples for it are missing:

SELECT ctid, chunk_id, chunk_seq, md5(chunk_data) FROM pg_toast.pg_toast_16103925 WHERE ctid IN ('(1793121,1)',
'(1793121,2)','(1793121,3)', '(1793121,4)', '(1793121,5)', '(1793121,6)', '(1793121,7)'); 
    ctid     | chunk_id | chunk_seq |               md5
-------------+----------+-----------+----------------------------------
 (1793121,3) | 41259162 |         0 | 1bff36f306bac135cce9da44dd6d6bbb
 (1793121,4) | 41259162 |         1 | b754d688c5c847c7bc519e65741ffef1
 (1793121,5) | 41259163 |         0 | 10dfa4f5b3e32188f0b4b28c9be76abe
 (1793121,6) | 41259163 |         1 | 7dceb98b2c2f4ac3c72245c58c85323f
(4 rows)

For comparison, here are the same queries against the primary database:

SELECT length(json::text) FROM <table> WHERE identity = '...';
 length
--------
   7817
(1 row)

SELECT ctid, chunk_id, chunk_seq, md5(chunk_data) FROM pg_toast.pg_toast_16103925 WHERE chunk_id = 64265646;
    ctid     | chunk_id | chunk_seq |               md5
-------------+----------+-----------+----------------------------------
 (1793121,1) | 64265646 |         0 | a9a2642e8408fc178fb809b86c430997
 (1793121,2) | 64265646 |         1 | 371bc2628ac5bfc8b37d32f93d08fefe
(2 rows)

SELECT count(1) FROM pg_toast.pg_toast_16103925 WHERE chunk_id = 64265646;
 count
-------
     2
(1 row)

SELECT ctid, chunk_id, chunk_seq, md5(chunk_data) FROM pg_toast.pg_toast_16103925 WHERE ctid IN ('(1793121,1)',
'(1793121,2)','(1793121,3)', '(1793121,4)', '(1793121,5)', '(1793121,6)', '(1793121,7)'); 
    ctid     | chunk_id | chunk_seq |               md5
-------------+----------+-----------+----------------------------------
 (1793121,1) | 64265646 |         0 | a9a2642e8408fc178fb809b86c430997
 (1793121,2) | 64265646 |         1 | 371bc2628ac5bfc8b37d32f93d08fefe
 (1793121,3) | 41259162 |         0 | 1bff36f306bac135cce9da44dd6d6bbb
 (1793121,4) | 41259162 |         1 | b754d688c5c847c7bc519e65741ffef1
 (1793121,5) | 41259163 |         0 | 10dfa4f5b3e32188f0b4b28c9be76abe
 (1793121,6) | 41259163 |         1 | 7dceb98b2c2f4ac3c72245c58c85323f
(6 rows)

Looking at the data file for the TOAST relation, the header data structures in the relevant block seem fine to me,
which makes me think this is not caused by filesystem corruption (unless a write silently failed). The second half of
that block is identical between the primary and the corrupted standby, but in the first half the corrupted standby is
missing data.

Standby (corrupted):

# dd if=data/base/18034/16103928.13 bs=8192 skip=89185 count=1 status=none | hexdump -C | head -8
00000000  a3 0e 00 00 48 46 88 0e  00 00 05 00 30 00 58 0f  |....HF......0.X.|
00000010  00 20 04 20 00 00 00 00  00 00 00 00 00 00 00 00  |. . ............|
00000020  10 98 e0 0f 98 97 e8 00  a8 8f e0 0f 58 8f 96 00  |............X...|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000f50  00 00 00 00 00 00 00 00  32 b0 0a 01 00 00 00 00  |........2.......|
00000f60  00 00 00 00 1b 00 61 5c  06 00 03 00 02 09 18 00  |......a\........|
00000f70  9b 90 75 02 01 00 00 00  ac 00 00 00 83 9f 64 00  |..u...........d.|

Primary:

# dd if=data/base/18034/16103928.13 bs=8192 skip=89185 count=1 status=none | hexdump -C | head -8
00000000  bd 0e 00 00 08 ad 32 b7  00 00 05 00 30 00 90 04  |......2.....0...|
00000010  00 20 04 20 00 00 00 00  68 87 e0 0f 90 84 a8 05  |. . ....h.......|
00000020  10 98 e0 0f 98 97 e8 00  a8 8f e0 0f 58 8f 96 00  |............X...|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000490  a6 07 7e 02 00 00 00 00  00 00 00 00 1b 00 61 5c  |..~...........a\|
000004a0  02 00 03 00 02 09 18 00  ae 9d d4 03 01 00 00 00  |................|
000004b0  d0 0a 00 00 23 25 10 07  88 02 13 0f 2c 04 78 01  |....#%......,.x.|
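As a side note, block 89185 of segment file 13 corresponds to relation block 13 * 131072 + 89185 = 1793121 (with 8kB
blocks and 1GB segments), i.e. the same block the ctids above point at. Assuming the pageinspect extension is
available, the same page header can be decoded from SQL with something along these lines:

CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Decode the header of relation block 1793121 (= 13 * 131072 + 89185).
SELECT lsn, checksum, flags, lower, upper, special
FROM page_header(get_raw_page('pg_toast.pg_toast_16103925', 1793121));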

Based on the above observations it seems to me that occasionally some of the changes aren't replicated to or persisted
by the standby database. In the past I've seen some TCP packets get mangled or dropped between our EC2 instances,
leading to sudden disconnects. The standby connects to the primary using SSL (sslmode=require sslcompression=1), so I
would think that if there's any network-level corruption SSL would catch it, causing the connection to fail and
reconnect. Outside of any SSL disconnects (which don't happen often), this database is stopped and restarted twice a
week so we can clone it (using cp -a --reflink=always).
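As a basic sanity check on the replication path (not something that would explain the corruption by itself), the send
and replay positions can be compared around those restarts; a sketch:

-- On the primary: per-standby send/replay positions and the gap in bytes.
SELECT client_addr, state, sent_lsn, replay_lsn,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;

-- On the standby: last WAL location received and replayed.
SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();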

Any ideas on what might be causing this?

Thanks,

Alex


Re: TOAST corruption in standby database

From:
Amit Kapila
Date:
On Fri, Oct 25, 2019 at 1:50 AM Alex Adriaanse <alex@oseberg.io> wrote:
>
> Standby (corrupted):
>
> # dd if=data/base/18034/16103928.13 bs=8192 skip=89185 count=1 status=none | hexdump -C | head -8
> 00000000  a3 0e 00 00 48 46 88 0e  00 00 05 00 30 00 58 0f  |....HF......0.X.|
> 00000010  00 20 04 20 00 00 00 00  00 00 00 00 00 00 00 00  |. . ............|
> 00000020  10 98 e0 0f 98 97 e8 00  a8 8f e0 0f 58 8f 96 00  |............X...|
> 00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 00000f50  00 00 00 00 00 00 00 00  32 b0 0a 01 00 00 00 00  |........2.......|
> 00000f60  00 00 00 00 1b 00 61 5c  06 00 03 00 02 09 18 00  |......a\........|
> 00000f70  9b 90 75 02 01 00 00 00  ac 00 00 00 83 9f 64 00  |..u...........d.|
>
> Primary:
>
> # dd if=data/base/18034/16103928.13 bs=8192 skip=89185 count=1 status=none | hexdump -C | head -8
> 00000000  bd 0e 00 00 08 ad 32 b7  00 00 05 00 30 00 90 04  |......2.....0...|
> 00000010  00 20 04 20 00 00 00 00  68 87 e0 0f 90 84 a8 05  |. . ....h.......|
> 00000020  10 98 e0 0f 98 97 e8 00  a8 8f e0 0f 58 8f 96 00  |............X...|
> 00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 00000490  a6 07 7e 02 00 00 00 00  00 00 00 00 1b 00 61 5c  |..~...........a\|
> 000004a0  02 00 03 00 02 09 18 00  ae 9d d4 03 01 00 00 00  |................|
> 000004b0  d0 0a 00 00 23 25 10 07  88 02 13 0f 2c 04 78 01  |....#%......,.x.|
>
> Based on the above observations it seems to me that occasionally some of the changes aren't replicated to or
persisted by the standby database.
>

I am not sure what the best way to detect this is, but one idea could
be to enable wal_consistency_checking [1].  At the very least this can
detect whether a block is replicated correctly the very first time it
is touched.  Also, if there is some corruption issue on the standby,
you might be able to detect it.  But note that enabling this option
has overhead, so you need to be careful.
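For reference, a minimal sketch of enabling it on the primary
(wal_consistency_checking can be changed with a reload; 'all' is the
most thorough but also the most expensive, and a narrower list of
resource managers such as 'heap,btree' keeps the overhead down):

ALTER SYSTEM SET wal_consistency_checking = 'all';
SELECT pg_reload_conf();

-- Any mismatch found during replay is reported in the standby's log as
-- a FATAL "inconsistent page found" error, which stops replay at that
-- record.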


[1] - https://www.postgresql.org/docs/devel/runtime-config-developer.html
-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: TOAST corruption in standby database

From:
"postgresql_2016@163.com"
Date:
SELECT ctid, chunk_id, chunk_seq, md5(chunk_data) FROM
pg_toast.pg_toast_16103925 WHERE chunk_id = 64265646;
 ctid | chunk_id | chunk_seq | md5
------+----------+-----------+-----
(0 rows)

SELECT count(1) FROM pg_toast.pg_toast_16103925 WHERE chunk_id = 64265646;
 count
-------
     2
(1 row)

From the above queries, I think the problem is in the index. The first
query uses the default TOAST index on (chunk_id, chunk_seq) to search
and finds nothing, while the second query uses a sequential scan and
finds two matching rows. So I think the values stored in the index may
not match the heap tuples. If you still have the corrupted TOAST data,
I think we can use a tool to look at the index and heap tuple data.

typedef struct IndexTupleData
{
    ItemPointerData t_tid;        /* reference TID to heap tuple */

    /* ---------------
     * t_info is laid out in the following fashion:
     *
     * 15th (high) bit: has nulls
     * 14th bit: has var-width attributes
     * 13th bit: AM-defined meaning
     * 12-0 bit: size of tuple
     * ---------------
     */

    unsigned short t_info;        /* various info about tuple */

} IndexTupleData;                /* MORE DATA FOLLOWS AT END OF STRUCT */
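If the corrupted data is still around, one way to compare the two sides,
assuming the pageinspect extension is installed, is to dump the heap
tuples of the affected TOAST block next to the entries of the TOAST
index (block numbers here are illustrative):

CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Heap side: line pointers and tuple headers of the affected TOAST block.
SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_ctid
FROM heap_page_items(get_raw_page('pg_toast.pg_toast_16103925', 1793121));

-- Index side: IndexTupleData entries (t_tid plus key data) in one block of
-- the TOAST index (named pg_toast_<oid>_index); the right block has to be
-- located first, e.g. via bt_metap()/bt_page_stats().
SELECT itemoffset, ctid, itemlen, data
FROM bt_page_items('pg_toast.pg_toast_16103925_index', 1);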

In my environment, I have encountered many cases of "missing chunk
number 0 for toast value XXX in pg_toast_2619" after the server was
killed forcibly without stopping the database cleanly. Relation 2619 is
pg_statistic, which is updated frequently.


