BUG #15570: Vacuum analyze ERROR: MultiXactId XXXX has not been created yet -- apparent wraparound

From: PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      15570
Logged by:          Phil Hildebrand
Email address:      phil.hildebrand@moz.com
PostgreSQL version: 10.6
Operating system:   Ubuntu 16.04.5 LTS (linux)
Description:

While running vacuum analyze before a materialized view refresh we started seeing the following error in the logs:

ERROR:  MultiXactId 1483293027 has not been created yet -- apparent wraparound

Both selects against and pg_dumps of the table fail with the same error.

We were able to narrow down the selects to at least one row / page that seems to have problems:

select ctid,id from reviews_2018 where ctid = '(382604,16)'::tid;
    ctid     |                id
-------------+----------------------------------
 (382604,16) | 00ec91e42d21ce0b818fe8f63ab94c66

select ctid,id from reviews_2018 where id = '00ec91e42d21ce0b818fe8f63ab94c66';
ERROR:  MultiXactId 1483293027 has not been created yet -- apparent wraparound

We would not expect to see this without some kind of system issue, but there is no indication there were any system or disk issues/errors at the time. As a workaround, we restored the table from a previous backup.

Linux info:
4.4.0-140-generic #166-Ubuntu SMP Wed Nov 14 20:09:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Postgres version:
PostgreSQL 10.6 (Ubuntu 10.6-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609, 64-bit
On 2018-Dec-31, PG Bug reporting form wrote:

> Both selects against and pg_dumps of the table fail with the same error.
>
> We were able to narrow down the selects to at least one row / page that seems
> to have problems:
>
> select ctid,id from reviews_2018 where ctid = '(382604,16)'::tid;
>     ctid     |                id
> -------------+----------------------------------
>  (382604,16) | 00ec91e42d21ce0b818fe8f63ab94c66
>
> select ctid,id from reviews_2018 where id = '00ec91e42d21ce0b818fe8f63ab94c66';
> ERROR:  MultiXactId 1483293027 has not been created yet -- apparent wraparound

Can you please send

select lp, lp_flags, t_xmin, t_xmax, t_field3, t_ctid,
       to_hex(t_infomask) as infomask, to_hex(t_infomask2) as infomask2
  from heap_page_items(get_raw_page('reviews_2018', 382604)) h;

Also, please send the output of pg_controldata.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Sure -

beacon=# select lp, lp_flags, t_xmin, t_xmax, t_field3, t_ctid,
beacon-#        to_hex(t_infomask) as infomask, to_hex(t_infomask2) as infomask2
beacon-#   from heap_page_items(get_raw_page('reviews_2018', 382604)) h;
 lp | lp_flags |   t_xmin   |   t_xmax   |  t_field3  |       t_ctid       | infomask | infomask2
----+----------+------------+------------+------------+--------------------+----------+-----------
  1 |        1 |          0 | 2457403396 |    1048582 | (151256864,239)    | 200      | 0
  2 |        2 |            |            |            |                    |          |
  3 |        2 |            |            |            |                    |          |
  4 |        1 |   25532455 |   26137203 |          0 | (382604,15)        | 2102     | c010
  5 |        1 |   25532455 |   26137203 |          0 | (382604,16)        | 2103     | c010
  6 |        1 | 1682137913 |  862335590 |  959460400 | (892612660,26210)  | 6439     | 6562
  7 |        1 | 1179009631 | 1483293027 | 1382768435 | (1717064302,26444) | 3133     | 4c76
  8 |        1 | 1412509997 |  825899313 |  943012409 | (959330611,90)     | 0        | 0
  9 |        1 |  862073140 | 1701209657 | 1265002032 | (1697722679,25911) | 652d     | 3031
 10 |        1 |   25518614 |          0 |          0 | (382604,10)        | 2903     | 8010
 11 |        2 |            |            |            |                    |          |
 12 |        2 |            |            |            |                    |          |
 13 |        2 |            |            |            |                    |          |
 14 |        2 |            |            |            |                    |          |
 15 |        1 |   26137203 |          0 |          0 | (382604,15)        | 2802     | 8010
 16 |        1 |   26137203 |          0 |          0 | (382604,16)        | 2903     | 8010
 17 |        0 |            |            |            |                    |          |
(17 rows)

and

postgres@dallocalbeacondb01b:~$ pg_controldata /data/main/
pg_control version number:            1002
Catalog version number:               201707211
Database system identifier:           6619783480642413267
Database cluster state:               in production
pg_control last modified:             Tue 01 Jan 2019 12:30:04 AM UTC
Latest checkpoint location:           637/49000098
Prior checkpoint location:            637/48000098
Latest checkpoint's REDO location:    637/49000060
Latest checkpoint's REDO WAL file:    000000020000063700000049
Latest checkpoint's TimeLineID:       2
Latest checkpoint's PrevTimeLineID:   2
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:26398877
Latest checkpoint's NextOID:          7299101
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        548
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  26398877
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Tue 01 Jan 2019 12:30:04 AM UTC
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                on
max_connections setting:              1000
max_worker_processes setting:         8
max_prepared_xacts setting:           1000
max_locks_per_xact setting:           64
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0
Mock authentication nonce:            65802e08ec705db32d404dc5484bdd1f1cbc4075825d6bc5b1b2a4806ad28bc0

On Mon, Dec 31, 2018 at 4:43 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2018-Dec-31, PG Bug reporting form wrote:
>
> > Both selects against and pg_dumps of the table fail with the same error.
> >
> > We were able to narrow down the selects to at least one row / page that seems
> > to have problems:
> >
> > select ctid,id from reviews_2018 where ctid = '(382604,16)'::tid;
> >     ctid     |                id
> > -------------+----------------------------------
> >  (382604,16) | 00ec91e42d21ce0b818fe8f63ab94c66
> >
> > select ctid,id from reviews_2018 where id = '00ec91e42d21ce0b818fe8f63ab94c66';
> > ERROR:  MultiXactId 1483293027 has not been created yet -- apparent wraparound
>
> Can you please send
>
> select lp, lp_flags, t_xmin, t_xmax, t_field3, t_ctid,
>        to_hex(t_infomask) as infomask, to_hex(t_infomask2) as infomask2
>   from heap_page_items(get_raw_page('reviews_2018', 382604)) h;
>
> Also, please send the output of pg_controldata.
>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Phil Hildebrand
Sr. DBE @ Moz
206.696.3413
>>>>> "Phil" == Phil Hildebrand <phil.hildebrand@moz.com> writes:

 Phil> Sure -
 Phil> beacon=# select lp, lp_flags, t_xmin, t_xmax, t_field3, t_ctid,
 Phil> beacon-#        to_hex(t_infomask) as infomask, to_hex(t_infomask2) as infomask2
 Phil> beacon-#   from heap_page_items(get_raw_page('reviews_2018', 382604)) h;

This data page has obviously been partially overwritten by unrelated data. (Notice that many of those numbers in the output correspond to printable ASCII strings.)

A hexdump of the raw page will probably make it easy to see which part is corrupt and what the offending data is. Something like this is probably the easiest way to get it (adding any other required psql options of course):

psql -d beacon -AXqt -c "select get_raw_page('reviews_2018', 382604)" |
  perl -ne '/\\x([[:xdigit:]]+)/ and print pack("H*",$1)' |
  hexdump -C

Assuming that doesn't reveal any sensitive data, can you send the output of that?

-- 
Andrew (irc:RhodiumToad)
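[Editor's note: if perl isn't at hand, the same unescaping step can be done in a few lines of Python. This is a sketch, not part of the thread; it assumes psql's default hex-format bytea output (`\x` followed by hex digits) on a single line:]

```python
import binascii

def unescape_bytea_hex(line):
    """Turn psql's hex-format bytea output ('\\x...') back into raw
    bytes, suitable for piping into hexdump -C."""
    text = line.strip()
    if not text.startswith("\\x"):
        raise ValueError("expected \\x-prefixed hex bytea output")
    return binascii.unhexlify(text[2:])

# e.g. feed it the single output line of:
#   psql -d beacon -AXqt -c "select get_raw_page('reviews_2018', 382604)"
```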
Yeah - there's no sensitive data (it's public domain reviews). I've attached the hexdump.

On Mon, Dec 31, 2018 at 10:57 PM Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
>
> >>>>> "Phil" == Phil Hildebrand <phil.hildebrand@moz.com> writes:
>
>  Phil> Sure -
>  Phil> beacon=# select lp, lp_flags, t_xmin, t_xmax, t_field3, t_ctid,
>  Phil> beacon-#        to_hex(t_infomask) as infomask, to_hex(t_infomask2) as infomask2
>  Phil> beacon-#   from heap_page_items(get_raw_page('reviews_2018', 382604)) h;
>
> This data page has obviously been partially overwritten by unrelated
> data. (Notice that many of those numbers in the output correspond to
> printable ASCII strings.)
>
> A hexdump of the raw page will probably make it easy to see which part
> is corrupt and what the offending data is. Something like this is
> probably the easiest way to get it (adding any other required psql
> options of course):
>
> psql -d beacon -AXqt -c "select get_raw_page('reviews_2018', 382604)" |
>   perl -ne '/\\x([[:xdigit:]]+)/ and print pack("H*",$1)' |
>   hexdump -C
>
> Assuming that doesn't reveal any sensitive data, can you send the output
> of that?
>
> --
> Andrew (irc:RhodiumToad)

-- 
Phil Hildebrand
Sr. DBE @ Moz
206.696.3413
Attachments
>>>>> "Phil" == Phil Hildebrand <phil.hildebrand@moz.com> writes:

 Phil> Yeah - there's no sensitive data (it's public domain reviews).
 Phil> I've attached the hexdump.

OK. The page is definitely corrupt starting at offset 0x1000 (i.e. offset 4kB in the 8kB page), but it's harder than usual to spot because (a) the tear is in the middle of a text field and (b) the data in the second half of the page is actually from a page of what is presumably a different partition of the same table (it has the same row structure, but the data is from 2017, not 2018).

The split being on a 4k boundary points the finger at the hardware or OS, since pg doesn't care about 4k hardware pages or 4k disk sectors but rather does everything in 8k blocks.

-- 
Andrew.
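[Editor's note: Andrew's observation that the bogus values "correspond to printable ASCII strings" can be checked mechanically. A minimal sketch, not part of the thread; the sample values are t_xmax fields from the heap_page_items output earlier in the thread:]

```python
import struct

def xid_as_ascii(value):
    """If all four little-endian bytes of a 32-bit field are printable
    ASCII, return the decoded text (a strong hint that string data has
    overwritten a tuple header); otherwise return None."""
    raw = struct.pack("<I", value)
    if all(0x20 <= b < 0x7f for b in raw):
        return raw.decode("ascii")
    return None

# t_xmax values from the damaged line pointers 6-9 above
for xid in (862335590, 1483293027, 825899313, 1701209657):
    print(xid, "->", xid_as_ascii(xid))  # each decodes as a text fragment

print(xid_as_ascii(26137203))  # a sane xmax from lp 4: not ASCII, prints None
```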
OK. We didn't find any errors in syslog; in the kern log there were only some ntpd errors related to apparmor:

...
Dec 19 01:55:45 dallocalbeacondb01c kernel: [960716.006995] audit: type=1400
audit(1545184545.259:1257): apparmor="DENIED" operation="open"
profile="/usr/sbin/ntpd" name="/usr/local/sbin/" pid=18444 comm="ntpd"
requested_mask="r" denied_mask="r" fsuid=0 ouid=0
...

That said, these are VMs running on ESX hosts with SSD, so it's certainly possible. We'll check the hosts as well.

For future reference, is there a straightforward way to get all rows that don't have any data on a given corrupt page (rather than restoring to a point in time from a backup)?

On Mon, Dec 31, 2018 at 11:57 PM Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
>
> >>>>> "Phil" == Phil Hildebrand <phil.hildebrand@moz.com> writes:
>
>  Phil> Yeah - there's no sensitive data (it's public domain reviews).
>  Phil> I've attached the hexdump.
>
> OK. The page is definitely corrupt starting at offset 0x1000 (i.e.
> offset 4kB in the 8kB page), but it's harder than usual to spot because
> (a) the tear is in the middle of a text field and (b) the data in the
> second half of the page is actually from a page of what is presumably a
> different partition of the same table (it has the same row structure,
> but the data is from 2017 not 2018).
>
> The split being on a 4k boundary points the finger at the hardware or
> OS, since pg doesn't care about 4k hardware pages or 4k disk sectors but
> rather does everything in 8k blocks.
>
> --
> Andrew.

-- 
Phil Hildebrand
Sr. DBE @ Moz
206.696.3413
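[Editor's note: the thread doesn't answer this question directly. One approach sometimes used, sketched here as an illustration only, is to copy surviving rows out page by page with TID scans, skipping the known-bad pages so they are never read. The table name and page numbers are illustrative; since PostgreSQL 10 lacks tid range operators, each line-pointer offset on a page is probed explicitly:]

```python
MAX_OFFSETS = 291  # MaxHeapTuplesPerPage for 8kB blocks

def page_probe_sql(table, page, max_offsets=MAX_OFFSETS):
    """SQL fetching every possible tuple on one heap page via a TID scan,
    which touches only that page."""
    tids = ", ".join("'(%d,%d)'::tid" % (page, off)
                     for off in range(1, max_offsets + 1))
    return "SELECT * FROM %s WHERE ctid = ANY (ARRAY[%s])" % (table, tids)

def salvage_queries(table, n_pages, bad_pages):
    """One query per page, skipping pages known to be corrupt."""
    return [page_probe_sql(table, p)
            for p in range(n_pages) if p not in bad_pages]

# e.g. to skip the damaged page 382604 of reviews_2018:
#   queries = salvage_queries("reviews_2018", total_pages, {382604})
```

The queries would then be run through any client, catching and logging errors per page; restoring from backup, as done here, remains the safer option when one is available.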
On Tue, Jan 01, 2019 at 12:13:05AM -0800, Phil Hildebrand wrote:
> That said, these are VMs running on ESX hosts with SSD, so it's
> certainly possible. We'll check the hosts as well.

You may have been bitten by a recent bug on the ESX side. Looking around, there was a release two months ago which fixes an issue with similar symptoms:

https://docs.vmware.com/en/VMware-vSphere/6.5/rn/esxi650-201810002.html

I am referring to this one in particular:

"PR 2209189: Heavy I/Os issued to a snapshot virtual machine using the SEsparse format might cause guest OS file system inconsistencies."

And in this case that could mean 4kB blocks written where they should not be.

-- 
Michael
Attachments
Thanks!
On Wed, Jan 2, 2019 at 5:53 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Jan 01, 2019 at 12:13:05AM -0800, Phil Hildebrand wrote:
> > That said, these are VMs running on ESX hosts with SSD, so it's
> > certainly possible. We'll check the hosts as well.
>
> You may have been bitten by a recent bug on the ESX side. Looking around
> there has been a release 2 months ago which fixes an issue with
> similar symptoms:
>
> https://docs.vmware.com/en/VMware-vSphere/6.5/rn/esxi650-201810002.html
>
> I am referring to this one in particular:
>
> "PR 2209189: Heavy I/Os issued to a snapshot virtual machine using the
> SEsparse format might cause guest OS file system inconsistencies."
>
> And in this case this could be referring to 4kB blocks written where
> they should not.
>
> --
> Michael

-- 
Phil Hildebrand
Sr. DBE @ Moz
206.696.3413
Greetings,

* Phil Hildebrand (phil.hildebrand@moz.com) wrote:
> Data page checksum version:           0

You really might want to consider fixing that, and using some tool to regularly check the checksums of all the pages in the cluster, to hopefully catch something like this happening again sooner than you otherwise would, while you still have backups you can restore the data from.

Thanks!

Stephen
Attachments
Thanks for the tip - we'll look into enabling the page checksums as well.

On Thu, Jan 3, 2019 at 1:08 PM Stephen Frost <sfrost@snowman.net> wrote:
>
> Greetings,
>
> * Phil Hildebrand (phil.hildebrand@moz.com) wrote:
> > Data page checksum version: 0
>
> You really might want to consider fixing that and using some tool to
> regularly check the checksums of all the pages in the cluster, to
> hopefully catch something like this happening again sooner than you
> might otherwise, and while you still have backups you can go to and
> restore the data from.
>
> Thanks!
>
> Stephen

-- 
Phil Hildebrand
Sr. DBE @ Moz
206.696.3413