Discussion: replication failure with GIN index
I'm trying to set up a standby server. Both the primary and the standby run the latest version, 9.1.3, on Ubuntu Server 10.10. So far I have initialized the setup twice, and both times it failed after replication had been running for some time. What can I do to fix this? The log on the standby is shown below:

2012-04-06 02:31:01 CST [@] LOG: restored log file "0000000200000E3C000000F1" from archive
2012-04-06 02:35:35 CST [@] LOG: restored log file "0000000200000E3C000000F2" from archive
2012-04-06 02:36:19 CST [@] LOG: restored log file "0000000200000E3C000000F3" from archive
2012-04-06 02:36:48 CST [@] LOG: restored log file "0000000200000E3C000000F4" from archive
2012-04-06 02:37:24 CST [@] LOG: restored log file "0000000200000E3C000000F5" from archive
2012-04-06 02:37:27 CST [@] PANIC: GIN metapage disappeared
2012-04-06 02:37:27 CST [@] CONTEXT: xlog redo Update metapage, node: 37547844/16405/83896882 blkno: 4294967295
2012-04-06 02:37:28 CST [@] LOG: startup process (PID 24912) was terminated by signal 6: Aborted
2012-04-06 02:37:28 CST [@] LOG: terminating any other active server processes
On Fri, Apr 6, 2012 at 2:56 AM, Rural Hunter <ruralhunter@gmail.com> wrote:
> I'm trying to set up a standby server. Both the primary and standby servers
> are on latest version 9.1.3 on ubunt server 10.10. So far I tried to init
> the setup 2 times but both failed after the replication running for some
> time. what can I do to fix this? The log on the standby is shown below:
>
> 2012-04-06 02:31:01 CST [@] LOG: restored log file
> "0000000200000E3C000000F1" from archive
> 2012-04-06 02:35:35 CST [@] LOG: restored log file
> "0000000200000E3C000000F2" from archive
> 2012-04-06 02:36:19 CST [@] LOG: restored log file
> "0000000200000E3C000000F3" from archive
> 2012-04-06 02:36:48 CST [@] LOG: restored log file
> "0000000200000E3C000000F4" from archive
> 2012-04-06 02:37:24 CST [@] LOG: restored log file
> "0000000200000E3C000000F5" from archive
> 2012-04-06 02:37:27 CST [@] PANIC: GIN metapage disappeared
> 2012-04-06 02:37:27 CST [@] CONTEXT: xlog redo Update metapage, node:
> 37547844/16405/83896882 blkno: 4294967295
> 2012-04-06 02:37:28 CST [@] LOG: startup process (PID 24912) was terminated
> by signal 6: Aborted
> 2012-04-06 02:37:28 CST [@] LOG: terminating any other active server
> processes

The blkno is all wrong, so it looks like a clear bug to me. Blkno has been set to -1.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
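
The "-1" reading follows from how block numbers are printed: PostgreSQL's BlockNumber is an unsigned 32-bit integer, and -1 converted to that type is 0xFFFFFFFF, which is also the InvalidBlockNumber sentinel. A minimal sketch of the arithmetic, using stand-in definitions rather than the real PostgreSQL headers (format_blkno is a hypothetical helper, not a PostgreSQL function):

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-ins mirroring PostgreSQL's definitions in storage/block.h. */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/*
 * Hypothetical helper: format a block number as an unsigned integer,
 * the same way the "blkno:" field appears in the log line.
 */
static int
format_blkno(char *out, size_t outlen, BlockNumber blkno)
{
    return snprintf(out, outlen, "blkno: %u", (unsigned int) blkno);
}
```

Passing (BlockNumber) -1 to format_blkno yields the same "blkno: 4294967295" text seen in the CONTEXT line, so the printed value is an invalid-block marker and not a real page of the index.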
Simon Riggs <simon@2ndQuadrant.com> writes:
> On Fri, Apr 6, 2012 at 2:56 AM, Rural Hunter <ruralhunter@gmail.com> wrote:
>> 2012-04-06 02:37:27 CST [@] PANIC: GIN metapage disappeared

Known bug, see
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=57b100fe0fb1d0d5803789d3113b89fa18a34fad

>> 2012-04-06 02:37:27 CST [@] CONTEXT: xlog redo Update metapage, node:
>> 37547844/16405/83896882 blkno: 4294967295

> The blkno is all wrong, so it looks like a clear bug to me.

[ looks into that... ]  The funny blkno is attributable to this overly-cute code:

        case XLOG_GIN_UPDATE_META_PAGE:
            appendStringInfo(buf, "Update metapage, ");
            desc_node(buf, ((ginxlogUpdateMeta *) rec)->node,
                      ((ginxlogUpdateMeta *) rec)->metadata.tail);
            break;

and we also have

        case XLOG_GIN_DELETE_LISTPAGE:
            appendStringInfo(buf, "Delete list pages (%d), ",
                             ((ginxlogDeleteListPages *) rec)->ndeleted);
            desc_node(buf, ((ginxlogDeleteListPages *) rec)->node,
                      ((ginxlogDeleteListPages *) rec)->metadata.head);
            break;

While there could be some point in printing the list head or tail pointer, it's just confusing to print it with a label of "blkno". I think we should just print the metapage block number here and be done with it.

regards, tom lane
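
The suggestion to "just print the metapage block number" could look roughly like the sketch below. This is not the committed patch: RelFileNode, desc_node and the char-buffer interface are simplified stand-ins for the real PostgreSQL types and StringInfo API, and desc_update_meta is a hypothetical name. It assumes only that the GIN metapage lives in block 0, which is what GIN_METAPAGE_BLKNO denotes in the GIN sources.

```c
#include <stdint.h>
#include <stdio.h>

typedef uint32_t BlockNumber;

/* The GIN metapage is always block 0 of the index. */
#define GIN_METAPAGE_BLKNO ((BlockNumber) 0)

/* Simplified stand-in for PostgreSQL's RelFileNode. */
typedef struct
{
    uint32_t spcNode;   /* tablespace OID */
    uint32_t dbNode;    /* database OID */
    uint32_t relNode;   /* relation file node */
} RelFileNode;

/* Stand-in for desc_node(): writes into a char buffer, not a StringInfo. */
static int
desc_node(char *buf, size_t len, RelFileNode node, BlockNumber blkno)
{
    return snprintf(buf, len, "node: %u/%u/%u blkno: %u",
                    (unsigned int) node.spcNode,
                    (unsigned int) node.dbNode,
                    (unsigned int) node.relNode,
                    (unsigned int) blkno);
}

/*
 * Hypothetical description routine for an "Update metapage" record:
 * it labels the record with the metapage block number rather than the
 * pending-list tail pointer.
 */
static int
desc_update_meta(char *buf, size_t len, RelFileNode node)
{
    int n = snprintf(buf, len, "Update metapage, ");

    if (n < 0 || (size_t) n >= len)
        return n;
    return n + desc_node(buf + n, len - (size_t) n, node, GIN_METAPAGE_BLKNO);
}
```

With the node from the log above, this sketch would describe the record as "Update metapage, node: 37547844/16405/83896882 blkno: 0", which no longer suggests that the record targets an invalid block.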