Re: Bug in point releases 9.3.6 and 9.2.10?
От | Andres Freund |
---|---|
Тема | Re: Bug in point releases 9.3.6 and 9.2.10? |
Дата | |
Msg-id | 20150313002150.GA18401@alap3.anarazel.de обсуждение исходный текст |
Ответ на | Bug in point releases 9.3.6 and 9.2.10? (Peter Geoghegan <pg@heroku.com>) |
Ответы |
Re: Bug in point releases 9.3.6 and 9.2.10?
|
Список | pgsql-hackers |
Hi, On 2015-03-12 16:42:24 -0700, Peter Geoghegan wrote: > We want to create a new role when this happens, for various reasons. > This occurs after recovery ends, but before the database has been > "unfenced". The template code that generates various ALTER ROLE > statements in our internal provisioning system - which has apparently > worked just fine for a long time - is: Is this all the code that's exececuted after recovery? How are these forks brought up? Promoted how? Is it a common 'source' database? > db.execute("ALTER ROLE #{old_database_user} RENAME TO #{database_user}") > db.execute("ALTER ROLE #{database_user} PASSWORD '#{database_password}' LOGIN") > db.execute("CREATE ROLE \"#{old_database_user}\" PASSWORD > '#{old_database_password}' IN ROLE \"#{database_user}\" LOGIN") > > I've seen multiple reports of apparent corruption, appearing as the > resulting ALTER ROLE statements are executed: > > PG::DataCorrupted: ERROR: could not read block 0 in file > "global/12811": read only 0 of 8192 bytes > or: > PG::DataCorrupted: ERROR: could not read block 0 in file > "global/12785": read only 0 of 8192 bytes > or: > PG::DataCorrupted: ERROR: could not read block 0 in file > "global/12811": read only 0 of 8192 bytes Have you looked at these files? Are they indeed zero bytes when this error occurs? Do you still have a base backup from the relevant time, so you could repeat the whole thing? > The only common factor is that this occurs on the latest point > releases (either 9.3.6 and 9.2.10, at least so far). In all cases I've > seen so far, the relation in question is the pg_auth_members heap > relation. For example: Any chance that the new nodes also use a different kernel version or such? > redacteddb=# select pg_relation_filenode(oid), oid, relname, relkind > from pg_class where pg_relation_filenode(oid) = 12785; > pg_relation_filenode | oid | relname | relkind > ----------------------+------+-----------------+--------- > 12785 | 1261 | pg_auth_members | r > (1 row) This filenode got to be pg_auth_member's original one, given it's below FirstNormalObjectId. I get a lower value, but that's probably caused by having fewer collations and other data generated during initdb. That implies that the table hasn't ever been rewritten. What's 12811? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
В списке pgsql-hackers по дате отправления: