"unexpected duplicate for tablespace" problem in logical replication
From | wangsh.fnst@fujitsu.com |
---|---|
Subject | "unexpected duplicate for tablespace" problem in logical replication |
Date | |
Msg-id | bbaaf9f9-ebb2-645f-54bb-34d6efc7ac42@fujitsu.com |
Responses | RE: "unexpected duplicate for tablespace" problem in logical replication |
List | pgsql-bugs |
Hi,

I hit a problem while using logical replication in PG11, and I think all PG versions are affected. The log looks like:

> ERROR: unexpected duplicate for tablespace 0, relfilenode xxxxxxx

Someone also reported this problem in [1], but no one has responded to it.

I did some investigation and found a way to reproduce the problem. The steps are:

1. Create a table (call it tableX) and truncate it.
2. Cycle through 2^32 OIDs.
3. Restart the database to clear all caches.
4. Create a temp table whose OID equals tableX's relfilenode, then insert any data into tableX.

The attached run.sh reproduces the problem on PG10 and PG11 with the help of the 'WITH OIDS' option. I could not find a way to cycle through the OIDs quickly on the master branch, but I reproduced the problem there as well using gdb.

Currently, GetNewRelFileNode() only checks:

1. whether the OID is duplicated in pg_class;
2. whether relpath(rnode) exists on disk.

However, relpath() gives different results for temp and non-temp tables: a temp table's relpath() has a "t%d" prefix. That means that if there is a table whose relfilenode is 20000 (but whose OID is not 20000), it is possible to create a temp table whose relfilenode is also 20000.

I think GetNewRelFileNode() should always check for a duplicated relfilenode; see the patch (a simple way to fix this problem on the master branch).

Any comments?

Regards,
Shenhao Wang

[1] https://www.postgresql.org/message-id/flat/CAM5YvKTPxmMT%3DS7iPcu5SgmaOv4S4nhE1HZRO_sdFX9cXeXXOQ%40mail.gmail.com
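
To make the gap in GetNewRelFileNode() described above concrete, here is a small Python model of the two checks. It is only a sketch, not PostgreSQL code and not the attached patch: the function names, the database OID 5, the backend ID 3, and the table OID 16384 are all hypothetical, and only the default tablespace is modelled. The relfilenode 20000 is taken from the example in the message.

```python
def relpath(db_oid, relfilenode, backend_id=None):
    """Simplified path model: temp relations get a "t<backend>_" prefix."""
    if backend_id is None:
        return f"base/{db_oid}/{relfilenode}"
    return f"base/{db_oid}/t{backend_id}_{relfilenode}"


def accepted_by_current_checks(candidate, pg_class_oids, files_on_disk,
                               db_oid, backend_id=None):
    """Model of the two checks listed above: pg_class OID, then file on disk."""
    if candidate in pg_class_oids:
        return False
    return relpath(db_oid, candidate, backend_id) not in files_on_disk


def accepted_with_relfilenode_check(candidate, pg_class_oids, relfilenodes,
                                    files_on_disk, db_oid, backend_id=None):
    """Assumed shape of the proposed idea: also reject any relfilenode in use."""
    if candidate in relfilenodes:
        return False
    return accepted_by_current_checks(candidate, pg_class_oids, files_on_disk,
                                      db_oid, backend_id)


# Hypothetical state: tableX has OID 16384 and, after TRUNCATE, relfilenode 20000.
db_oid = 5
pg_class_oids = {16384}
relfilenodes = {20000}
files_on_disk = {relpath(db_oid, 20000)}

# A permanent table cannot reuse 20000 because the file already exists.
permanent_ok = accepted_by_current_checks(20000, pg_class_oids,
                                          files_on_disk, db_oid)

# A temp table can: its path differs, so neither check sees the conflict.
temp_ok = accepted_by_current_checks(20000, pg_class_oids,
                                     files_on_disk, db_oid, backend_id=3)

# With an explicit relfilenode check, the temp table is rejected as well.
temp_ok_strict = accepted_with_relfilenode_check(20000, pg_class_oids,
                                                 relfilenodes, files_on_disk,
                                                 db_oid, backend_id=3)

print(permanent_ok, temp_ok, temp_ok_strict)
```

In this model the temp table is the only case that slips through, which matches the scenario in step 4 of the reproduction steps.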