Re: Direct I/O
From | Andrew Dunstan |
---|---|
Subject | Re: Direct I/O |
Date | |
Msg-id | 7bcffa12-12a1-7c7b-d68a-a9a39dba06ec@dunslane.net |
In reply to | Re: Direct I/O (Andres Freund <andres@anarazel.de>) |
Responses | Re: Direct I/O |
List | pgsql-hackers |
On 2023-04-08 Sa 17:23, Andres Freund wrote:
> Hi,
>
> On 2023-04-08 17:10:19 -0400, Tom Lane wrote:
>> Thomas Munro <thomas.munro@gmail.com> writes:
>>> Now crake is doing this:
>>>
>>> 2023-04-08 16:50:03.177 EDT [2023-04-08 16:50:03 EDT 3257645:3] 004_io_direct.pl LOG: statement: select count(*) from t1
>>> 2023-04-08 16:50:03.316 EDT [2023-04-08 16:50:03 EDT 3257646:1] ERROR: invalid page in block 56 of relation base/5/16384
>>> 2023-04-08 16:50:03.316 EDT [2023-04-08 16:50:03 EDT 3257646:2] STATEMENT: select count(*) from t1
>>> 2023-04-08 16:50:03.317 EDT [2023-04-08 16:50:03 EDT 3257645:4] 004_io_direct.pl ERROR: invalid page in block 56 of relation base/5/16384
>>> 2023-04-08 16:50:03.317 EDT [2023-04-08 16:50:03 EDT 3257645:5] 004_io_direct.pl STATEMENT: select count(*) from t1
>>> 2023-04-08 16:50:03.319 EDT [2023-04-08 16:50:02 EDT 3257591:4] LOG: background worker "parallel worker" (PID 3257646) exited with exit code 1
>>
>> The fact that the error is happening in a parallel worker seems interesting ...
>
> There were a few prior instances of that error. One that I hadn't seen before is this:
>
> [11:35:07.190](0.001s) # Failed test 'read back from shared'
> # at /home/andrew/bf/root/HEAD/pgsql/src/test/modules/test_misc/t/004_io_direct.pl line 43.
> [11:35:07.190](0.000s) # got: '10000'
> # expected: '10098'
>
> For one it points to the arguments to is() being switched around, but that's a sideshow.
>
> It's also odd that it's just crake having the issue. It's just a linux host, afaics.
>
> Andrew, is there any chance you can run that test in isolation and see whether it reproduces? If so, does the problem vanish, if you comment out the io_direct= in the test? Curious whether this is actually an O_DIRECT issue, or whether it's an independent issue exposed by the new test.
>
> I wonder if we should make the test use data checksum - if we continue to see the wrong query results, the corruption is more likely to be in memory.
I can run the test in isolation, and it gets an error reliably.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com