Re: postgres has no spinlock support on riscv rv64imafdc
От | Thomas Munro |
---|---|
Тема | Re: postgres has no spinlock support on riscv rv64imafdc |
Дата | |
Msg-id | CA+hUKGL1m-zJeFvY_yOAJbNUh-aPR5fk_zHM=HuD6KVzYRhCsg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: postgres has no spinlock support on riscv rv64imafdc (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-bugs |
On Fri, Apr 16, 2021 at 2:04 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thomas Munro <thomas.munro@gmail.com> writes: > > As a data point: the obvious change builds and passes basic testing > > with flying colours, on FreeBSD + clang 11 running under qemu. RISC-V > > has been "non-experimental" since clang 9. FWIW I've signed up to the > > preorder list to try to get my hands on some BeagleV hardware to test > > this stuff for real ASAP... > > Yeah, I'm on that list too ... crickets so far. I passed this change on to the brave souls who are bringing up the whole universe of open source software on this new architecture, in the FreeBSD ports tree. PostgreSQL's failure to compile was blocking hundreds of dependent packages so it seemed good to at least try something at this early stage (other databases are still blockers; apparently you can trust database hackers to use weird arch-specific stuff). It'll be interesting to see if any reports of brokenness come in from that, but for now I just know that a whole mountain of new stuff now builds there. One thing I noticed while worrying about whether it might have insufficiently strong barriers is that Clang and GCC are generating different code: __sync_lock_test_and_set(): Clang 11: amoswap.w.aqrl a0,a1,(a0) GCC 9: amoswap.w.aq a4,a4,(a5) __sync_lock_release(): Clang 11: fence rw,w li a1,0 sw a1,0(a0) GCC 9: fence iorw,ow amoswap.w zero,zero,(a5) S_LOCK() looks a bit like figure A7 of the manual[1], with at least acquire semantics (but also release on Clang), and S_UNLOCK() looks a bit like figure A8, with an explicit release. I don't immediately know why acquire and release semantics wouldn't be enough here. Anyone? I do see that ARM, Itanium and PowerPC include a full barrier, though in different places: on ARM, __sync_lock_test_and_set() on ARM generates ldxr, stxr, dmb (= full memory barrier) rather than using ldaxr (= acquire) and stlxr (= release), and this seems to be since [2], which is quite interesting and might be a clue; on Itanium and PowerPC, it's explicit in S_UNLOCK(). Some of the questions I have are: can anyone with real hardware access reproduce the failure reported back in 2016[3], or was that entirely due to the pre-alpha toolchain? If it can be reproduced, is it with GCC only and not Clang, and does adding an initial __sync_synchronize() help? [1] https://riscv.org/wp-content/uploads/2019/06/riscv-spec.pdf [2] https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=a96297a28065d27559d17ebe1e0eda308a05e965 [3] https://www.postgresql.org/message-id/20161120184942.GP11243%40redhat.com
В списке pgsql-bugs по дате отправления: