Re: inline newNode()
From | Tom Lane |
---|---|
Subject | Re: inline newNode() |
Date | |
Msg-id | 27833.1034039300@sss.pgh.pa.us |
In response to | Re: inline newNode() (Neil Conway <neilc@samurai.com>) |
Responses |
Re: inline newNode()
Re: inline newNode() |
List | pgsql-patches |
Neil Conway <neilc@samurai.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> How much did you bloat the code? There are an awful lot of calls to
>> newNode(), so even though it's not all that large, I'd think the
>> multiplier would be nasty.

> The patch increases the executable from 12844452 to 13005244 bytes,
> when compiled with '-pg -g -O2' and without being stripped.

Okay, not as bad as I feared, but still kinda high. I believe that most
of the bloat comes from the MemSet macro; there's just not much else in
newNode().

Now, the reason MemSet expands to a fair amount of code is its
if-then-else case to decide whether to call memset() or do an inline
loop. I've looked at the assembler code for it on a couple of machines,
and the loop proper is only about a third of the code that gets
generated. Ideally, we'd like to eliminate the if-test for inlined
newNode calls. That would buy back a lot of the bloat and speed things
up still further.

Now the tests on _val == 0 and _len <= MEMSET_LOOP_LIMIT and _len being
a multiple of 4 are no problem, since _val and _len are compile-time
constants; these will be optimized away. What is not optimized away (on
the compilers I've looked at) is the check for _start being int-aligned.

A brute-force approach is to say "we know _start is word-aligned because
we just got it from palloc, which guarantees MAXALIGNment". We could
make a variant version of MemSet that omits the alignment check, and use
it here and anywhere else we're sure it's safe.

A nicer approach would be to somehow make use of the datatype of the
first argument to MemSet. If we could determine at compile time that
it's supposed to point at a type with at least int alignment, then it'd
be possible for the compiler to optimize away this check in a reasonably
safe fashion. I'm not sure if there's a portable way to do this, though.
There's no "alignof()" construct in C :-(. Any ideas?

regards, tom lane
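For readers following along, here is a rough sketch of the brute-force idea from the message: a general zeroing macro with all four tests, next to a variant that drops the runtime pointer-alignment test. Every name here (SKETCH_MEMSET, SKETCH_MEMSET_ALIGNED, SKETCH_LOOP_LIMIT, SketchNode, sketch_new_node, sketch_all_zero) is hypothetical, the loop limit of 64 is an assumed value, and malloc() stands in for palloc. It is modeled on the description in the message, not copied from the PostgreSQL sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names and values; an illustration, not the real macro. */
#define SKETCH_INT_ALIGN_MASK (sizeof(int32_t) - 1)
#define SKETCH_LOOP_LIMIT 64    /* assumed threshold for the inline loop */

/* General form: includes the runtime test on the pointer's alignment. */
#define SKETCH_MEMSET(start, val, len) \
    do { \
        void   *_start = (start); \
        int     _val = (val); \
        size_t  _len = (len); \
        if ((((uintptr_t) _start) & SKETCH_INT_ALIGN_MASK) == 0 && \
            (_len & SKETCH_INT_ALIGN_MASK) == 0 && \
            _val == 0 && \
            _len <= SKETCH_LOOP_LIMIT) \
        { \
            int32_t *_p = (int32_t *) _start; \
            int32_t *_stop = (int32_t *) ((char *) _start + _len); \
            while (_p < _stop) \
                *_p++ = 0; \
        } \
        else \
            memset(_start, _val, _len); \
    } while (0)

/*
 * Variant for callers that promise an int-aligned pointer. Only tests on
 * _val and _len remain in the if; the assert documents the promise and
 * compiles away when NDEBUG is defined.
 */
#define SKETCH_MEMSET_ALIGNED(start, val, len) \
    do { \
        void   *_start = (start); \
        int     _val = (val); \
        size_t  _len = (len); \
        assert((((uintptr_t) _start) & SKETCH_INT_ALIGN_MASK) == 0); \
        if ((_len & SKETCH_INT_ALIGN_MASK) == 0 && \
            _val == 0 && \
            _len <= SKETCH_LOOP_LIMIT) \
        { \
            int32_t *_p = (int32_t *) _start; \
            int32_t *_stop = (int32_t *) ((char *) _start + _len); \
            while (_p < _stop) \
                *_p++ = 0; \
        } \
        else \
            memset(_start, _val, _len); \
    } while (0)

/* Hypothetical node type standing in for a real parse/plan node. */
typedef struct SketchNode
{
    int tag;
    int a;
    int b;
    int c;
} SketchNode;

/* malloc() stands in for palloc here; both return suitably aligned memory. */
SketchNode *
sketch_new_node(int tag)
{
    SketchNode *node = malloc(sizeof(SketchNode));

    if (node == NULL)
        return NULL;
    SKETCH_MEMSET_ALIGNED(node, 0, sizeof(SketchNode));
    node->tag = tag;
    return node;
}

/* Returns 1 if the n bytes at p are all zero, else 0. */
int
sketch_all_zero(const void *p, size_t n)
{
    const unsigned char *b = p;

    for (size_t i = 0; i < n; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

In sketch_new_node the value and length are compile-time constants, so a compiler has the information it needs to fold the remaining tests and keep only the zeroing loop. In SKETCH_MEMSET the pointer test depends on a runtime value, which is the part the message reports as not being optimized away.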