Re: Curious buildfarm failures (fwd)
От | Tom Lane |
---|---|
Тема | Re: Curious buildfarm failures (fwd) |
Дата | |
Msg-id | 14929.1358317689@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: Curious buildfarm failures (fwd) (Andres Freund <andres@2ndquadrant.com>) |
Ответы |
Re: Curious buildfarm failures (fwd)
(Andres Freund <andres@2ndquadrant.com>)
|
Список | pgsql-hackers |
It's a compiler bug. icc 11.1 apparently thinks that this loop in doPickSplit: /* * Update nodes[] array to point into the newly formed innerTuple, so that * we can adjust their downlinks below. */ SGITITERATE(innerTuple, i, node) { nodes[i] = node; } is going to be iterated hundreds of times, because it expends a ridiculous amount of effort on it, starting with a loop prolog that prefetches about a thousand bytes starting at the nodes pointer :-( (Why does it think it needs to prefetch an array it's only going to write into? Is IA64's cache hardware really that stupid?) And it makes use of IA64's bizarre scheme for software-unrolling loops, which I am going to do my darnedest to forget now that I've learned it; but for the purposes of this bit you only need to know that the br.wtop.dptk instruction "renames" registers 32 and up, so that whatever is in r32 when the bottom of the loop is reached will be in r33 on the next iteration, and r33's contents move to r34, etc. In this particular example, this ridiculous complication saves a grand total of no instructions, but nevermind that. Before starting the loop, the code has computed r28 = innerTuple r29 = nodes r26 = r29 + 1200 (this is where it will continue the prefetching...) r33 = 0 r35 = innerTuple + innerTuple->prefixSize + 8 (ie, the initial value of "nt") r27 = innerTuple + innerTuple->prefixSize + 8 + 6 And the body of the SGITERATE loop looks like .b4_110: at top of loop, r35 contains "nt" pointer, r33 contains "i"(p17) st8 [r29]=r35,8 //0: {940:3} 4456 0 store nt at *r29, increment r29 by 8 bytes (thus, assign to nodes[i])(p17) add r32=1,r33 //0: {938:2}4453 0 compute i+1, will be next value of i due to register rename(p17) ld2 r36=[r28] //1: {938:2} 4462 0 fetch first 2 bytes of innerTuple(p17) ld2 r34=[r27],r33 //1: {938:2} 4459 0 fetch last 2 bytes of node tuple, on first iteration anyway ... and then add the value of r33 to r27, which is all wrong(p17) extr.u r37=r36,3,13 //2: {938:2} 44630 extract nNodes from fetched 2 bytes of innerTuple(p17) extr.u r33=r34,0,13 ;; //2: {938:2} 4460 0 extract size field of node tuple, or so it hopes(p17) lfetch.nt1 [r26],8 //3: {938:2} 4454 0 useless prefetch more than a thousand bytes away from the action(p17) cmp4.lt p16,p0=r32,r37 //3: {938:2}4464 0 compare whether r32 (next value of i) < nNodes(p17) add r34=r35,r33 //3: {938:2} 4461 0 set r34 (next value of r35) to r35 + size field, or so it hopes(p16) br.wtop.dptk .b4_110 ;; //3: {938:2}4465 0 rename the registers and do it again, if the cmp4 returned true The problem with this code is that r27, which ought to be always equal to r35 + 6, is incremented by the wrong amount in the second ld2 instruction, namely by the "i" counter. The value that *should* get added to it is the node size field, ie the same value that's loaded into r33 below that and then added to r35 in the last add instruction (and then stored into r34, which is about to become r35). So I think the compiler has outsmarted itself as to which rotating register contains which value when. The result of this breakage is that the set of node pointers computed by the loop is totally wrong for all values after the first. This means the later loop that's trying to insert the now-known downlink TIDs into the innerTuple's nodes is storing those TIDs into random locations, and thus tromping all over memory. The case where we get a reproducible crash is where the Asserts in that loop notice that what's at the pointed-to addresses isn't what's expected, before we manage to clobber anything critical. Diagnosis: icc 11.1 is not ready for prime time. I shall now retire with a glass of wine and attempt to forget everything I just learned about IA64. What a bizarre architecture ... regards, tom lane
В списке pgsql-hackers по дате отправления: