Обсуждение: Re: Alpha tas() patch

Поиск
Список
Период
Сортировка

Re: Alpha tas() patch

От
Tom Lane
Дата:
Brent Verner <brent@rcfile.org> writes:
>   This is a revised patch that I sent earlier to allow building
> pg-7.1 with gcc as well as DEC's cc. I've had good results with this
> applied. Could some other Alpha users try this out. Even better, could
> an Alpha asm guru look over the asm that I'm using (instead of the
> original asm in the file).

tas() is not supposed to contain a loop.  It can succeed or fail, but
it should not wait.

The code now in s_lock.h does seem rather gratuitously obscure about
the instructions it uses to load constants.  I'd suggest

static __inline__ int
tas(volatile slock_t *lock)
{register slock_t _res;

__asm__("     ldq   $0, %0           \n\             bne   $0, 2f           \n\             ldq_l $0, %0           \n\
          bne   $0, 2f           \n\             mov   1, $0            \n\             stq_c $0, %0           \n\
      beq   $0, 2f           \n\             mov   0, %1            \n\             mb                     \n\
  jmp   $31, 3f          \n\          2: mov   1, %1            \n\          3: nop      ": "=m"(*lock), "=r"(_res):
:"0");
   return (int) _res;
}

As you say, the first two instructions don't seem to be really
necessary.  I suppose the idea is to avoid setting the processor
lock address register unless there's a pretty good chance of
acquiring the lock ... but why bother?  Does LDQ_L take a long
time to execute?  Seems like avoiding the extra two instructions
would be a win most of the time.
           regards, tom lane


Re: Alpha tas() patch

От
Brent Verner
Дата:
On 27 Dec 2000 at 21:37 (-0500), Tom Lane wrote:
| Brent Verner <brent@rcfile.org> writes:
| >   This is a revised patch that I sent earlier to allow building
| > pg-7.1 with gcc as well as DEC's cc. I've had good results with this
| > applied. Could some other Alpha users try this out. Even better, could
| > an Alpha asm guru look over the asm that I'm using (instead of the
| > original asm in the file).
| 
| tas() is not supposed to contain a loop.  It can succeed or fail, but
| it should not wait.
| 
| The code now in s_lock.h does seem rather gratuitously obscure about
| the instructions it uses to load constants.  I'd suggest
| 
| static __inline__ int
| tas(volatile slock_t *lock)
| {
|  register slock_t _res;
| 
| __asm__("     ldq   $0, %0           \n\
|               bne   $0, 2f           \n\
|               ldq_l $0, %0           \n\
|               bne   $0, 2f           \n\
|               mov   1, $0            \n\
|               stq_c $0, %0           \n\
|               beq   $0, 2f           \n\
|               mov   0, %1            \n\
|               mb                     \n\
|               jmp   $31, 3f          \n\
|            2: mov   1, %1            \n\
|            3: nop      ": "=m"(*lock), "=r"(_res): :"0");
| 
|     return (int) _res;
| }

another loop-free version of TAS that /seems/ to work as desired. WRT to your
seeing the shutdown lock failure, what are you doing to provoke this bug?, 
because I have not seen it with either version of the TAS that I've used.
brent


#define TAS(lock) tas_dbv(lock)
#define S_UNLOCK(lock) do { __asm__("mb"); *(lock) = 0; } while (0)

static
__inline__
int
tas(volatile slock_t* __lock)
{  register slock_t  __rv;  register slock_t  __temp;
  __asm__  __volatile__  (  "1:  ldq_l %0, %1       \n" /* load (and lock) __lock                 */  "    and   %0, 1,
%2   \n" /* if bit 1 is set, store 1 in __rv       */  "    bne   %2, 2f       \n" /* if __rv != 0, already locked.
leave   */  "    xor   %0, 1, %0    \n" /* set bit 1 in slock_t                   */  "    stq_c %0, %1       \n" /*
store__lock, only stores if we locked */  "    mb                 \n" /* memory block (necessary?)              */  "
2: nop              \n" /* exit point                             */    : "=&r" (__temp),      "=m"  (*__lock),
"=&r"(__rv)    : "m"   (*__lock)  );
 
  return (int) __rv;
}



Re: Re: Alpha tas() patch

От
Adriaan Joubert
Дата:
Hi,
I missed the beginning of this thread. Are you doing this for Tru64 or
for Linux? For Tru64 there are macros in /usr/include/alpha/builtins.h
which do the job.

Doing this in assembler is totally non-trivial, as most versions are
only liable to work on single-processor machines and not on SMP boxes
(the problem with the previous linux TAS, I believe).

Adriaan


Re: Alpha tas() patch

От
Tom Lane
Дата:
Brent Verner <brent@rcfile.org> writes:
> another loop-free version of TAS that /seems/ to work as desired.

Since it doesn't check to see if the stq_c succeeded, it can't possibly
be right...

> WRT to your seeing the shutdown lock failure, what are you doing to
> provoke this bug?

"pg_ctl stop".

I am wondering if the machine I am testing on is different from yours.
I know there are multiple revisions of Alpha hardware out there; is
there a way to test which mask rev is running a particular machine?
        regards, tom lane


Re: Re: Alpha tas() patch

От
Brent Verner
Дата:
On 28 Dec 2000 at 17:03 (+0200), Adriaan Joubert wrote:
| Hi,
| 
|     I missed the beginning of this thread. Are you doing this for Tru64 or
| for Linux? For Tru64 there are macros in /usr/include/alpha/builtins.h
| which do the job.

gcc + Tru64, since gcc-2.95.2 doesn't implement the builtins defined in
alpha/builtins.h. Having never seen linux on an Alpha box, I can only assume 
that the asm (instruction-wise) would be(have) the same.
brent


Re: Re: Alpha tas() patch

От
Tom Lane
Дата:
Adriaan Joubert <a.joubert@albourne.com> writes:
> For Tru64 there are macros in /usr/include/alpha/builtins.h
> which do the job.

It would be interesting to see the assembly code that those macros
expand to.

> Doing this in assembler is totally non-trivial, as most versions are
> only liable to work on single-processor machines and not on SMP boxes
> (the problem with the previous linux TAS, I believe).

You've said that before, but whether it's difficult or not is completely
irrelevant.  We have to develop a version that works; there is no
alternative.

Has anyone dug into the kernel sources to see what's used there for
cross-processor locks?  It should be pretty much the same problem.
        regards, tom lane


Re: Alpha tas() patch

От
Brent Verner
Дата:
On 28 Dec 2000 at 10:48 (-0500), Tom Lane wrote:
| Brent Verner <brent@rcfile.org> writes:
| > another loop-free version of TAS that /seems/ to work as desired.
| 
| Since it doesn't check to see if the stq_c succeeded, it can't possibly
| be right...

right, I just realized that while composing a now-deleted response to 
Adriaan :-)

back to the TAS-loop issue. when using 

| > WRT to your seeing the shutdown lock failure, what are you doing to
| > provoke this bug?
| 
| "pg_ctl stop".

I see this with the version of TAS() that you recently suggested, but not 
with either of the versions I'd hacked up.

| I am wondering if the machine I am testing on is different from yours.
| I know there are multiple revisions of Alpha hardware out there; is
| there a way to test which mask rev is running a particular machine?

I have no clue right now, but I'll look into it.
brent


Re: Alpha tas() patch

От
Tom Lane
Дата:
Brent Verner <brent@rcfile.org> writes:
> I see this with the version of TAS() that you recently suggested, but not 
> with either of the versions I'd hacked up.

Hm.  Your second version might incorrectly appear to work because it's
not checking for stq_c failure.  The first one loops until it succeeds,
so if the deal is that stq_c might work sometimes but not always
(dependent on, say, cache miss effects) then that might explain why it
works.  Clearly there's more here to worry about than the available
documentation suggests.

Awhile back I received some mail from a guy at Compaq saying that a
taken branch between ldq_l and stq_c is no good, apparently because some
versions of the processor will clear the lock register when any transfer
of control occurs.  But it sounded like a not-taken conditional branch
is OK.  Nonetheless, it's interesting that the
__INTERLOCKED_TESTBITSS_QUAD macro doesn't use any branch at all between
those two instructions.  Maybe we should try it that way.

Is there any documentation on exactly what __INTERLOCKED_TESTBITSS_QUAD
is supposed to do?  It looks like it's computing which bit to set in the
quadword, but I'm not sure I'm interpreting the code correctly.

I'm going to instrument my version a little more to see exactly which
step is failing ... I suspect it's the stq_c result test but want to
prove that before trying alternatives.

BTW, is your machine single- or multi-processor?
        regards, tom lane


Re: Alpha tas() patch

От
Brent Verner
Дата:
On 28 Dec 2000 at 12:41 (-0500), Tom Lane wrote:
| Brent Verner <brent@rcfile.org> writes:
| > I see this with the version of TAS() that you recently suggested, but not 
| > with either of the versions I'd hacked up.
| 
| Hm.  Your second version might incorrectly appear to work because it's
| not checking for stq_c failure.  The first one loops until it succeeds,
| so if the deal is that stq_c might work sometimes but not always
| (dependent on, say, cache miss effects) then that might explain why it
| works.  Clearly there's more here to worry about than the available
| documentation suggests.

what have I stumbled into :).  'damnit Jim!, I'm just a perl hacker.'

| Awhile back I received some mail from a guy at Compaq saying that a
| taken branch between ldq_l and stq_c is no good, apparently because some
| versions of the processor will clear the lock register when any transfer
| of control occurs.  But it sounded like a not-taken conditional branch
| is OK.  Nonetheless, it's interesting that the
| __INTERLOCKED_TESTBITSS_QUAD macro doesn't use any branch at all between
| those two instructions.  Maybe we should try it that way.

I agree with this assessment from my reading of the asm docs so far. there
is also the (w)mb instruction whose importance/effect I'm not sure of...

| Is there any documentation on exactly what __INTERLOCKED_TESTBITSS_QUAD
| is supposed to do?  It looks like it's computing which bit to set in the
| quadword, but I'm not sure I'm interpreting the code correctly.

yes, WRT to the 'which bit', since __INTERLOCKED_TESTBITSS_QUAD takes 
as second arg, which is the bit to set.

| I'm going to instrument my version a little more to see exactly which
| step is failing ... I suspect it's the stq_c result test but want to
| prove that before trying alternatives.

my understanding of the ldq_l/stc_q interaction, is: (forgive me if you
already grok all of this, I'm mostly thinking out loud.)

ldq_l $0, $1       # $0 = $1;
addq  $0,  1, $2   # $2 = $0 + 1;
stq_c $2, $1       # $1 = $2;
cmpeq $2, $1, $3   # $3 = $2 == $1 ? 1 : 0

if $3 == 1 we successfully modified the locked value ($1). I /believe/ this
will hold true across multi-processors as well, as all of the builtins
I've seen asm for seem to have similar structure.

| BTW, is your machine single- or multi-processor?

single.
brent


Re: Alpha tas() patch

От
Tom Lane
Дата:
Brent Verner <brent@rcfile.org> writes:
> what have I stumbled into :).  'damnit Jim!, I'm just a perl hacker.'

I've found an online version of the AXP Architecture Handbook at 
ftp://ftp.netbsd.org/pub/NetBSD/misc/dec-docs/index.html
in particular the file ec-qd2ka-te.ps.gz listed near the top of
the page.  After perusing same, it seems that the tas() code presently
in CVS sources *is correct*, although not very transparently written
(for example, "or  $31, 1, $0" would be more clearly expressed as
"mov 1, $0").  See sections 4.2.4, 4.2.5, 4.11.4, and 5.5, especially
the sample code for a software critical section in 5.5.3.

At this point my opinion is that there's nothing functionally wrong
with the existing Alpha assembly code in s_lock.h, and that the
CreateCheckPoint failure is due to CreateCheckPoint misusing the routine
(see my recent message to pghackers).

I believe that we should change s_lock.h to use the inline assembly code
when using gcc on Alpha, regardless of OS.

The more interesting question is what to do for non-gcc compilers.
The INTERLOCKED_TESTBITSS_QUAD macro is not really suitable, primarily
because it does not include an "mb" instruction.  (It also violates the
architecture manual's recommendation to not do redundant stores into a
lock word, but that's not so critical.)  Is there anything in
<alpha/builtins.h> that provides access to the memory-barrier
instruction?
        regards, tom lane


Re: Alpha tas() patch

От
Brent Verner
Дата:
On 28 Dec 2000 at 17:40 (-0500), Tom Lane wrote:
| Okay ... I guess the LOCK_LONG macros are our best shot.  Here is a
| proposed new Alpha section for s_lock.h.  Would you try it and let me
| know how it works for you?
| 
| Note that this will NOT fix the CreateCheckPoint shutdown error; don't
| worry about that.  The thing to look at is whether the parallel regress
| tests pass.

after fresh CVS update: geometry, float8, and oid are still failing, and I
now am seeing the CheckPoint shutdown bug, which leaves the db in an unusable
state after running the regression tests.

mnemosyne:~/src/gcc-pgsql
pgadmin$ pg_ctl start
postmaster successfully started up
mnemosyne:~/src/gcc-pgsql
pgadmin$ DEBUG:  starting up
DEBUG:  database system was interrupted at 2000-12-28 21:28:39
DEBUG:  CheckPoint record at (0, 1563064)
DEBUG:  Redo record at (0, 1563064); Undo record at (0, 0); Shutdown TRUE
DEBUG:  NextTransactionId: 615; NextOid: 18720
DEBUG:  database system was not properly shut down; automatic recovery in progress...
DEBUG:  redo starts at (0, 1563128)
mnemosyne:~/src/gcc-pgsql
pgadmin$ Startup failed - abort



Re: Alpha tas() patch

От
Tom Lane
Дата:
> after fresh CVS update: geometry, float8, and oid are still failing,

You're running this on DEC's cc, right?  Geometry and float8 are a
matter of fixing the expected output, I think.  I'm surprised that the
OID test is failing for you --- it passes for me on that Debian box.
Could you step through the oidin_subr routine in
src/backend/utils/adt/oid.c and see what it's getting for intermediate
results?

> I now am seeing the CheckPoint shutdown bug, which leaves the db in an
> unusable state after running the regression tests.

> mnemosyne:~/src/gcc-pgsql
> pgadmin$ pg_ctl start
> postmaster successfully started up
> mnemosyne:~/src/gcc-pgsql
> pgadmin$ DEBUG:  starting up
> DEBUG:  database system was interrupted at 2000-12-28 21:28:39
> DEBUG:  CheckPoint record at (0, 1563064)
> DEBUG:  Redo record at (0, 1563064); Undo record at (0, 0); Shutdown TRUE
> DEBUG:  NextTransactionId: 615; NextOid: 18720
> DEBUG:  database system was not properly shut down; automatic recovery in progress...
> DEBUG:  redo starts at (0, 1563128)
> mnemosyne:~/src/gcc-pgsql
> pgadmin$ Startup failed - abort

Oh, that's interesting.  On the Debian box, restart works fine after
CheckPoint barfs.  We need to look into that and find out why --- if
restart fails in this situation, it would imply that restart after a
system crash would probably fail too.  Can you trace through it and
figure out where it's failing?
        regards, tom lane


Re: Alpha tas() patch

От
Brent Verner
Дата:
On 28 Dec 2000 at 23:08 (-0500), Tom Lane wrote:
| > after fresh CVS update: geometry, float8, and oid are still failing,
| 
| You're running this on DEC's cc, right?  Geometry and float8 are a
| matter of fixing the expected output, I think.  I'm surprised that the
| OID test is failing for you --- it passes for me on that Debian box.
| Could you step through the oidin_subr routine in
| src/backend/utils/adt/oid.c and see what it's getting for intermediate
| results?

no, these results are with gcc, sorry for not stating that :(. With DEC's 
cc, I do not see the CheckPoint bug, the oid and geometry regression tests
still fail (same result as before new TAS code, so the it appears that
__LOCK_LONG_RETRY() does the right thing for us).

| > I now am seeing the CheckPoint shutdown bug, which leaves the db in an
| > unusable state after running the regression tests.

| > mnemosyne:~/src/gcc-pgsql
| > pgadmin$ Startup failed - abort
| 
| Oh, that's interesting.  On the Debian box, restart works fine after
| CheckPoint barfs.  We need to look into that and find out why --- if
| restart fails in this situation, it would imply that restart after a
| system crash would probably fail too.  Can you trace through it and
| figure out where it's failing?

Yes, I'll dig into this, but won't be able to do so until late(r) in
the day.

cheers.brent