Обсуждение: pg_stat_database shows userid as OID

Поиск
Список
Период
Сортировка

pg_stat_database shows userid as OID

От
Alvaro Herrera
Дата:
Hello hackers,

In the pg_stat_activity view, the usesysid is shown as having type Oid.
However pg_shadow says it's an integer.  Is there a reason? Looks like
a bug.

This patch seems to corrects this issue, but I don't know if there's
something else involved.

Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /projects/cvsroot/pgsql-server/src/include/catalog/pg_proc.h,v
retrieving revision 1.276
diff -c -r1.276 pg_proc.h
*** src/include/catalog/pg_proc.h    2002/11/08 17:27:03    1.276
--- src/include/catalog/pg_proc.h    2002/11/16 23:18:44
***************
*** 2738,2744 **** DESCR("Statistics: PID of backend"); DATA(insert OID = 1938 (  pg_stat_get_backend_dbid        PGNSP
PGUID12 f f t f s 1 26 "23"    pg_stat_get_backend_dbid - _null_ )); DESCR("Statistics: Database ID of backend");
 
! DATA(insert OID = 1939 (  pg_stat_get_backend_userid    PGNSP PGUID 12 f f t f s 1 26 "23"
pg_stat_get_backend_userid- _null_ )); DESCR("Statistics: User ID of backend"); DATA(insert OID = 1940 (
pg_stat_get_backend_activity   PGNSP PGUID 12 f f t f s 1 25 "23"    pg_stat_get_backend_activity - _null_ ));
DESCR("Statistics:Current query of backend");
 
--- 2738,2744 ---- DESCR("Statistics: PID of backend"); DATA(insert OID = 1938 (  pg_stat_get_backend_dbid        PGNSP
PGUID12 f f t f s 1 26 "23"    pg_stat_get_backend_dbid - _null_ )); DESCR("Statistics: Database ID of backend");
 
! DATA(insert OID = 1939 (  pg_stat_get_backend_userid    PGNSP PGUID 12 f f t f s 1 23 "23"
pg_stat_get_backend_userid- _null_ )); DESCR("Statistics: User ID of backend"); DATA(insert OID = 1940 (
pg_stat_get_backend_activity   PGNSP PGUID 12 f f t f s 1 25 "23"    pg_stat_get_backend_activity - _null_ ));
DESCR("Statistics:Current query of backend");
 
Index: src/backend/utils/adt/pgstatfuncs.c
===================================================================
RCS file: /projects/cvsroot/pgsql-server/src/backend/utils/adt/pgstatfuncs.c,v
retrieving revision 1.8
diff -c -r1.8 pgstatfuncs.c
*** src/backend/utils/adt/pgstatfuncs.c    2002/08/20 04:47:52    1.8
--- src/backend/utils/adt/pgstatfuncs.c    2002/11/16 23:18:44
***************
*** 272,278 ****     if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)         PG_RETURN_NULL(); 
!     PG_RETURN_OID(beentry->userid); }  
--- 272,278 ----     if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)         PG_RETURN_NULL(); 
!     PG_RETURN_INT32(beentry->userid); }  
-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"El miedo atento y previsor es la madre de la seguridad" (E. Burke)


Re: pg_stat_database shows userid as OID

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> In the pg_stat_activity view, the usesysid is shown as having type Oid.
> However pg_shadow says it's an integer.  Is there a reason?

There's been disagreement for a long time over whether userids should be
OIDs or ints.  If you want to introduce consistency then it's going to
take a lot more than a one-line patch.  (First you'll need to convince
the partisans involved which answer is the right one.)

> Looks like a bug.

Not as long as OID is 4 bytes.

I'd recommend not making any piecemeal changes, especially not when
there's not yet a consensus which way to converge.
        regards, tom lane


Re: pg_stat_database shows userid as OID

От
Alvaro Herrera
Дата:
On Sun, Nov 17, 2002 at 01:16:29PM -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> > In the pg_stat_activity view, the usesysid is shown as having type Oid.
> > However pg_shadow says it's an integer.  Is there a reason?
> 
> There's been disagreement for a long time over whether userids should be
> OIDs or ints.  If you want to introduce consistency then it's going to
> take a lot more than a one-line patch.  (First you'll need to convince
> the partisans involved which answer is the right one.)

Oh, I see.  I wasn't aware of this.  I don't really know which answer is
"the right one".  I don't care a lot about this thing either, but I'll
keep it into my list of amusements, and will probably even dig into
the archives sometime.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
Criptografía: Poderosa técnica algorítmica de codificación que es
empleada en la creación de manuales de computadores.


Re: pg_stat_database shows userid as OID

От
Bruce Momjian
Дата:
Tom Lane wrote:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> > In the pg_stat_activity view, the usesysid is shown as having type Oid.
> > However pg_shadow says it's an integer.  Is there a reason?
> 
> There's been disagreement for a long time over whether userids should be
> OIDs or ints.  If you want to introduce consistency then it's going to
> take a lot more than a one-line patch.  (First you'll need to convince
> the partisans involved which answer is the right one.)
> 
> > Looks like a bug.
> 
> Not as long as OID is 4 bytes.
> 
> I'd recommend not making any piecemeal changes, especially not when
> there's not yet a consensus which way to converge.

Well, seems we should make it consistent at least.  Let's decide and
make it done.  I think some wanted it to be an int so they could use the
same unix uid for pg_shadow, but I think we aren't using that idea much
anymore.  However, right now, it looks like the super user is '1', and
other users start numbering from 100.  That at least suggests int rather
than oid.

I am not particular in what we choose, but I do think there is a good
argument to make it consistent.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


CLUSTER ALL syntax

От
Bruce Momjian
Дата:
In looking at the CLUSTER ALL patch I have applied, I am now wondering
why the ALL keyword is used.  When we do VACUUM, we don't use ALL. 
VACUUM vacuums all tables.  Shouldn't' CLUSTER alone do the same thing. 
And what about REINDEX?  That seems to have a different syntax from the
other two.  Seems there should be some consistency.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: pg_stat_database shows userid as OID

От
Tom Lane
Дата:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> I'd recommend not making any piecemeal changes, especially not when
>> there's not yet a consensus which way to converge.

> Well, seems we should make it consistent at least.

I think the original argument stemmed from the idea that we ought to use
pg_shadow's OID column as the user identifier (eliminating usesysid per
se).  This seems like a good idea at first but I think it has a couple
of fatal problems: * disappearance of pg_shadow.usesysid column will doubtless break some   applications * if we use
OIDthen it's much more difficult to support explicit   assignment of userid
 

> I think some wanted it to be an int so they could use the
> same unix uid for pg_shadow, but I think we aren't using that idea much
> anymore.

I don't think anyone worries about making usesysid match /etc/passwd
anymore, but nonetheless CREATE USER WITH SYSID is still an essential
capability.  What if you drop a user accidentally while he still owns
objects?  You *must* be able to recreate him with the same sysid as
before.  pg_depend cannot save us from this kind of mistake, either,
since users span databases.

So it seems to me that we must keep pg_shadow.usesysid as a separate
column and not try to make it the OID of pg_shadow.

Given that decision, the argument for making it be type OID seems very
weak, so I'd lean to the "use int4" camp myself.  But I'm not sure
everyone agrees.  I think Peter was strongly in favor of OID when he
was revising the session-authorization code (that's why it all uses OID
for user IDs...)

As far as the actual C code goes, I'd lean to creating new typedefs
UserId and GroupId (or some such names) and making all the routine
and variable declarations use those, and not either OID or int4.
But I'm not excited about promoting these typedefs into actual SQL
types, as was done for TransactionId and CommandId; the payback seems
much less than the effort needed.
        regards, tom lane


Re: CLUSTER ALL syntax

От
Tom Lane
Дата:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> In looking at the CLUSTER ALL patch I have applied, I am now wondering
> why the ALL keyword is used.  When we do VACUUM, we don't use ALL. 
> VACUUM vacuums all tables.  Shouldn't' CLUSTER alone do the same thing. 

I agree, lose the ALL.

> And what about REINDEX?  That seems to have a different syntax from the
> other two.  Seems there should be some consistency.

We don't have a REINDEX ALL, and I'm not in a hurry to invent one.
(Especially, I'd not want to see Alvaro spending time on that instead
of fixing the underlying btree-compaction problem ;-))
        regards, tom lane


Re: CLUSTER ALL syntax

От
Alvaro Herrera
Дата:
On Sun, Nov 17, 2002 at 04:42:01PM -0500, Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > In looking at the CLUSTER ALL patch I have applied, I am now wondering
> > why the ALL keyword is used.  When we do VACUUM, we don't use ALL. 
> > VACUUM vacuums all tables.  Shouldn't' CLUSTER alone do the same thing. 
> 
> I agree, lose the ALL.

Well, in my original patch (the one submitted just when 7.3 was going
into beta) there was no ALL.  I decided to put it in for subsequent
patches for no good reason.

> > And what about REINDEX?  That seems to have a different syntax from the
> > other two.  Seems there should be some consistency.
> 
> We don't have a REINDEX ALL, and I'm not in a hurry to invent one.
> (Especially, I'd not want to see Alvaro spending time on that instead
> of fixing the underlying btree-compaction problem ;-))

Actually, I'm planning to do the freelist thing, then the btree
compaction and then replace the current REINDEX code with the compaction
code, probably including some means to do REINDEX ALL.

It makes me really proud to hear such a note of confidence in my work.
Thank you very much.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
Officer Krupke, what are we to do?
Gee, officer Krupke, Krup you! (West Side Story, "Gee, Officer Krupke")


Re: CLUSTER ALL syntax

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> Actually, I'm planning to do the freelist thing, then the btree
> compaction and then replace the current REINDEX code with the compaction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> code, probably including some means to do REINDEX ALL.

Uh ... no.  The primary purpose of REINDEX is to recover from corrupted
indexes, so it has to be based on a rebuild strategy not a compaction
strategy.

If you want to add a REINDEX ALL for completeness, go ahead, but I think
the need for it will be vanishingly small once vacuum compacts btrees
properly.
        regards, tom lane


Re: pg_stat_database shows userid as OID

От
Bruce Momjian
Дата:
I totally agree with what you have said.  Peter, can you clarify your
reasoning for OID for user/group id?

---------------------------------------------------------------------------

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> I'd recommend not making any piecemeal changes, especially not when
> >> there's not yet a consensus which way to converge.
> 
> > Well, seems we should make it consistent at least.
> 
> I think the original argument stemmed from the idea that we ought to use
> pg_shadow's OID column as the user identifier (eliminating usesysid per
> se).  This seems like a good idea at first but I think it has a couple
> of fatal problems:
>   * disappearance of pg_shadow.usesysid column will doubtless break some
>     applications
>   * if we use OID then it's much more difficult to support explicit
>     assignment of userid
> 
> > I think some wanted it to be an int so they could use the
> > same unix uid for pg_shadow, but I think we aren't using that idea much
> > anymore.
> 
> I don't think anyone worries about making usesysid match /etc/passwd
> anymore, but nonetheless CREATE USER WITH SYSID is still an essential
> capability.  What if you drop a user accidentally while he still owns
> objects?  You *must* be able to recreate him with the same sysid as
> before.  pg_depend cannot save us from this kind of mistake, either,
> since users span databases.
> 
> So it seems to me that we must keep pg_shadow.usesysid as a separate
> column and not try to make it the OID of pg_shadow.
> 
> Given that decision, the argument for making it be type OID seems very
> weak, so I'd lean to the "use int4" camp myself.  But I'm not sure
> everyone agrees.  I think Peter was strongly in favor of OID when he
> was revising the session-authorization code (that's why it all uses OID
> for user IDs...)
> 
> As far as the actual C code goes, I'd lean to creating new typedefs
> UserId and GroupId (or some such names) and making all the routine
> and variable declarations use those, and not either OID or int4.
> But I'm not excited about promoting these typedefs into actual SQL
> types, as was done for TransactionId and CommandId; the payback seems
> much less than the effort needed.
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
> 
> http://archives.postgresql.org
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: CLUSTER ALL syntax

От
Bruce Momjian
Дата:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > In looking at the CLUSTER ALL patch I have applied, I am now wondering
> > why the ALL keyword is used.  When we do VACUUM, we don't use ALL. 
> > VACUUM vacuums all tables.  Shouldn't' CLUSTER alone do the same thing. 
> 
> I agree, lose the ALL.

Good.  I can take care of that or someone can submit a patch.

> > And what about REINDEX?  That seems to have a different syntax from the
> > other two.  Seems there should be some consistency.
> 
> We don't have a REINDEX ALL, and I'm not in a hurry to invent one.
> (Especially, I'd not want to see Alvaro spending time on that instead
> of fixing the underlying btree-compaction problem ;-))

My point for REINDEX was a little different.  The man pages shows:
REINDEX { DATABASE | TABLE | INDEX } <replaceable    class="PARAMETER">name</replaceable> [ FORCE ]

where we don't have ALL but we do have DATABASE.  Do we need that
tri-valued secodn field for reindex because you can reindex a table _or_
and index, and hence DATABASE makes sense?  I am just asking.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: CLUSTER ALL syntax

От
Alvaro Herrera
Дата:
On Sun, Nov 17, 2002 at 06:43:38PM -0500, Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:

> > > And what about REINDEX?  That seems to have a different syntax from the
> > > other two.  Seems there should be some consistency.
> > 
> > We don't have a REINDEX ALL, and I'm not in a hurry to invent one.
> > (Especially, I'd not want to see Alvaro spending time on that instead
> > of fixing the underlying btree-compaction problem ;-))
> 
> My point for REINDEX was a little different.  The man pages shows:
> 
>     REINDEX { DATABASE | TABLE | INDEX } <replaceable
>         class="PARAMETER">name</replaceable> [ FORCE ]
> 
> where we don't have ALL but we do have DATABASE.  Do we need that
> tri-valued secodn field for reindex because you can reindex a table _or_
> and index, and hence DATABASE makes sense?  I am just asking.

REINDEX DATABASE is for system indexes only, it's not the same that one
would think of REINDEX alone (which is all indexes on all tables, isn't
it?).

What I don't understand is what are the parameters in the
ReindexDatabase function for.  For example, the boolean all is always
false in tcop/utility.c (and there are no other places that the function
is called).  Also, the database name is checked to be equal to a
"constant" value, the database name that the standalone backend is
connected to.  Why are those useful?

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"No renuncies a nada. No te aferres a nada"


Re: CLUSTER ALL syntax

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> What I don't understand is what are the parameters in the
> ReindexDatabase function for.  For example, the boolean all is always
> false in tcop/utility.c (and there are no other places that the function
> is called).  Also, the database name is checked to be equal to a
> "constant" value, the database name that the standalone backend is
> connected to.  Why are those useful?

Well, passing all=true would implement REINDEX ALL ...

As for the database name, we could perhaps change the syntax to just
REINDEX DATABASE; not sure if it's worth the trouble.
        regards, tom lane


Re: CLUSTER ALL syntax

От
"Christopher Kings-Lynne"
Дата:
> In looking at the CLUSTER ALL patch I have applied, I am now wondering
> why the ALL keyword is used.  When we do VACUUM, we don't use ALL. 
> VACUUM vacuums all tables.  Shouldn't' CLUSTER alone do the same thing. 
> And what about REINDEX?  That seems to have a different syntax from the
> other two.  Seems there should be some consistency.

Yeah - I agree!

Chris



Re: CLUSTER ALL syntax

От
Hiroshi Inoue
Дата:
Alvaro Herrera wrote:
> 
> On Sun, Nov 17, 2002 at 06:43:38PM -0500, Bruce Momjian wrote:
> > Tom Lane wrote:
> > > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> 
> > > > And what about REINDEX?  That seems to have a different
> > > > syntax from the other two.  Seems there should be some > > > > consistency.
> > >
> > > We don't have a REINDEX ALL, and I'm not in a hurry to invent one.
> > > (Especially, I'd not want to see Alvaro spending time on that
> > > instead of fixing the underlying btree-compaction problem ;-))
> >
> > My point for REINDEX was a little different.  The man pages shows:
> >
> >       REINDEX { DATABASE | TABLE | INDEX } <replaceable
> >               class="PARAMETER">name</replaceable> [ FORCE ]
> >
> > where we don't have ALL but we do have DATABASE.  Do we need that
> > tri-valued secodn field for reindex because you can reindex a
> > table _or_ and index, and hence DATABASE makes sense?  I am just
> > asking.
> 
> REINDEX DATABASE is for system indexes only, it's not the same that one
> would think of REINDEX alone (which is all indexes on all tables, isn't
> it?).

Probably You don't understand the initial purpose of REINDEX.
It isn't an SQL standard at all and was intended to recover
corrupted system indexes. It's essentially an unsafe operation
and so the operation was inhibited other than under standalone
postgres. I also made the command a little hard to use to avoid
unexpected invocations e.g. REINDEX DATABASE requires an unnecessary
database name parameter or FORCE is still needed though it's a
requisite parameter now.

REINDEX is also used to compact indexes now. It's good but
the purpose is different from the initial one and we would
have to reorganize the functionalities e.g. the table data
isn't needed to compact the indexes etc.  
> What I don't understand is what are the parameters in the
> ReindexDatabase function for.  For example, the boolean all
> is always false in tcop/utility.c (and there are no other
> places that the function is called). 

I intended to implement the *true* case also then
but haven't done it yet, sorry.

regards,
Hiroshi Inouehttp://w2422.nsk.ne.jp/~inoue/


Re: CLUSTER ALL syntax

От
Bruce Momjian
Дата:
Alvaro Herrera wrote:
> On Sun, Nov 17, 2002 at 04:42:01PM -0500, Tom Lane wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > In looking at the CLUSTER ALL patch I have applied, I am now wondering
> > > why the ALL keyword is used.  When we do VACUUM, we don't use ALL. 
> > > VACUUM vacuums all tables.  Shouldn't' CLUSTER alone do the same thing. 
> > 
> > I agree, lose the ALL.
> 
> Well, in my original patch (the one submitted just when 7.3 was going
> into beta) there was no ALL.  I decided to put it in for subsequent
> patches for no good reason.

I have updated CVS to make the syntax CLUSTER rather than CLUSTER ALL.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: pg_stat_database shows userid as OID

От
Bruce Momjian
Дата:
Does anyone want userid to be an OID?  Peter?  Anyone?

If not, I will add it to the TODO list or work on the patch myself.

---------------------------------------------------------------------------

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> I'd recommend not making any piecemeal changes, especially not when
> >> there's not yet a consensus which way to converge.
> 
> > Well, seems we should make it consistent at least.
> 
> I think the original argument stemmed from the idea that we ought to use
> pg_shadow's OID column as the user identifier (eliminating usesysid per
> se).  This seems like a good idea at first but I think it has a couple
> of fatal problems:
>   * disappearance of pg_shadow.usesysid column will doubtless break some
>     applications
>   * if we use OID then it's much more difficult to support explicit
>     assignment of userid
> 
> > I think some wanted it to be an int so they could use the
> > same unix uid for pg_shadow, but I think we aren't using that idea much
> > anymore.
> 
> I don't think anyone worries about making usesysid match /etc/passwd
> anymore, but nonetheless CREATE USER WITH SYSID is still an essential
> capability.  What if you drop a user accidentally while he still owns
> objects?  You *must* be able to recreate him with the same sysid as
> before.  pg_depend cannot save us from this kind of mistake, either,
> since users span databases.
> 
> So it seems to me that we must keep pg_shadow.usesysid as a separate
> column and not try to make it the OID of pg_shadow.
> 
> Given that decision, the argument for making it be type OID seems very
> weak, so I'd lean to the "use int4" camp myself.  But I'm not sure
> everyone agrees.  I think Peter was strongly in favor of OID when he
> was revising the session-authorization code (that's why it all uses OID
> for user IDs...)
> 
> As far as the actual C code goes, I'd lean to creating new typedefs
> UserId and GroupId (or some such names) and making all the routine
> and variable declarations use those, and not either OID or int4.
> But I'm not excited about promoting these typedefs into actual SQL
> types, as was done for TransactionId and CommandId; the payback seems
> much less than the effort needed.
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
> 
> http://archives.postgresql.org
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073