Обсуждение: our checks for read-only queries are not great

Поиск
Список
Период
Сортировка

our checks for read-only queries are not great

От
Robert Haas
Дата:
I spent some time studying the question of how we classify queries as
either read-only or not, and our various definitions of read-only, and
found some bugs. Specifically:

1. check_xact_readonly() prohibits most kinds of DDL in read-only
transactions, but a bunch of recently-added commands were not added to
the list. The missing node types are T_CreatePolicyStmt,
T_AlterPolicyStmt, T_CreateAmStmt, T_CreateStatsStmt,
T_AlterStatsStmt, and T_AlterCollationStmt, which means you can run
these commands in a read-only transaction with no problem and even
attempt to run them on a standby. The ones I tested on a standby all
fail with random-ish error messages due to lower-level checks, but
that's not a great situation.

2. There are comments in utility.c which assert that certain commands
are "forbidden in parallel mode due to CommandIsReadOnly." That's
technically correct, but silly and misleading.These commands wouldn't
be running in parallel mode unless they were running inside of a
function or procedure or something, and, if they are,
CommandIsReadOnly() checks in spi.c or functions.c would prevent not
only these commands but, in fact, all utility commands, so calling out
those particular ones is just adding confusion. Also, the underlying
restriction is unnecessary, because there's no good reason to prevent
the use of things like SHOW and DO in parallel mode, yet we currently
do.

The problems mentioned under (1) are technically the fault of the
people who wrote, reviewed, and committed the patches which added
those command types, and the problems mentioned under (2) are
basically my fault, dating back to the original ParallelContext patch.
However, I think that all of them can be tracked back to a more
fundamental underlying cause, which is that the way that the various
restrictions on read-write queries are implemented is pretty
confusing. Attached is a patch I wrote to try to improve things. It
centralizes three decisions that are currently made in different
places in a single place: (a) can this be run in a read only
transaction? (b) can it run in parallel mode? (c) can it run on a
standby? -- and along the way, it fixes the problems mentioned above
and tries to supply slightly improved comments. Perhaps we should
back-patch fixes at least for (1) even if this gets committed, but I
guess my first question is what people think of this approach to the
problem.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Вложения

Re: our checks for read-only queries are not great

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> I spent some time studying the question of how we classify queries as
> either read-only or not, and our various definitions of read-only, and
> found some bugs. ...
> However, I think that all of them can be tracked back to a more
> fundamental underlying cause, which is that the way that the various
> restrictions on read-write queries are implemented is pretty
> confusing. Attached is a patch I wrote to try to improve things.

Hmm.  I like the idea of deciding this in one place and insisting that
that one place have a switch case for every statement type.  That'll
address the root issue that people fail to think about this when adding
new statements.

I'm less enamored of some of the specific decisions here.  Notably

* I find COMMAND_IS_WEAKLY_READ_ONLY to be a more confusing concept
than what it replaces.  The test for LockStmt is an example --- the
comment talks about restricting locks during recovery, which is fine and
understandable, but then it's completely unobvious that the actual code
implements that behavior rather than some other one.

* ALTER SYSTEM SET is readonly?  Say what?  Even if that's how the current
code behaves, it's a damfool idea and we should change it.  I think that
the semantics we are really trying to implement for read-only is "has no
effects visible outside the current session" --- this explains, for
example, why copying into a temp table is OK.  ALTER SYSTEM SET certainly
isn't that.

I haven't read all of the code; those were just a couple points that
jumped out at me.

I think if we can sort out the notation for how the restrictions
are expressed, this'll be a good improvement.

            regards, tom lane



Re: our checks for read-only queries are not great

От
Robert Haas
Дата:
On Wed, Jan 8, 2020 at 2:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm.  I like the idea of deciding this in one place and insisting that
> that one place have a switch case for every statement type.  That'll
> address the root issue that people fail to think about this when adding
> new statements.

Right. Assuming they test their new command at least one time, they
should notice.  :-)

> I'm less enamored of some of the specific decisions here.  Notably
>
> * I find COMMAND_IS_WEAKLY_READ_ONLY to be a more confusing concept
> than what it replaces.  The test for LockStmt is an example --- the
> comment talks about restricting locks during recovery, which is fine and
> understandable, but then it's completely unobvious that the actual code
> implements that behavior rather than some other one.

Uh, suggestions?

> * ALTER SYSTEM SET is readonly?  Say what?  Even if that's how the current
> code behaves, it's a damfool idea and we should change it.  I think that
> the semantics we are really trying to implement for read-only is "has no
> effects visible outside the current session" --- this explains, for
> example, why copying into a temp table is OK.  ALTER SYSTEM SET certainly
> isn't that.

It would be extremely lame and a huge usability regression to
arbitrary restrict ALTER SYSTEM SET on standby nodes for no reason.
Editing the postgresql.auto.conf file works just fine there, and is a
totally sensible thing to want to do. You could argue for restricting
it in parallel mode just out of general paranoia, but I don't favor
that approach.

> I think if we can sort out the notation for how the restrictions
> are expressed, this'll be a good improvement.

Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: our checks for read-only queries are not great

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jan 8, 2020 at 2:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> * I find COMMAND_IS_WEAKLY_READ_ONLY to be a more confusing concept
>> than what it replaces.  The test for LockStmt is an example --- the
>> comment talks about restricting locks during recovery, which is fine and
>> understandable, but then it's completely unobvious that the actual code
>> implements that behavior rather than some other one.

> Uh, suggestions?

COMMAND_NOT_IN_RECOVERY, maybe?

>> * ALTER SYSTEM SET is readonly?  Say what?

> It would be extremely lame and a huge usability regression to
> arbitrary restrict ALTER SYSTEM SET on standby nodes for no reason.

I didn't say that it shouldn't be allowed on standby nodes.  I said
it shouldn't be allowed in transactions that have explicitly declared
themselves to be read-only.  Maybe we need to disaggregate those
concepts a bit more --- a refactoring such as this would be a fine
time to do that.

            regards, tom lane



Re: our checks for read-only queries are not great

От
Robert Haas
Дата:
On Wed, Jan 8, 2020 at 3:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Wed, Jan 8, 2020 at 2:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> * I find COMMAND_IS_WEAKLY_READ_ONLY to be a more confusing concept
> >> than what it replaces.  The test for LockStmt is an example --- the
> >> comment talks about restricting locks during recovery, which is fine and
> >> understandable, but then it's completely unobvious that the actual code
> >> implements that behavior rather than some other one.
>
> > Uh, suggestions?
>
> COMMAND_NOT_IN_RECOVERY, maybe?

Well, maybe I should just get rid of COMMAND_IS_WEAKLY_READ_ONLY and
return individual COMMAND_OK_IN_* flags for those cases.

> >> * ALTER SYSTEM SET is readonly?  Say what?
>
> > It would be extremely lame and a huge usability regression to
> > arbitrary restrict ALTER SYSTEM SET on standby nodes for no reason.
>
> I didn't say that it shouldn't be allowed on standby nodes.

Oh, OK. I guess I misunderstood.

> I said
> it shouldn't be allowed in transactions that have explicitly declared
> themselves to be read-only.  Maybe we need to disaggregate those
> concepts a bit more --- a refactoring such as this would be a fine
> time to do that.

Yeah, the current rules are pretty weird. Aside from ALTER SYSTEM ..
SET, read-only transaction seem to allow writes to temporary relations
and sequences, plus CLUSTER, REINDEX, VACUUM, PREPARE, ROLLBACK
PREPARED, and COMMIT PREPARED, all of which sound a lot like writes to
me. They also allow LISTEN and SET which are have transactional
behavior in general but for some reason don't feel they need to
respect the R/O property. I worry that if we start whacking these
behaviors around we'll get complaints, so I'm cautious about doing
that. At the least, we would need to have a real clear definition, and
if there is such a definition that covers the current cases, I can't
guess what it is. Forget ALTER SYSTEM for a minute -- why is it OK to
rewrite a table via CLUSTER in a R/O transaction, but not OK to do an
ALTER TABLE that changes the clustering index? Why is it not OK to
LISTEN on a standby (and presumably not get any notifications until a
promotion occurs) but OK to UNLISTEN? Whatever reasons may have
justified the current choice of behaviors are probably lost in the
sands of time; they are for sure unknown to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: our checks for read-only queries are not great

От
Stephen Frost
Дата:
Greetings,

(I'm also overall in favor of the direction this is going, so general +1
from me, and I took a quick look through the patch and didn't
particularly see anything I didn't like besides what's commented on
below.)

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Jan 8, 2020 at 3:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Robert Haas <robertmhaas@gmail.com> writes:
> > > On Wed, Jan 8, 2020 at 2:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >> * I find COMMAND_IS_WEAKLY_READ_ONLY to be a more confusing concept
> > >> than what it replaces.  The test for LockStmt is an example --- the
> > >> comment talks about restricting locks during recovery, which is fine and
> > >> understandable, but then it's completely unobvious that the actual code
> > >> implements that behavior rather than some other one.
> >
> > > Uh, suggestions?
> >
> > COMMAND_NOT_IN_RECOVERY, maybe?
>
> Well, maybe I should just get rid of COMMAND_IS_WEAKLY_READ_ONLY and
> return individual COMMAND_OK_IN_* flags for those cases.

Yeah, I don't like the WEAKLY_READ_ONLY idea either- explicitly having
COMMAND_OK_IN_X seems cleaner.

> > >> * ALTER SYSTEM SET is readonly?  Say what?
> >
> > > It would be extremely lame and a huge usability regression to
> > > arbitrary restrict ALTER SYSTEM SET on standby nodes for no reason.
> >
> > I didn't say that it shouldn't be allowed on standby nodes.
>
> Oh, OK. I guess I misunderstood.

I agree that we want ALTER SYSTEM SET to work on standbys, but it seems
there isn't really disagreement there.

> > I said
> > it shouldn't be allowed in transactions that have explicitly declared
> > themselves to be read-only.  Maybe we need to disaggregate those
> > concepts a bit more --- a refactoring such as this would be a fine
> > time to do that.
>
> Yeah, the current rules are pretty weird. Aside from ALTER SYSTEM ..
> SET, read-only transaction seem to allow writes to temporary relations
> and sequences, plus CLUSTER, REINDEX, VACUUM, PREPARE, ROLLBACK
> PREPARED, and COMMIT PREPARED, all of which sound a lot like writes to
> me. They also allow LISTEN and SET which are have transactional
> behavior in general but for some reason don't feel they need to
> respect the R/O property. I worry that if we start whacking these
> behaviors around we'll get complaints, so I'm cautious about doing
> that. At the least, we would need to have a real clear definition, and
> if there is such a definition that covers the current cases, I can't
> guess what it is. Forget ALTER SYSTEM for a minute -- why is it OK to
> rewrite a table via CLUSTER in a R/O transaction, but not OK to do an
> ALTER TABLE that changes the clustering index? Why is it not OK to
> LISTEN on a standby (and presumably not get any notifications until a
> promotion occurs) but OK to UNLISTEN? Whatever reasons may have
> justified the current choice of behaviors are probably lost in the
> sands of time; they are for sure unknown to me.

That a 'read-only' transaction can call CLUSTER is definitely bizarre to
me.  As relates to 'UN-SOMETHING', having those be allowed makes sense,
to me anyway, since connection poolers like to do those things and it
should be a no-op more-or-less by definition.  SET isn't changing data
blocks, so that also seems ok for a read-only transaction..  but, yeah,
there's no real great hard-and-fast-rule we've been following.

Would we be able to make a rule of "can't change on-disk stuff, except
for things in temporary schemas" and have it stick without a lot of
complaints?  Seems like that would address Tom's ALTER SYSTEM SET
concern, and would mean CLUSTER/REINDEX/VACUUM are disallowed in a
backwards-incompatible way (though I think I'm fine with that..), and
SET would still be allowed (which strikes me as correct too).  I'm not
quite sure how I feel about LISTEN, but that it could possibly actually
be used post-promotion and doesn't change on-disk stuff makes me feel
like it actually probably should be allowed.

Just looking at what was mentioned here- if there's other cases where
this idea falls flat then let's discuss them.

Thanks,

Stephen

Вложения

Re: our checks for read-only queries are not great

От
Robert Haas
Дата:
On Wed, Jan 8, 2020 at 5:57 PM Stephen Frost <sfrost@snowman.net> wrote:
> Yeah, I don't like the WEAKLY_READ_ONLY idea either- explicitly having
> COMMAND_OK_IN_X seems cleaner.

Done that way. v2 attached.

> Would we be able to make a rule of "can't change on-disk stuff, except
> for things in temporary schemas" and have it stick without a lot of
> complaints?  Seems like that would address Tom's ALTER SYSTEM SET
> concern, and would mean CLUSTER/REINDEX/VACUUM are disallowed in a
> backwards-incompatible way (though I think I'm fine with that..), and
> SET would still be allowed (which strikes me as correct too).  I'm not
> quite sure how I feel about LISTEN, but that it could possibly actually
> be used post-promotion and doesn't change on-disk stuff makes me feel
> like it actually probably should be allowed.

I think we can make any rule we like, but I think we should have some
measure of broad agreement on it. I'd like to go ahead with this for
now and then further changes can continue to be discussed and debated.
Hopefully we'll get a few more people to weigh in, too, because
deciding something like this based on what a handful of people doesn't
seem like a good idea to me.

I'd be really interested to hear if anyone knows the history behind
allowing CLUSTER, REINDEX, VACUUM, and some operations on temp tables.
It seems to have been that way for a long time. I wonder if it was a
deliberate choice or something that just happened semi-accidentally.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Вложения

Re: our checks for read-only queries are not great

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> I'd be really interested to hear if anyone knows the history behind
> allowing CLUSTER, REINDEX, VACUUM, and some operations on temp tables.
> It seems to have been that way for a long time. I wonder if it was a
> deliberate choice or something that just happened semi-accidentally.

Within a "read-only" xact you mean?  I believe that allowing DML writes
was intentional.  As for the utility commands, I suspect that it was in
part accidental (error of omission?), and then if anyone thought hard
about it they decided that allowing DML writes to temp tables justifies
those operations too.

Have you tried excavating in our git history to see when the relevant
permission tests originated?

            regards, tom lane



Re: our checks for read-only queries are not great

От
Robert Haas
Дата:
On Thu, Jan 9, 2020 at 2:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > I'd be really interested to hear if anyone knows the history behind
> > allowing CLUSTER, REINDEX, VACUUM, and some operations on temp tables.
> > It seems to have been that way for a long time. I wonder if it was a
> > deliberate choice or something that just happened semi-accidentally.
>
> Within a "read-only" xact you mean?  I believe that allowing DML writes
> was intentional.  As for the utility commands, I suspect that it was in
> part accidental (error of omission?), and then if anyone thought hard
> about it they decided that allowing DML writes to temp tables justifies
> those operations too.
>
> Have you tried excavating in our git history to see when the relevant
> permission tests originated?

check_xact_readonly() with a long list of command tags originated in
the same commit that added read-only transactions. CLUSTER, REINDEX,
and VACUUM weren't included in the list of prohibited operations then,
either, but it's unclear whether that was a deliberate omission or an
oversight. That commit also thought that COPY FROM - and queries -
should allow temp tables. But there's nothing in the commit that seems
to explain why, unless the commit message itself is a hint:

commit b65cd562402ed9d3206d501cc74dc38bc421b2ce
Author: Peter Eisentraut <peter_e@gmx.net>
Date:   Fri Jan 10 22:03:30 2003 +0000

    Read-only transactions, as defined in SQL.

Maybe the SQL standard has something to say about this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: our checks for read-only queries are not great

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> Maybe the SQL standard has something to say about this?

[ pokes around ... ]  Yeah, it does, and I'd say it's pretty clearly
in agreement with what Peter did, so far as DML ops go.  For instance,
this bit from SQL99's description of DELETE:

         1) If the access mode of the current SQL-transaction or the access
            mode of the branch of the current SQL-transaction at the current
            SQL-connection is read-only, and T is not a temporary table,
            then an exception condition is raised: invalid transaction state
            - read-only SQL-transaction.

UPDATE and INSERT say the same.  (I didn't look at later spec versions,
since Peter's 2003 commit was probably based on SQL99.)

You could argue about exactly how to extend that to non-spec
utility commands, but for the most part allowing them seems
to make sense if DML is allowed.

            regards, tom lane



Re: our checks for read-only queries are not great

От
Robert Haas
Дата:
On Thu, Jan 9, 2020 at 3:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > Maybe the SQL standard has something to say about this?
>
> [ pokes around ... ]  Yeah, it does, and I'd say it's pretty clearly
> in agreement with what Peter did, so far as DML ops go.  For instance,
> this bit from SQL99's description of DELETE:
>
>          1) If the access mode of the current SQL-transaction or the access
>             mode of the branch of the current SQL-transaction at the current
>             SQL-connection is read-only, and T is not a temporary table,
>             then an exception condition is raised: invalid transaction state
>             - read-only SQL-transaction.
>
> UPDATE and INSERT say the same.  (I didn't look at later spec versions,
> since Peter's 2003 commit was probably based on SQL99.)

OK. That's good to know.

> You could argue about exactly how to extend that to non-spec
> utility commands, but for the most part allowing them seems
> to make sense if DML is allowed.

But I think we allow them on all tables, not just temp tables, so I
don't think I understand this argument.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: our checks for read-only queries are not great

От
Robert Haas
Дата:
On Thu, Jan 9, 2020 at 3:37 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > You could argue about exactly how to extend that to non-spec
> > utility commands, but for the most part allowing them seems
> > to make sense if DML is allowed.
>
> But I think we allow them on all tables, not just temp tables, so I
> don't think I understand this argument.

Oh, wait: I'm conflating two things. The current behavior extends the
spec behavior to COPY in a logical way.

But it also allows CLUSTER, REINDEX, and VACUUM on any table. The spec
presumably has no view on that, nor does the passage you quoted seem
to apply here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: our checks for read-only queries are not great

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Jan 9, 2020 at 3:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> You could argue about exactly how to extend that to non-spec
>> utility commands, but for the most part allowing them seems
>> to make sense if DML is allowed.

> But I think we allow them on all tables, not just temp tables, so I
> don't think I understand this argument.

Oh, I misunderstood your concern.

Peter might remember more clearly, but I have a feeling that we
concluded that the intent of the spec was for read-only-ness to
disallow globally-visible changes in the visible database contents.
VACUUM, for example, does not cause any visible change, so it
should be admissible.  REINDEX ditto.  (We ignore here the possibility
of such things causing, say, a change in the order in which rows are
returned, since that's beneath the spec's notice to begin with.)
ANALYZE ditto, except to the extent that if you look at pg_stats
you might see something different --- but again, system catalog
contents are outside the spec's purview.

You could extend this line of argument, perhaps, far enough to justify
ALTER SYSTEM SET as well.  But I don't like that because some GUCs have
visible effects on the results that an ordinary query minding its own
business can get.  Timezone is perhaps the poster child there, or
search_path.  If we were to subdivide the GUCs into "affects
implementation details only" vs "can affect query semantics",
I'd hold still for allowing ALTER SYSTEM SET on the former group.
Doubt it's worth the trouble to distinguish, though.

            regards, tom lane



Re: our checks for read-only queries are not great

От
Peter Eisentraut
Дата:
On 2020-01-09 21:52, Tom Lane wrote:
> Peter might remember more clearly, but I have a feeling that we
> concluded that the intent of the spec was for read-only-ness to
> disallow globally-visible changes in the visible database contents.

I don't really remember, but that was basically the opinion I had 
arrived at as I was reading through this current thread.  Roughly 
speaking, anything that changes the database state (data or schema) in a 
way that would be reflected in a pg_dump output is not read-only.

> VACUUM, for example, does not cause any visible change, so it
> should be admissible.  REINDEX ditto.  (We ignore here the possibility
> of such things causing, say, a change in the order in which rows are
> returned, since that's beneath the spec's notice to begin with.)

agreed

> ANALYZE ditto, except to the extent that if you look at pg_stats
> you might see something different --- but again, system catalog
> contents are outside the spec's purview.

agreed

> You could extend this line of argument, perhaps, far enough to justify
> ALTER SYSTEM SET as well.  But I don't like that because some GUCs have
> visible effects on the results that an ordinary query minding its own
> business can get.  Timezone is perhaps the poster child there, or
> search_path.  If we were to subdivide the GUCs into "affects
> implementation details only" vs "can affect query semantics",
> I'd hold still for allowing ALTER SYSTEM SET on the former group.
> Doubt it's worth the trouble to distinguish, though.

ALTER SYSTEM is read only in my mind.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: our checks for read-only queries are not great

От
Robert Haas
Дата:
On Fri, Jan 10, 2020 at 7:23 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> I don't really remember, but that was basically the opinion I had
> arrived at as I was reading through this current thread.  Roughly
> speaking, anything that changes the database state (data or schema) in a
> way that would be reflected in a pg_dump output is not read-only.

This rule very nearly matches the current behavior: it explains why
temp table operations are allowed, and why ALTER SYSTEM is allowed,
and why REINDEX etc. are allowed. However, there's a notable
exception: PREPARE, COMMIT PREPARED, and ROLLBACK PREPARED are allowed
in a read-only transaction. Under the "doesn't change pg_dump output"
criteria, the first and third ones should be permitted but COMMIT
PREPARED should be denied, except maybe if the prepared transaction
didn't do any writes (and in that case, why did we bother preparing
it?). Despite that, this rule does a way better job explaining the
current behavior than anything else suggested so far.

Accordingly, here's v3, with comments adjusted to match this new
explanation for the current behavior. This seems way better than what
I had before, because it actually explains why stuff is the way it is
rather than just appealing to history.

BTW, there's a pending patch that allows CLUSTER to change the
tablespace of an object while rewriting it. If we want to be strict
about it, that variant would need to be disallowed in read only mode,
under this definition. (I also think that it's lousy syntax and ought
to be spelled ALTER TABLE x SET TABLESPACE foo, CLUSTER or something
like that rather than anything beginning with CLUSTER, but I seem to
be losing that argument.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Вложения

Re: our checks for read-only queries are not great

От
Tom Lane
Дата:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> I don't really remember, but that was basically the opinion I had 
> arrived at as I was reading through this current thread.  Roughly 
> speaking, anything that changes the database state (data or schema) in a 
> way that would be reflected in a pg_dump output is not read-only.

OK, although I'd put some emphasis on "roughly speaking".

> ALTER SYSTEM is read only in my mind.

I'm still having trouble with this conclusion.  I think it can only
be justified by a very narrow reading of "reflected in pg_dump"
that relies on the specific factorization we've chosen for upgrade
operations, ie that postgresql.conf mods have to be carried across
by hand.  But that's mostly historical baggage, rather than a sane
basis for defining "read only".  If somebody comes up with a patch
that causes "pg_dumpall -g" to include ALTER SYSTEM SET commands for
whatever is in postgresql.auto.conf (not an unreasonable idea BTW),
will you then decide that ALTER SYSTEM SET is no longer read-only?
Or, perhaps, reject such a patch on the grounds that it breaks this
arbitrary definition of read-only-ness?

As another example, do we need to consider that replacing pg_hba.conf
via pg_write_file should be allowed in a "read only" transaction?

These conclusions seem obviously silly to me, but perhaps YMMV.

            regards, tom lane



Re: our checks for read-only queries are not great

От
Robert Haas
Дата:
On Fri, Jan 10, 2020 at 9:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If somebody comes up with a patch
> that causes "pg_dumpall -g" to include ALTER SYSTEM SET commands for
> whatever is in postgresql.auto.conf (not an unreasonable idea BTW),
> will you then decide that ALTER SYSTEM SET is no longer read-only?
> Or, perhaps, reject such a patch on the grounds that it breaks this
> arbitrary definition of read-only-ness?

I would vote to reject such a patch as a confused muddle. I mean,
generally, the expectation right now is that if you move your data
from the current cluster to a new one by pg_dump, pg_upgrade, or even
by promoting a standby, you're responsible for making sure that
postgresql.conf and postgresql.auto.conf get copied over separately.
In the last case, the backup that created the standby will have copied
the postgresql.conf from the master as it existed at that time, but
propagating any subsequent changes is up to you. Now, if we now decide
to shove ALTER SYSTEM SET commands into pg_dumpall output, then
suddenly you're changing that rule, and it's not very clear what the
new rule is.

Now, our current approach is fairly arguable. Given that GUCs on
databases, users, functions, etc. are stored in the catalogs and
subject to backup, restore, replication, etc., one might well expect
that global settings would be handled the same way. I tend to think
that would be nicer, though it would require solving the problem of
how to back out bad changes that make the database not start up.
Regardless of what you or anybody thinks about that, though, it's not
how it works today and would require some serious engineering if we
wanted to make it happen.

> As another example, do we need to consider that replacing pg_hba.conf
> via pg_write_file should be allowed in a "read only" transaction?

I don't really see what the problem with that is. It bothers me a lot
more that CLUSTER can be run in a read-only transaction -- which
actually changes stuff inside the database, even if not necessarily in
a user-visible way -- than it does that somebody might be able to use
the database to change something that isn't really part of the
database anyway. And pg_hba.conf, like postgresql.conf, is largely
treated as an input to the database rather than part of it.

Somebody could create a user-defined function that launches a
satellite into orbit and that is, I would argue, a write operation in
the truest sense. You have changed the state of the world in a lasting
way, and you cannot take it back. But, it's not writing to the
database, so as far as read-only transactions are concerned, who
cares?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: our checks for read-only queries are not great

От
Stephen Frost
Дата:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Fri, Jan 10, 2020 at 9:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > If somebody comes up with a patch
> > that causes "pg_dumpall -g" to include ALTER SYSTEM SET commands for
> > whatever is in postgresql.auto.conf (not an unreasonable idea BTW),
> > will you then decide that ALTER SYSTEM SET is no longer read-only?
> > Or, perhaps, reject such a patch on the grounds that it breaks this
> > arbitrary definition of read-only-ness?
>
> I would vote to reject such a patch as a confused muddle. I mean,
> generally, the expectation right now is that if you move your data
> from the current cluster to a new one by pg_dump, pg_upgrade, or even
> by promoting a standby, you're responsible for making sure that
> postgresql.conf and postgresql.auto.conf get copied over separately.

I really don't like that the ALTER SYSTEM SET/postgresql.auto.conf stuff
has to be handled in some way external to logical export/import, or
external to pg_upgrade (particularly in link mode), so I generally see
where Tom's coming from with that suggestion.

In general, I don't think people should be expected to hand-muck around
with anything in the data directory.

> In the last case, the backup that created the standby will have copied
> the postgresql.conf from the master as it existed at that time, but
> propagating any subsequent changes is up to you. Now, if we now decide
> to shove ALTER SYSTEM SET commands into pg_dumpall output, then
> suddenly you're changing that rule, and it's not very clear what the
> new rule is.

I'd like a rule of "users don't muck with the data directory", and we
are nearly there when you include sensible packaging such as what Debian
provides- by moving postrgesql.conf, log files, etc, outside of the data
directory.  For things that can't be moved out of the data directory
though, like postgresql.auto.conf, we should be handling those
transparently to the user.

I agree that there are some interesting cases to consider here though-
like doing a pg_dumpall against a standby resulting in something
different than if you did it against the primary because the
postgresql.auto.conf is different between them (something that I'm
generally supportive of having, and it seems everyone else is too).  I
think having an option to control if the postgresql.auto.conf settings
are included or not in the pg_dumpall output would be a reasonable way
to deal with that though.

> Now, our current approach is fairly arguable. Given that GUCs on
> databases, users, functions, etc. are stored in the catalogs and
> subject to backup, restore, replication, etc., one might well expect
> that global settings would be handled the same way. I tend to think
> that would be nicer, though it would require solving the problem of
> how to back out bad changes that make the database not start up.
> Regardless of what you or anybody thinks about that, though, it's not
> how it works today and would require some serious engineering if we
> wanted to make it happen.

This sounds an awful lot like the arguments that I tried to make when
ALTER SYSTEM SET was first going in, but what's done is done and there's
not much to do but make the best of it as I can't imagine there'd be
much support for ripping it out.

I don't really agree about the need for 'some serious engineering'
either, but having an option for it, sure.

I do also tend to agree with Tom about making ALTER SYSTEM SET be
prohibited in explicitly read-only transactions, but still allowing it
to be run against replicas as that's a handy thing to be able to do.

> > As another example, do we need to consider that replacing pg_hba.conf
> > via pg_write_file should be allowed in a "read only" transaction?
>
> I don't really see what the problem with that is. It bothers me a lot
> more that CLUSTER can be run in a read-only transaction -- which
> actually changes stuff inside the database, even if not necessarily in
> a user-visible way -- than it does that somebody might be able to use
> the database to change something that isn't really part of the
> database anyway. And pg_hba.conf, like postgresql.conf, is largely
> treated as an input to the database rather than part of it.

I don't like that CLUSTER can be run in a read-only transaction either
(though it seems like downthread maybe some people are fine with
that..).  I'm also coming around to the idea that pg_write_file()
probably shouldn't be allowed either, and probably not COPY TO either
(except to stdout, since that's a result, not a change operation).

> Somebody could create a user-defined function that launches a
> satellite into orbit and that is, I would argue, a write operation in
> the truest sense. You have changed the state of the world in a lasting
> way, and you cannot take it back. But, it's not writing to the
> database, so as far as read-only transactions are concerned, who
> cares?

I suppose there's another thing to think about in this discussion, which
are FDWs, if the idea is that read-only means "I don't want to make
changes in THIS database".  I don't really feel like that's what marking
a transaction as 'read only' is intended to mean though.

When I think of starting a read-only transaction, I feel like it's
usually with the idea of "I want to play it safe and I don't want this
transaction to make ANY changes".  I'm feeling more inclined that we
should be going out of our way to make darn sure that we respect that
request of the user, no matter what it is they're running.  We can't
prevent user-created C-level functions from launching satellites, but
that's an untrusted language and therefore it's up to the function
author to manage the transaction and privilege system properly anyway,
not ours.

Thanks,

Stephen

Вложения

Re: our checks for read-only queries are not great

От
Laurenz Albe
Дата:
On Fri, 2020-01-10 at 09:29 -0500, Tom Lane wrote:
> > ALTER SYSTEM is read only in my mind.
> 
> I'm still having trouble with this conclusion.  I think it can only
> be justified by a very narrow reading of "reflected in pg_dump"
> that relies on the specific factorization we've chosen for upgrade
> operations, ie that postgresql.conf mods have to be carried across
> by hand.  But that's mostly historical baggage, rather than a sane
> basis for defining "read only".  If somebody comes up with a patch
> that causes "pg_dumpall -g" to include ALTER SYSTEM SET commands for
> whatever is in postgresql.auto.conf (not an unreasonable idea BTW),
> will you then decide that ALTER SYSTEM SET is no longer read-only?

I think that having ALTER SYSTEM commands in pg_dumpall output
would be a problem.  It would cause all kinds of problems whenever
parameters change.  Thinking of the transition "checkpoint_segments"
-> "max_wal_size", you'd have to build some translation magic into pg_dump.
Besides, such a feature would make it harder to restore a dump taken
with version x into version x + n for n > 0.

> Or, perhaps, reject such a patch on the grounds that it breaks this
> arbitrary definition of read-only-ness?

I agree with Robert that such a patch should be rejected on other
grounds.

Concerning the topic of the thread, I personally have come to think
that changing GUCs is *not* writing to the database.  But that is based
on the fact that you can change GUCs on streaming replication standbys,
and it may be surprising to a newcomer.

Perhaps it would be good to consider this question:
Do we call something "read-only" if it changes nothing, or do we call it
"read-only" if it is allowed on a streaming replication standby?
The first would be more correct, but the second may be more convenient.

Yours,
Laurenz Albe




Re: our checks for read-only queries are not great

От
Tom Lane
Дата:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> Perhaps it would be good to consider this question:
> Do we call something "read-only" if it changes nothing, or do we call it
> "read-only" if it is allowed on a streaming replication standby?
> The first would be more correct, but the second may be more convenient.

Yeah, this is really the larger point at stake.  I'm not sure that
"read-only" and "allowed on standby" should be identical, nor even
that one should be an exact subset of the other.  They're certainly
by-and-large the same sets of operations, but there might be
exceptions that belong to only one set.  "read-only" is driven by
(some reading of) the SQL standard, while "allowed on standby" is
driven by implementation limitations, so I think it'd be dangerous
to commit ourselves to those being identical.

            regards, tom lane



Re: our checks for read-only queries are not great

От
Peter Eisentraut
Дата:
On 2020-01-10 14:41, Robert Haas wrote:
> This rule very nearly matches the current behavior: it explains why
> temp table operations are allowed, and why ALTER SYSTEM is allowed,
> and why REINDEX etc. are allowed. However, there's a notable
> exception: PREPARE, COMMIT PREPARED, and ROLLBACK PREPARED are allowed
> in a read-only transaction. Under the "doesn't change pg_dump output"
> criteria, the first and third ones should be permitted but COMMIT
> PREPARED should be denied, except maybe if the prepared transaction
> didn't do any writes (and in that case, why did we bother preparing
> it?). Despite that, this rule does a way better job explaining the
> current behavior than anything else suggested so far.

I don't follow.  Does pg_dump dump prepared transactions?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: our checks for read-only queries are not great

От
Stephen Frost
Дата:
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> On Fri, 2020-01-10 at 09:29 -0500, Tom Lane wrote:
> > > ALTER SYSTEM is read only in my mind.
> >
> > I'm still having trouble with this conclusion.  I think it can only
> > be justified by a very narrow reading of "reflected in pg_dump"
> > that relies on the specific factorization we've chosen for upgrade
> > operations, ie that postgresql.conf mods have to be carried across
> > by hand.  But that's mostly historical baggage, rather than a sane
> > basis for defining "read only".  If somebody comes up with a patch
> > that causes "pg_dumpall -g" to include ALTER SYSTEM SET commands for
> > whatever is in postgresql.auto.conf (not an unreasonable idea BTW),
> > will you then decide that ALTER SYSTEM SET is no longer read-only?
>
> I think that having ALTER SYSTEM commands in pg_dumpall output
> would be a problem.  It would cause all kinds of problems whenever
> parameters change.  Thinking of the transition "checkpoint_segments"
> -> "max_wal_size", you'd have to build some translation magic into pg_dump.
> Besides, such a feature would make it harder to restore a dump taken
> with version x into version x + n for n > 0.

pg_dump already specifically has understanding of how to deal with old
options in other things when constructing a dump for a given version-
and we already have issues that a dump taken with pg_dump X has a good
chance of now being able to be restoreding into a PG X+1, that's why
it's recommended to use the pg_dump for the version of PG you're
intending to restore into, so I don't particularly agree with any of the
arguments presented above.

> > Or, perhaps, reject such a patch on the grounds that it breaks this
> > arbitrary definition of read-only-ness?
>
> I agree with Robert that such a patch should be rejected on other
> grounds.
>
> Concerning the topic of the thread, I personally have come to think
> that changing GUCs is *not* writing to the database.  But that is based
> on the fact that you can change GUCs on streaming replication standbys,
> and it may be surprising to a newcomer.
>
> Perhaps it would be good to consider this question:
> Do we call something "read-only" if it changes nothing, or do we call it
> "read-only" if it is allowed on a streaming replication standby?
> The first would be more correct, but the second may be more convenient.

The two are distinct from each other and one doesn't imply the other.  I
don't think we need to, or really want to, force them to be the same.

When we're talking about a "read-only" transaction that the user has
specifically asked be "read-only" then, imv anyway, we should be pretty
stringent regarding what that's allowed to do and shouldn't be allowing
that to change state in the system which other processes will see after
the transaction is over.

A transaction (on a primary or a replica) doesn't need to be started as
explicitly "read-only" and perhaps we should change the language when we
are starting up to say "database is ready to accept replica connections"
or something instead of "read-only" connections to clarify that.

Thanks,

Stephen

Вложения

Re: our checks for read-only queries are not great

От
Robert Haas
Дата:
On Mon, Jan 13, 2020 at 5:57 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 2020-01-10 14:41, Robert Haas wrote:
> > This rule very nearly matches the current behavior: it explains why
> > temp table operations are allowed, and why ALTER SYSTEM is allowed,
> > and why REINDEX etc. are allowed. However, there's a notable
> > exception: PREPARE, COMMIT PREPARED, and ROLLBACK PREPARED are allowed
> > in a read-only transaction. Under the "doesn't change pg_dump output"
> > criteria, the first and third ones should be permitted but COMMIT
> > PREPARED should be denied, except maybe if the prepared transaction
> > didn't do any writes (and in that case, why did we bother preparing
> > it?). Despite that, this rule does a way better job explaining the
> > current behavior than anything else suggested so far.
>
> I don't follow.  Does pg_dump dump prepared transactions?

No, but committing one changes the database contents as seen by a
subsequent pg_dump.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: our checks for read-only queries are not great

От
Laurenz Albe
Дата:
On Mon, 2020-01-13 at 13:56 -0500, Stephen Frost wrote:
> > I think that having ALTER SYSTEM commands in pg_dumpall output
> > would be a problem.  It would cause all kinds of problems whenever
> > parameters change.  Thinking of the transition "checkpoint_segments"
> > -> "max_wal_size", you'd have to build some translation magic into pg_dump.
> > Besides, such a feature would make it harder to restore a dump taken
> > with version x into version x + n for n > 0.
> 
> pg_dump already specifically has understanding of how to deal with old
> options in other things when constructing a dump for a given version-
> and we already have issues that a dump taken with pg_dump X has a good
> chance of now being able to be restoreding into a PG X+1, that's why
> it's recommended to use the pg_dump for the version of PG you're
> intending to restore into, so I don't particularly agree with any of the
> arguments presented above.

Right.
But increasing the difficulty of restoring a version x pg_dump with
version x + 1 is still not a thing we should lightly do.

Not that the docs currently say "it is recommended to use pg_dumpall
from the newer version".  They don't say "is is not supported".

Yours,
Laurenz Albe




Re: our checks for read-only queries are not great

От
Stephen Frost
Дата:
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> On Mon, 2020-01-13 at 13:56 -0500, Stephen Frost wrote:
> > > I think that having ALTER SYSTEM commands in pg_dumpall output
> > > would be a problem.  It would cause all kinds of problems whenever
> > > parameters change.  Thinking of the transition "checkpoint_segments"
> > > -> "max_wal_size", you'd have to build some translation magic into pg_dump.
> > > Besides, such a feature would make it harder to restore a dump taken
> > > with version x into version x + n for n > 0.
> >
> > pg_dump already specifically has understanding of how to deal with old
> > options in other things when constructing a dump for a given version-
> > and we already have issues that a dump taken with pg_dump X has a good
> > chance of now being able to be restoreding into a PG X+1, that's why
> > it's recommended to use the pg_dump for the version of PG you're
> > intending to restore into, so I don't particularly agree with any of the
> > arguments presented above.
>
> Right.
> But increasing the difficulty of restoring a version x pg_dump with
> version x + 1 is still not a thing we should lightly do.

I've never heard that and I don't agree with it being a justification
for blocking sensible progress.

> Not that the docs currently say "it is recommended to use pg_dumpall
> from the newer version".  They don't say "is is not supported".

It's recommended due to exactly the reasons presented and no one is
saying that such isn't supported- but we don't and aren't going to
guarantee that it's going to work.  We absolutely know of cases where it
just won't work, today.  If that's justification for saying it's not
supported, then fine, let's change the language to say that.

Thanks,

Stephen

Вложения

Re: our checks for read-only queries are not great

От
Robert Haas
Дата:
On Mon, Jan 13, 2020 at 3:00 PM Stephen Frost <sfrost@snowman.net> wrote:
> I've never heard that and I don't agree with it being a justification
> for blocking sensible progress.

Speaking of sensible progress, I think we've drifted off on a tangent
here about ALTER SYSTEM. As I understand it, nobody's opposed to the
most recent version (v3) of the proposed patch, which also makes no
definitional changes relative to the status quo, but does fix some
bugs, and makes things a little nicer for parallel query, too. So I'd
like to go ahead and commit that.

Discussing of what to do about ALTER SYSTEM can continue, although I
feel perhaps the current discussion isn't particularly productive.  On
the one hand, the argument that it isn't read only because it writes
data someplace doesn't convince me: practically every command can
cause some kind of write some place, e.g. SELECT can write WAL for at
least 2 different reasons, and that does not make it not read-only,
nor does the fact that it updates the table statistics. The question
of what data is being written must be relevant. On the other hand, I'm
unpersuaded by the arguments so far that including ALTER SYSTEM
commands in pg_dump output would be anything other than a train wreck,
though doing it optionally and not by default might be OK. However,
the main thing for me is that messing around with the behavior of
ALTER SYSTEM in either of those ways or some other is not what this
patch is about. I'm just proposing to refactor the code to fix the
existing bugs and make it much less likely that future patches will
create new ones, and I think reclassifying or redesigning ALTER SYSTEM
ought to be done, if at all, separately.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: our checks for read-only queries are not great

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> Speaking of sensible progress, I think we've drifted off on a tangent
> here about ALTER SYSTEM.

Agreed, that's not terribly relevant for the proposed patch.

            regards, tom lane



Re: our checks for read-only queries are not great

От
Stephen Frost
Дата:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > Speaking of sensible progress, I think we've drifted off on a tangent
> > here about ALTER SYSTEM.
>
> Agreed, that's not terribly relevant for the proposed patch.

I agree that the proposed patch seems alright by itself, as the changes
it's making to existing behavior seem to all be bug-fixes and pretty
clear improvements not really related to 'read-only' transactions.

It's unfortunate that we haven't been able to work through to some kind
of agreement around what "SET TRANSACTION READ ONLY" means, so that
users of it can know what to expect.

Thanks,

Stephen

Вложения

Re: our checks for read-only queries are not great

От
Peter Eisentraut
Дата:
On 2020-01-13 20:14, Robert Haas wrote:
> On Mon, Jan 13, 2020 at 5:57 AM Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> On 2020-01-10 14:41, Robert Haas wrote:
>>> This rule very nearly matches the current behavior: it explains why
>>> temp table operations are allowed, and why ALTER SYSTEM is allowed,
>>> and why REINDEX etc. are allowed. However, there's a notable
>>> exception: PREPARE, COMMIT PREPARED, and ROLLBACK PREPARED are allowed
>>> in a read-only transaction. Under the "doesn't change pg_dump output"
>>> criteria, the first and third ones should be permitted but COMMIT
>>> PREPARED should be denied, except maybe if the prepared transaction
>>> didn't do any writes (and in that case, why did we bother preparing
>>> it?). Despite that, this rule does a way better job explaining the
>>> current behavior than anything else suggested so far.
>>
>> I don't follow.  Does pg_dump dump prepared transactions?
> 
> No, but committing one changes the database contents as seen by a
> subsequent pg_dump.

Well, if the transaction was declared read-only, then committing it 
(directly or 2PC) shouldn't change anything.  This appears to be a 
circular argument.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: our checks for read-only queries are not great

От
Robert Haas
Дата:
On Wed, Jan 15, 2020 at 10:25 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> Well, if the transaction was declared read-only, then committing it
> (directly or 2PC) shouldn't change anything.  This appears to be a
> circular argument.

I don't think it's a circular argument. Suppose that someone decrees
that, as of 5pm Eastern time, no more read-write transactions are
permitted. And because the person issuing the decree has a lot of
power, everybody obeys. Now, every pg_dump taken after that time will
be semantically equivalent to every other pg_dump taken after that
time, with one tiny exception. That exception is that someone could
still do COMMIT PREPARED of a read-write transaction that was prepared
before 5pm. If the goal of the powerful person who issued the decree
was to make sure that the database couldn't change - e.g. so they
could COPY each table individually without keeping a snapshot open and
still get a consistent backup - they might fail to achieve it if, as
of the moment of the freeze, there are some prepared write
transactions.

I'm not saying we have to change the behavior or anything. I'm just
saying that there seems to be one, and only one, way to make the
apparent contents of the database change in a read-only transaction
right now. And that's a COMMIT PREPARED of a read-write transaction.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: our checks for read-only queries are not great

От
Stephen Frost
Дата:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Jan 15, 2020 at 10:25 AM Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
> > Well, if the transaction was declared read-only, then committing it
> > (directly or 2PC) shouldn't change anything.  This appears to be a
> > circular argument.
>
> I don't think it's a circular argument. Suppose that someone decrees
> that, as of 5pm Eastern time, no more read-write transactions are
> permitted. And because the person issuing the decree has a lot of
> power, everybody obeys. Now, every pg_dump taken after that time will
> be semantically equivalent to every other pg_dump taken after that
> time, with one tiny exception. That exception is that someone could
> still do COMMIT PREPARED of a read-write transaction that was prepared
> before 5pm. If the goal of the powerful person who issued the decree
> was to make sure that the database couldn't change - e.g. so they
> could COPY each table individually without keeping a snapshot open and
> still get a consistent backup - they might fail to achieve it if, as
> of the moment of the freeze, there are some prepared write
> transactions.
>
> I'm not saying we have to change the behavior or anything. I'm just
> saying that there seems to be one, and only one, way to make the
> apparent contents of the database change in a read-only transaction
> right now. And that's a COMMIT PREPARED of a read-write transaction.

Yeah, allowing a read-only transaction to start and then commit a
read-write transaction doesn't seem sensible.  I'd be in favor of
changing that.

Thanks,

Stephen

Вложения

Re: our checks for read-only queries are not great

От
Robert Haas
Дата:
On Tue, Jan 14, 2020 at 1:46 PM Stephen Frost <sfrost@snowman.net> wrote:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> > Robert Haas <robertmhaas@gmail.com> writes:
> > > Speaking of sensible progress, I think we've drifted off on a tangent
> > > here about ALTER SYSTEM.
> >
> > Agreed, that's not terribly relevant for the proposed patch.
>
> I agree that the proposed patch seems alright by itself, as the changes
> it's making to existing behavior seem to all be bug-fixes and pretty
> clear improvements not really related to 'read-only' transactions.

There seems to be no disagreement on this point, so I have committed the patch.

> It's unfortunate that we haven't been able to work through to some kind
> of agreement around what "SET TRANSACTION READ ONLY" means, so that
> users of it can know what to expect.

I at least feel like we have a pretty good handle on what it was
intended to mean; that is, "doesn't cause semantically significant
changes to pg_dump output." I do hear some skepticism as to whether
that's the best definition, but it has pretty good explanatory power
relative to the current state of the code, which is something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: our checks for read-only queries are not great

От
Stephen Frost
Дата:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Tue, Jan 14, 2020 at 1:46 PM Stephen Frost <sfrost@snowman.net> wrote:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> > > Robert Haas <robertmhaas@gmail.com> writes:
> > > > Speaking of sensible progress, I think we've drifted off on a tangent
> > > > here about ALTER SYSTEM.
> > >
> > > Agreed, that's not terribly relevant for the proposed patch.
> >
> > I agree that the proposed patch seems alright by itself, as the changes
> > it's making to existing behavior seem to all be bug-fixes and pretty
> > clear improvements not really related to 'read-only' transactions.
>
> There seems to be no disagreement on this point, so I have committed the patch.

Works for me.

> > It's unfortunate that we haven't been able to work through to some kind
> > of agreement around what "SET TRANSACTION READ ONLY" means, so that
> > users of it can know what to expect.
>
> I at least feel like we have a pretty good handle on what it was
> intended to mean; that is, "doesn't cause semantically significant
> changes to pg_dump output." I do hear some skepticism as to whether
> that's the best definition, but it has pretty good explanatory power
> relative to the current state of the code, which is something.

I think I agree with you regarding the original intent, though even
there, as discussed elsewhere, it seems like there's perhaps either a
bug or a disagreement about the specifics of what that means when it
relates to committing a 2-phase transaction.  Still, setting that aside
for the moment, do we feel like this is enough to be able to update our
documentation with?

Thanks,

Stephen

Вложения

Re: our checks for read-only queries are not great

От
Robert Haas
Дата:
On Thu, Jan 16, 2020 at 12:22 PM Stephen Frost <sfrost@snowman.net> wrote:
> I think I agree with you regarding the original intent, though even
> there, as discussed elsewhere, it seems like there's perhaps either a
> bug or a disagreement about the specifics of what that means when it
> relates to committing a 2-phase transaction.  Still, setting that aside
> for the moment, do we feel like this is enough to be able to update our
> documentation with?

I think that would be possible. What did you have in mind?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: our checks for read-only queries are not great

От
Bruce Momjian
Дата:
On Mon, Jan 13, 2020 at 01:56:30PM -0500, Stephen Frost wrote:
> > I think that having ALTER SYSTEM commands in pg_dumpall output
> > would be a problem.  It would cause all kinds of problems whenever
> > parameters change.  Thinking of the transition "checkpoint_segments"
> > -> "max_wal_size", you'd have to build some translation magic into pg_dump.
> > Besides, such a feature would make it harder to restore a dump taken
> > with version x into version x + n for n > 0.
> 
> pg_dump already specifically has understanding of how to deal with old
> options in other things when constructing a dump for a given version-
> and we already have issues that a dump taken with pg_dump X has a good
> chance of now being able to be restoreding into a PG X+1, that's why
> it's recommended to use the pg_dump for the version of PG you're
> intending to restore into, so I don't particularly agree with any of the
> arguments presented above.

One issue is that system table GUC settings (e.g., per-database,
per-user) cannot include postgresql.conf-only settings, like
max_wal_size, so system tables GUC settings are less likely to be
renamed than postgresql.conf.auto settings.  FYI, we are more inclined
to allow postgresql.conf-only changes than others because there is less
impact on applications.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +



Re: our checks for read-only queries are not great

От
Stephen Frost
Дата:
Greetings,

* Bruce Momjian (bruce@momjian.us) wrote:
> On Mon, Jan 13, 2020 at 01:56:30PM -0500, Stephen Frost wrote:
> > > I think that having ALTER SYSTEM commands in pg_dumpall output
> > > would be a problem.  It would cause all kinds of problems whenever
> > > parameters change.  Thinking of the transition "checkpoint_segments"
> > > -> "max_wal_size", you'd have to build some translation magic into pg_dump.
> > > Besides, such a feature would make it harder to restore a dump taken
> > > with version x into version x + n for n > 0.
> >
> > pg_dump already specifically has understanding of how to deal with old
> > options in other things when constructing a dump for a given version-
> > and we already have issues that a dump taken with pg_dump X has a good
> > chance of now being able to be restoreding into a PG X+1, that's why
> > it's recommended to use the pg_dump for the version of PG you're
> > intending to restore into, so I don't particularly agree with any of the
> > arguments presented above.
>
> One issue is that system table GUC settings (e.g., per-database,
> per-user) cannot include postgresql.conf-only settings, like
> max_wal_size, so system tables GUC settings are less likely to be
> renamed than postgresql.conf.auto settings.  FYI, we are more inclined
> to allow postgresql.conf-only changes than others because there is less
> impact on applications.

I'm a bit unclear about what's being suggested here.  When you are
talking about 'applications', are you referring specifically to pg_dump
and pg_restore, or are you talking about regular user applications?

If you're referring to pg_dump/restore, then what I'm understanding from
your comments is that if we made pg_dump/restore aware of ALTER SYSTEM
and were made to support it that we would then be less inclined to
change the names of postgresql.conf-only settings because, if we do so,
we have to update pg_dump/restore.

I can see some argument in that direction but my initial reaction is
that I don't feel like the bar would really be moved very far, and, if
we came up with some mapping from one to the other for those, I actually
think it'd be really helpful downstream for packagers and such who
routinely are dealing with updating from an older postgresql.conf file
to a newer one when an upgrade is done.

Thanks,

Stephen

Вложения

Re: our checks for read-only queries are not great

От
Bruce Momjian
Дата:
On Mon, Jan 27, 2020 at 02:26:42PM -0500, Stephen Frost wrote:
> Greetings,
> 
> * Bruce Momjian (bruce@momjian.us) wrote:
> > On Mon, Jan 13, 2020 at 01:56:30PM -0500, Stephen Frost wrote:
> > > > I think that having ALTER SYSTEM commands in pg_dumpall output
> > > > would be a problem.  It would cause all kinds of problems whenever
> > > > parameters change.  Thinking of the transition "checkpoint_segments"
> > > > -> "max_wal_size", you'd have to build some translation magic into pg_dump.
> > > > Besides, such a feature would make it harder to restore a dump taken
> > > > with version x into version x + n for n > 0.
> > > 
> > > pg_dump already specifically has understanding of how to deal with old
> > > options in other things when constructing a dump for a given version-
> > > and we already have issues that a dump taken with pg_dump X has a good
> > > chance of now being able to be restoreding into a PG X+1, that's why
> > > it's recommended to use the pg_dump for the version of PG you're
> > > intending to restore into, so I don't particularly agree with any of the
> > > arguments presented above.
> > 
> > One issue is that system table GUC settings (e.g., per-database,
> > per-user) cannot include postgresql.conf-only settings, like
> > max_wal_size, so system tables GUC settings are less likely to be
> > renamed than postgresql.conf.auto settings.  FYI, we are more inclined
> > to allow postgresql.conf-only changes than others because there is less
> > impact on applications.
> 
> I'm a bit unclear about what's being suggested here.  When you are
> talking about 'applications', are you referring specifically to pg_dump
> and pg_restore, or are you talking about regular user applications?

Sorry for the late reply.  I meant all applications.

> If you're referring to pg_dump/restore, then what I'm understanding from
> your comments is that if we made pg_dump/restore aware of ALTER SYSTEM
> and were made to support it that we would then be less inclined to
> change the names of postgresql.conf-only settings because, if we do so,
> we have to update pg_dump/restore.
> 
> I can see some argument in that direction but my initial reaction is
> that I don't feel like the bar would really be moved very far, and, if
> we came up with some mapping from one to the other for those, I actually
> think it'd be really helpful downstream for packagers and such who
> routinely are dealing with updating from an older postgresql.conf file
> to a newer one when an upgrade is done.

I should have given more examples.  Changing GUC variables like
search_path or work_mem, which can be set in per-database, per-user, and
per-session contexts, is more disruptive than changing GUCs that can
only can be set in postgresql.conf, like max_wal_size.  My point is that
all GUC changes don't have the same level of disruption.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +