Discussion: Add a new table for Transaction Isolation?

Add a new table for Transaction Isolation?

From

"David G. Johnston"

Date:

14 April 2015, 06:00:48

http://www.postgresql.org/docs/9.4/static/transaction-iso.html

Table 13-1 shows the SQL standard isolation levels and what is and is not guaranteed. Then the text goes on to explain how our implementation differs from that table. Is there any opposition to actually adding a similar table, 13-2, probably right after the paragraph, with the same columns, three rows, and the corresponding possible/not-possible cell values?

David J.

Re: Add a new table for Transaction Isolation?

From

Bruce Momjian

Date:

16 April 2015, 04:21:36

On Mon, Apr 13, 2015 at 08:00:38PM -0700, David G. Johnston wrote:
> http://www.postgresql.org/docs/9.4/static/transaction-iso.html
>
> Table 13-1 shows the SQL standard isolation levels and what is and is not
> guaranteed.  Then the text goes on to explain how our implementation differs
> from that table.  Is there any opposition to actually adding a similar table,
> 13-2, probably right after the paragraph, with the same columns, three rows,
> and the corresponding possible/not-possible cell values?

Yes, it does make sense to have a table that properly matches the
Postgres implementation.   Should I write a patch or would you like to?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Re: Add a new table for Transaction Isolation?

From

"David G. Johnston"

Date:

16 April 2015, 04:26:05

On Wednesday, April 15, 2015, Bruce Momjian <bruce@momjian.us> wrote:
> On Mon, Apr 13, 2015 at 08:00:38PM -0700, David G. Johnston wrote:
>> http://www.postgresql.org/docs/9.4/static/transaction-iso.html
>>
>> Table 13-1 shows the SQL standard isolation levels and what is and is not
>> guaranteed. Then the text goes on to explain how our implementation differs
>> from that table. Is there any opposition to actually adding a similar table,
>> 13-2, probably right after the paragraph, with the same columns, three rows,
>> and the corresponding possible/not-possible cell values?
>
> Yes, it does make sense to have a table that properly matches the
> Postgres implementation. Should I write a patch or would you like to?

I'll take a crack at it.

David J.

Re: Add a new table for Transaction Isolation?

From

"David G. Johnston"

Date:

18 April 2015, 02:36:46

A bit of scope creep, because I wanted to point out the obvious observation from the table: RR and SER look the same there. The main body for SER covers this fact as well, though in a very technical way.

I thought pointing out that examples are on the Wiki would be useful as well - not everyone would think to go there for additional information. No link though - just a pointer to it or the Internet generally.

It is not obvious to me what <table tocentry="1"> means...I suspect 1=yes...

David J.

On Wed, Apr 15, 2015 at 6:26 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
> On Wednesday, April 15, 2015, Bruce Momjian <bruce@momjian.us> wrote:
>> On Mon, Apr 13, 2015 at 08:00:38PM -0700, David G. Johnston wrote:
>>> http://www.postgresql.org/docs/9.4/static/transaction-iso.html
>>>
>>> Table 13-1 shows the SQL standard isolation levels and what is and is not
>>> guaranteed. Then the text goes on to explain how our implementation differs
>>> from that table. Is there any opposition to actually adding a similar table,
>>> 13-2, probably right after the paragraph, with the same columns, three rows,
>>> and the corresponding possible/not-possible cell values?
>>
>> Yes, it does make sense to have a table that properly matches the
>> Postgres implementation. Should I write a patch or would you like to?
>
> I'll take a crack at it.
>
> David J.

Attachments

mvcc-isolationlevels-v1.diff

Re: Add a new table for Transaction Isolation?

From

Peter Eisentraut

Date:

24 April 2015, 19:57:21

On 4/17/15 7:36 PM, David G. Johnston wrote:
> diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
> index f88b16e..5002138 100644
> --- a/doc/src/sgml/mvcc.sgml
> +++ b/doc/src/sgml/mvcc.sgml
> @@ -100,6 +100,14 @@
>      phenomena caused by interactions?)
>     </para>
>
> +  <para>
> +   The concepts covered in this section are
> +   presented without examples of the behaviors described.  The internet,
> +   including and espcially the <productname>PostgreSQL</productname> Wiki, is
> +   an excellent resource to learn more about circumstances under which these
> +   data phenomena occur, and what the results look like when they do.
> +  </para>
> +

I don't think our documentation should go out of its way to say, "our
documentation is bad, look elsewhere".  If we think examples are
necessary, then we should add some.  Otherwise, it's implied that
improvement is always possible.

>     <para>
>      The phenomena which are prohibited at various levels are:
>
> @@ -150,12 +158,12 @@
>      <indexterm>
>       <primary>transaction isolation level</primary>
>      </indexterm>
> -    The four transaction isolation levels and the corresponding
> -    behaviors are described in <xref linkend="mvcc-isolevel-table">.
> +    The four SQL transaction isolation levels, and their corresponding
> +    behaviors, are described in <xref linkend="mvcc-isolevel-table">.
>     </para>

I don't think this change is good.

>
>      <table tocentry="1" id="mvcc-isolevel-table">
> -     <title>Standard <acronym>SQL</acronym> Transaction Isolation Levels</title>
> +     <title><acronym>SQL</acronym> Standard Transaction Isolation Levels</title>
>       <tgroup cols="4">
>        <thead>
>         <row>

Why this change?

> @@ -256,6 +264,89 @@
>     </para>
>
>     <para>
> +    The three <productname>PostgreSQL</productname> transaction isolation levels, and their corresponding
> +    behaviors, are described in <xref linkend="mvcc-pgsql-isolevel-table">.
> +   </para>

This isn't really correct.  The PostgreSQL isolation levels were
described in the paragraph above.  The table is really just a summary of
the previous explanation.

> +   <para>
> +    As the table makes clear there is no difference in the potential phenomena
> +    at the REPEATABLE READ and SERIALIZABLE transaction isolation levels; but
> +    the phenomena listed only pertain to the data seen by the transaction.

Please adapt the existing spelling and capitalization.

> +    The difference is that REPEATABLE READ will only serial-fail

This term "serial-fail" would need further explanation.

> +    if two transactions attempt to modify the same record while SERIALIZABLE will
> +    also serial-fail if one transaction modifies a record that another transaction
> +    has only read.
> +   </para>

I don't think this new table adds clarity.  Users should generally have
their applications use the appropriate standard isolation level.  Adding
another table that says, some of these are not actually different,
followed by text that says they are different in other ways, just
repeats the point that was made earlier and will be explained in more
detail in the following subsections.

The real difference, in my mind, is that the SQL standard defines four
levels in terms of three criteria, but PostgreSQL really has four
criteria and only three different levels implemented.  It might be worth
visualizing that somehow.

Note that when that section was initially written, the fourth criterion
(serializability) wasn't implemented.
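
The "four names, three levels, four criteria" point can be made concrete with a small Python sketch. It assumes the PostgreSQL behavior described in this thread (Read Uncommitted behaves like Read Committed; Repeatable Read shows none of the three standard phenomena). The key `serialization_anomaly` is only a placeholder name for the fourth criterion, which the thread has not yet named.

```python
# True means the phenomenon can occur at that level in PostgreSQL,
# according to the behavior described in this thread.
# "serialization_anomaly" is a placeholder name for the fourth criterion.
PG_BEHAVIOUR = {
    "read uncommitted": {"dirty_read": False, "nonrepeatable_read": True,
                         "phantom_read": True, "serialization_anomaly": True},
    "read committed":   {"dirty_read": False, "nonrepeatable_read": True,
                         "phantom_read": True, "serialization_anomaly": True},
    "repeatable read":  {"dirty_read": False, "nonrepeatable_read": False,
                         "phantom_read": False, "serialization_anomaly": True},
    "serializable":     {"dirty_read": False, "nonrepeatable_read": False,
                         "phantom_read": False, "serialization_anomaly": False},
}

THREE_CRITERIA = ("dirty_read", "nonrepeatable_read", "phantom_read")
FOUR_CRITERIA = THREE_CRITERIA + ("serialization_anomaly",)


def distinct_levels(table, criteria):
    """Group level names whose behaviour is identical on the given criteria."""
    groups = {}
    for level, row in table.items():
        key = tuple(row[c] for c in criteria)
        groups.setdefault(key, []).append(level)
    return list(groups.values())


# With only the standard's three criteria, two groups are distinguishable.
print(distinct_levels(PG_BEHAVIOUR, THREE_CRITERIA))
# With the fourth criterion, the three implemented levels separate.
print(distinct_levels(PG_BEHAVIOUR, FOUR_CRITERIA))
```

On the three standard criteria alone, Repeatable Read and Serializable fall into the same group; only the fourth column tells them apart.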

Re: Add a new table for Transaction Isolation?

From

"David G. Johnston"

Date:

24 April 2015, 21:01:32

On Fri, Apr 24, 2015 at 9:57 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

> On 4/17/15 7:36 PM, David G. Johnston wrote:
>> diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
>> index f88b16e..5002138 100644
>> --- a/doc/src/sgml/mvcc.sgml
>> +++ b/doc/src/sgml/mvcc.sgml
>> @@ -100,6 +100,14 @@
>> phenomena caused by interactions?)
>> </para>
>>
>> + <para>
>> + The concepts covered in this section are
>> + presented without examples of the behaviors described. The internet,
>> + including and espcially the <productname>PostgreSQL</productname> Wiki, is
>> + an excellent resource to learn more about circumstances under which these
>> + data phenomena occur, and what the results look like when they do.
>> + </para>
>> +
>
> I don't think our documentation should go out of its way to say, "our
> documentation is bad, look elsewhere". If we think examples are
> necessary, then we should add some. Otherwise, it's implied that
> improvement is always possible.

I'm not - I am explicitly listing the assumptions the documentation makes regarding reader experience (and ease of documenting) - and pointing out where the reader can go if their experience is lacking in those areas. It seems unproductive to move all of the SSI content on our Wiki into the documentation and so, lacking such, we should point out where else content can be found.

>> <para>
>> The phenomena which are prohibited at various levels are:
>>
>> @@ -150,12 +158,12 @@
>> <indexterm>
>> <primary>transaction isolation level</primary>
>> </indexterm>
>> - The four transaction isolation levels and the corresponding
>> - behaviors are described in <xref linkend="mvcc-isolevel-table">.
>> + The four SQL transaction isolation levels, and their corresponding
>> + behaviors, are described in <xref linkend="mvcc-isolevel-table">.
>> </para>
>
> I don't think this change is good.

I think it reads cleaner but not so much as to argue it.

>>
>> <table tocentry="1" id="mvcc-isolevel-table">
>> - <title>Standard <acronym>SQL</acronym> Transaction Isolation Levels</title>
>> + <title><acronym>SQL</acronym> Standard Transaction Isolation Levels</title>
>> <tgroup cols="4">
>> <thead>
>> <row>
>
> Why this change?

The new table reads "PostgreSQL ..." and the corresponding noun is the "SQL Standard". Writing "Standard SQL" can be read as implying the existence of "Non-Standard SQL..." which is not correct. Just saying "SQL..." seems to be too generic - though after reading the conclusion and pondering that "SQL Standard" could also imply "SQL Non-Standard..." I'm not so sure whether just saying SQL wouldn't be best. "Here are the four words that can be used with SET TRANSACTION ISOLATION LEVEL..." - and then show/describe the minimum required non-behaviors and the non-behaviors as implemented in PostgreSQL.

Maybe possessive would work "PostgreSQL's ..." and "SQL Standard's..."

>> @@ -256,6 +264,89 @@
>> </para>
>>
>> <para>
>> + The three <productname>PostgreSQL</productname> transaction isolation levels, and their corresponding
>> + behaviors, are described in <xref linkend="mvcc-pgsql-isolevel-table">.
>> + </para>
>
> This isn't really correct. The PostgreSQL isolation levels were
> described in the paragraph above. The table is really just a summary of
> the previous explanation.

"[...], are summarized in <xref...>" ?

>> + <para>
>> + As the table makes clear there is no difference in the potential phenomena
>> + at the REPEATABLE READ and SERIALIZABLE transaction isolation levels; but
>> + the phenomena listed only pertain to the data seen by the transaction.
>
> Please adapt the existing spelling and capitalization.
>
>> + The difference is that REPEATABLE READ will only serial-fail
>
> This term "serial-fail" would need further explanation.

I will probably stick with the more verbose "serialization failure"...does that require explaining here, or a xref?

>> + if two transactions attempt to modify the same record while SERIALIZABLE will
>> + also serial-fail if one transaction modifies a record that another transaction
>> + has only read.
>> + </para>
>
> I don't think this new table adds clarity. Users should generally have
> their applications use the appropriate standard isolation level.

Then why not write the entire section relative to the standard and point out the differences between the standard and our implementation on the command definition page in the compatibility section?

http://www.postgresql.org/docs/devel/static/sql-set-transaction.html

> Adding
> another table that says, some of these are not actually different,
> followed by text that says they are different in other ways, just
> repeats the point that was made earlier and will be explained in more
> detail in the following subsections.
>
> The real difference, in my mind, is that the SQL standard defines four
> levels in terms of three criteria, but PostgreSQL really has four
> criteria and only three different levels implemented. It might be worth
> visualizing that somehow.

Well, I would at least add "Read uncommitted" to the PostgreSQL table and have it set up the same as "Read committed". We do implement all four - by name.

And, as to the prior point, visualizing the differences seems best accomplished in a compatibility section and likely will just confuse the issue here - if indeed the expectation is that users will define their requirements relative to the standard and not relative to our implementation.

Otherwise, a summary table describing our implementation seems like a self-evident need. We are already going to great lengths to describe everything in the table anyway and we already are using a table to describe the standard's definitions. Placing said table here seems easiest and if summarizing what is already present in the text somehow makes the section more confusing I posit that it must already be confusing without the table. At least this way the confusing stuff is summarized and is readily available for lookup by those who know what they are looking for.

For clarity - what is the fourth criterion that you are thinking of? More specifically, how would you name it so that it could be a table column?

Two separate patches here:

1) pointing out that additional information is available on the wiki and the internet

2) summarizing the PostgreSQL implementation into a table similar to that already present for the Standard

#2 can be implemented in the MVCC section or a more extensive patch can also update the SQL command SET TRANSACTION section - which will mean someone feels strongly enough that the status quo is better than updating MVCC while waiting for someone to write the more invasive patch.

David J.

Re: Add a new table for Transaction Isolation?

From

Kevin Grittner

Date:

24 April 2015, 23:40:51

David G. Johnston <david.g.johnston@gmail.com> wrote:
> On Fri, Apr 24, 2015 at 9:57 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> On 4/17/15 7:36 PM, David G. Johnston wrote:

>>> +  <para>
>>> +   The concepts covered in this section are
>>> +   presented without examples of the behaviors described.  The internet,
>>> +   including and espcially the <productname>PostgreSQL</productname> Wiki, is
>>> +   an excellent resource to learn more about circumstances under which these
>>> +   data phenomena occur, and what the results look like when they do.
>>> +  </para>
>>
>> I don't think our documentation should go out of its way to say, "our
>> documentation is bad, look elsewhere".  If we think examples are
>> necessary, then we should add some.  Otherwise, it's implied that
>> improvement is always possible.
>
> I'm not - I am explicitly listing the assumptions the
> documentation makes regarding reader experience (and ease of
> documenting) - and pointing out were the reader can go if their
> experience is lacking in those areas.  It seems unproductive to
> move all of the SSI content on our Wiki into the documentation and
> so, lacking such, we should point out where else content can be
> found.

There have been suggestions before that some or all of the Wiki's
SSI page be brought into the docs, or that the docs reference it.
Bringing all of it in does seem like quite a lot for a single
feature like this.  I'm not sure what the best course is.

>>>      <table tocentry="1" id="mvcc-isolevel-table">
>>> -     <title>Standard <acronym>SQL</acronym> Transaction Isolation Levels</title>
>>> +     <title><acronym>SQL</acronym> Standard Transaction Isolation Levels</title>
>>>       <tgroup cols="4">
>>>        <thead>
>>>         <row>
>>
>> Why this change?
>
> The new table reads "PostgreSQL ..." and the corresponding noun is
> the "SQL Standard".  Writing "Standard SQL" can be read as implying
> the existence of "Non-Standard SQL..." which is not correct.  Just
> saying "SQL..." seems to be too generic - though after reading the
> conclusion and pondering that "SQL Standard" could also imply "SQL
> Non-Standard..." I'm not so sure whether just saying SQL wouldn't
> be best.  "Here are the four words that can be used with SET
> TRANSACTION ISOLATION LEVEL..." - and then show/describe the
> minimum required non-behaviors and the non-behaviors as implemented
> in PostgreSQL.
>
> Maybe possessive would work "PostgreSQL's ..." and "SQL Standard's..."

Personally I think that "standard SQL" means "SQL, as defined by
international standards documents."  I see no benefit to changes
along the lines suggested here.

>>>     <para>
>>> +    The three <productname>PostgreSQL</productname> transaction isolation levels, and their corresponding
>>> +    behaviors, are described in <xref linkend="mvcc-pgsql-isolevel-table">.
>>> +   </para>
>>
>> This isn't really correct.  The PostgreSQL isolation levels were
>> described in the paragraph above.  The table is really just a summary of
>> the previous explanation.
>
> "[...], are summarized in <xref...>" ?

The problem with tables like this is that sometimes people just
look at the table and assume that it is the *definition* of the
isolation levels.  At *no* point did *any* version of the SQL
standard *ever* define the serializable transaction isolation level
in terms of the phenomena shown in the table.  The definition has
always been:

| The execution of concurrent SQL-transactions at isolation level
| SERIALIZABLE is guaranteed to be serializable. A serializable
| execution is defined to be an execution of the operations of
| concurrently executing SQL-transactions that produces the same
| effect as some serial execution of those same SQL-transactions. A
| serial execution is one in which each SQL-transaction executes to
| completion before the next SQL-transaction begins.
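
A minimal Python sketch of this definition, under simplifying assumptions: "same effect" is reduced to "same final state", transactions are modeled as functions from the data they read to the writes they make, and the concurrent run is modeled as every transaction reading one shared starting snapshot. The on-call scenario and all names are hypothetical illustrations, not taken from the standard or the PostgreSQL docs.

```python
from itertools import permutations

# Hypothetical starting state: two doctors, both on call.
INITIAL = {"alice_on_call": True, "bob_on_call": True}


def take_alice_off(snapshot):
    """Writes of a transaction that takes Alice off call only if the
    data it reads says Bob is still on call."""
    return {"alice_on_call": False} if snapshot["bob_on_call"] else {}


def take_bob_off(snapshot):
    """Mirror image: takes Bob off call only if Alice is still on call."""
    return {"bob_on_call": False} if snapshot["alice_on_call"] else {}


def run_serially(transactions, state):
    """Each transaction executes to completion before the next begins."""
    state = dict(state)
    for txn in transactions:
        state.update(txn(state))
    return state


def run_on_shared_snapshot(transactions, state):
    """Simplified concurrent run: every transaction reads the same
    starting snapshot, then all writes are applied."""
    result = dict(state)
    for txn in transactions:
        result.update(txn(state))
    return result


def is_serializable(transactions, state, outcome):
    """True if some serial order of the transactions yields `outcome`."""
    return any(run_serially(order, state) == outcome
               for order in permutations(transactions))


txns = [take_alice_off, take_bob_off]
concurrent = run_on_shared_snapshot(txns, INITIAL)
print(concurrent)                                  # both doctors end up off call
print(is_serializable(txns, INITIAL, concurrent))  # no serial order gives this
```

Neither transaction performs a dirty read, a nonrepeatable read, or a phantom read in this model, yet the combined outcome matches no serial order.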

Serializable transactions have been included in the table of which
phenomena are allowed to occur at which isolation levels; but the
table has always been followed by this note:

| The exclusion of these phenomena for SQL-transactions executing
| at isolation level SERIALIZABLE is a consequence of the
| requirement that such transactions be serializable.

Yet so many people have not looked beyond the table to see the
actual definition of "serializable" in the standard that the
absence of these three phenomena has often been mistakenly
considered adequate for compliance with the standard.  A 1995 paper
titled "A Critique of ANSI SQL Isolation Levels" by Berenson, et
al, notes this, saying:

| Subclause 4.28, “SQL-transactions”, in [ANSI] notes that the
| SERIALIZABLE isolation level must provide what is “commonly known
| as fully serializable execution.” The  prominence of the table
| compared to this extra proviso leads to a common misconception
| that disallowing the three phenomena implies serializability.

... and later observes:

| It would have been simpler [...] to drop [references to phantom
| reads] and just use Subclause 4.28 to define ANSI SERIALIZABLE.

I tend to agree.  Not only would it have been simpler, I think it
would have prevented a lot of misunderstanding of the requirements
of the standard.  Tables like this can do a lot more to promote
confusion and misunderstanding than clarity.  If we're going to
make a change here, I think rather than doubling down on the
standard's questionable inclusion of such a table by providing
*two* tables, we should consider removing the existing table.

> Then why not sure write the entire section relative to the
> standard and point out the differences between the standard and our
> implementation on the command definition page in the compatibility
> section?

Many people don't have access to the standard, the standard is
confusing to many, and the standard is specifically written to
specify minimum required behaviors rather than anything that is
dependent on implementation.  The standard does not say that the
READ UNCOMMITTED transaction isolation level allows other
transactions to see the uncommitted work of a transaction; it
merely says that no other transaction isolation level may do so.
The same is true with all the phenomena -- our implementation does
not "differ" from the standard on those points; it is in full
compliance with it.

> Otherwise, a summary table describing our implementation seems
> like a self-evident need.  We are already going to great lengths to
> describe everything in the table anyway and we already are using a
> table to describe the standard's definitions.  Placing said table
> here seems easiest and if summarizing what is already present in
> the text somehow makes the section more confusing I posit that it
> must already be confusing without the table.  At least this way the
> confusing stuff is summarized and is readily available for lookup
> by those who know what they are looking for.

But the table, by its nature, does not provide the full set of
information, and too many people just look at the table because
"it's easy".  The question seems to me to be whether providing an
easy way to get an inaccurate understanding of the topic has value;
I submit that the confusion caused by the table in the standard (in
spite of a note immediately after the table to try to prevent that)
shows that it is not.

> Two separate patches here:
>
> 1) pointing out that additional information is available on the
> wiki and the internet

That and/or bringing in one or more of the Wiki examples.

> 2) summarizing the PostgreSQL implementation into a table similar
> to that already present for the Standard
>
> #2 can be implemented in the MVCC section or a more extensive patch
> can also update the SQL command SET TRANSACTION section - which
> will mean someone feels strongly enough that the status quo is
> better than updating MVCC while waiting for someone to write the
> more invasive patch.

And, for reasons given above, I really question whether such a
table doesn't do more harm than good.  Even those citing the paper
by Berenson, et al., often miss the text in *that* paper about what
the actual definition of serializable transactions in the standard
is, and instead focus on the quick-to-read tables of how the
misinterpretation of serializable transactions based on the
standard's table of phenomena (which the paper dubs "ANOMALY
SERIALIZABLE") differs from truly serializable behavior.

People do love tables like this, which makes providing them
tempting; but when a short, clean table is available they often
seem less inclined to take the trouble to read the real information
the table summarizes -- and they come away with distorted and
incorrect ideas about the subject matter.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add a new table for Transaction Isolation?

From

Bruce Momjian

Date:

25 April 2015, 21:02:57

On Fri, Apr 24, 2015 at 08:40:40PM +0000, Kevin Grittner wrote:
> And, for reasons given above, I really question whether such a
> table doesn't do more harm than good.  Even those citing the paper
> by Berenson, et al., often miss the text in *that* paper about what
> the actual definition of serializable transactions in the standard
> is, and instead focus on the quick-to-read tables of how the
> misinterpretation of serializable transactions based on the
> standard's table of phenomena (which the paper dubs "ANOMALY
> SERIALIZABLE") differs from truly serializable behavior.
>
> People do love tables like this, which makes providing them
> tempting; but when a short, clean table is available they often
> seem less inclined to take the trouble to read the real information
> the table summarizes -- and they come away with distorted and
> incorrect ideas about the subject matter.

I don't think we can abandon the table --- people have enough trouble
figuring this out, including me, and without the table, it will be even
harder.

What I have done is to add two rows and one column to the table, and
changed the surrounding text to more clearly reference the table.  You
can see the output here, and the SGML patch is attached:

    http://momjian.us/expire/transaction-iso.html

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Attachments

isolation.diff

Re: Add a new table for Transaction Isolation?

From

"David G. Johnston"

Date:

25 April 2015, 21:33:45

On Sat, Apr 25, 2015 at 11:02 AM, Bruce Momjian <bruce@momjian.us> wrote:

> On Fri, Apr 24, 2015 at 08:40:40PM +0000, Kevin Grittner wrote:
>> And, for reasons given above, I really question whether such a
>> table doesn't do more harm than good. Even those citing the paper
>> by Berenson, et al., often miss the text in *that* paper about what
>> the actual definition of serializable transactions in the standard
>> is, and instead focus on the quick-to-read tables of how the
>> misinterpretation of serializable transactions based on the
>> standard's table of phenomena (which the paper dubs "ANOMALY
>> SERIALIZABLE") differs from truly serializable behavior.
>>
>> People do love tables like this, which makes providing them
>> tempting; but when a short, clean table is available they often
>> seem less inclined to take the trouble to read the real information
>> the table summarizes -- and they come away with distorted and
>> incorrect ideas about the subject matter.
>
> I don't think we can abandon the table --- people have enough trouble
> figuring this out, including me, and without the table, it will be even
> harder.
>
> What I have done is to add two rows and one column to the table, and
> changed the surrounding text to more clearly reference the table. You
> can see the output here, and the SGML patch is attached:
>
> http://momjian.us/expire/transaction-iso.html

Need to add "Serialization Anomalies" to the previous section's definitions list.

Pondering whether something like: "Possible (not in PG)" and avoiding the additional rows would make reading the table easier.

David J.

Re: Add a new table for Transaction Isolation?

From

Bruce Momjian

Date:

25 April 2015, 22:16:31

On Sat, Apr 25, 2015 at 11:33:36AM -0700, David G. Johnston wrote:
> On Sat, Apr 25, 2015 at 11:02 AM, Bruce Momjian <bruce@momjian.us> wrote:
>
>     On Fri, Apr 24, 2015 at 08:40:40PM +0000, Kevin Grittner wrote:
>     > And, for reasons given above, I really question whether such a
>     > table doesn't do more harm than good.  Even those citing the paper
>     > by Berenson, et al., often miss the text in *that* paper about what
>     > the actual definition of serializable transactions in the standard
>     > is, and instead focus on the quick-to-read tables of how the
>     > misinterpretation of serializable transactions based on the
>     > standard's table of phenomena (which the paper dubs "ANOMALY
>     > SERIALIZABLE") differs from truly serializable behavior.
>     >
>     > People do love tables like this, which makes providing them
>     > tempting; but when a short, clean table is available they often
>     > seem less inclined to take the trouble to read the real information
>     > the table summarizes -- and they come away with distorted and
>     > incorrect ideas about the subject matter.
>
>     I don't think we can abandon the table --- people have enough trouble
>     figuring this out, including me, and without the table, it will be even
>     harder.
>
>     What I have done is to add two rows and one column to the table, and
>     changed the surrounding text to more clearly reference the table.  You
>     can see the output here, and the SGML patch is attached:
>
>             http://momjian.us/expire/transaction-iso.html
>
>
> Need to add "Serialization Anomalies" to the previous section's definitions
> list.

Uh, I am afraid the problem is that "Serialization Anomalies" is kind of
defined by the standard in an odd way that is specific to serializable
mode, I think.  Kevin, is that true?

> Pondering whether something like: "Possible (not in PG)" and avoiding the
> additional rows would make reading the table easier.

Uh, that's an idea.  I thought visually having two separate lines was
cleaner.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Re: Add a new table for Transaction Isolation?

From

Kevin Grittner

Date:

25 April 2015, 22:45:47

Bruce Momjian <bruce@momjian.us> wrote:
> On Sat, Apr 25, 2015 at 11:33:36AM -0700, David G. Johnston wrote:

>> Need to add "Serialization Anomalies" to the previous section's
>> definitions list.
>
> Uh, I am afraid the problem is that "Serialization Anomalies" is
> kind of defined by the standard in an odd way that is specific to
> serializable mode, I think.  Kevin, is that true?

They never use the word anomaly (or its plural) in the standard
(even though it is prevalent in the academic literature).  See my
earlier email for examples of how the standard describes the issue,
but basically it just boils down to saying that the effects of
concurrent execution of a set of serializable transactions must be
consistent with some one-at-a-time execution order.  We could
perhaps have the column header say "Non-Serializable Behavior" or
some such; but I think we need to define whatever term we use for
the new column header.

>> Pondering whether something like: "Possible (not in PG)" and
>> avoiding the additional rows would make reading the table
>> easier.
>
> Uh, that's an idea.  I thought visually having two separate lines
> was cleaner.

I think one row per transaction isolation level, with three
possible values per cell, would be the cleanest.  I have been
trying to think of alternatives for the three values, but have not
come up with anything better than David's suggestion.
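
As a rough Python sketch of the one-row-per-level layout, the three cell values can be derived from two yes/no facts: what the standard permits and what PostgreSQL actually exhibits. The cell wording follows David's suggestion, the PostgreSQL values follow the behavior described in this thread, and the helper names are hypothetical. The proposed fourth column is left out because its name is still under discussion.

```python
PHENOMENA = ("dirty read", "nonrepeatable read", "phantom read")

# True means the phenomenon may occur.
# STANDARD: what the SQL standard permits at each level.
# POSTGRES: the behavior described in this thread.
STANDARD = {
    "read uncommitted": (True, True, True),
    "read committed": (False, True, True),
    "repeatable read": (False, False, True),
    "serializable": (False, False, False),
}
POSTGRES = {
    "read uncommitted": (False, True, True),
    "read committed": (False, True, True),
    "repeatable read": (False, False, False),
    "serializable": (False, False, False),
}


def cell(allowed_by_standard, occurs_in_pg):
    """Pick one of the three cell values for a level/phenomenon pair."""
    if occurs_in_pg:
        return "Possible"
    if allowed_by_standard:
        return "Possible (not in PG)"
    return "Not possible"


def merged_table():
    """One row per isolation level, one three-valued cell per phenomenon."""
    return {
        level: tuple(cell(s, p) for s, p in zip(STANDARD[level], POSTGRES[level]))
        for level in STANDARD
    }


for level, cells in merged_table().items():
    print(f"{level:<17}", " | ".join(cells))
```

Only two cells take the middle value here: dirty reads at Read Uncommitted and phantom reads at Repeatable Read.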

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add a new table for Transaction Isolation?

From

Bruce Momjian

Date:

25 April 2015, 23:03:05

On Sat, Apr 25, 2015 at 07:45:35PM +0000, Kevin Grittner wrote:
> They never use the word anomaly (or its plural) in the standard
> (even though it is prevalent in the academic literature).  See my
> earlier email for examples of how the standard describes the issue,
> but basically it just boils down to saying that the effects of
> concurrent execution of a set of serializable transactions must be
> consistent with some one-at-a-time execution order.  We could
> perhaps have the column header say "Non-Serializable Behavior" or
> some such; but I think we need to define whatever term we use for
> the new column header.

I don't think we can define the column as a negative, e.g. "Non-".

> >> Pondering whether something like: "Possible (not in PG)" and
> >> avoiding the additional rows would make reading the table
> >> easier.
> >
> > Uh, that's an idea.  I thought visually having two separate lines
> > was cleaner.
>
> I think one row per transaction isolation level, with three
> possible values per cell, would be the cleanest.  I have been
> trying to think of alternatives for the three values, but have not
> come up with anything better than David's suggestion.

Well, then "Possible" would refer to the SQL standard behavior, which
seems kind of an odd thing to emphasize there.  The field really needs
to be "SQL-standard possible, PostgreSQL not possible", but that is too
long.  This is why I split it into separate lines.  We could try
"Possible (SQL standard), Not possible (PostgreSQL)".

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Re: Add a new table for Transaction Isolation?

From

Kevin Grittner

Date:

April 25, 2015, 23:47:55

Bruce Momjian <bruce@momjian.us> wrote:
> On Sat, Apr 25, 2015 at 07:45:35PM +0000, Kevin Grittner wrote:

>> We could perhaps have the column header say "Non-Serializable
>> Behavior" or some such; but I think we need to define whatever
>> term we use for the new column header.
>
> I don't think we can define the column as a negative, e.g.
> "Non-".

Yeah, that would tend to add to confusion.  The basic issue is
whether there are any one-at-a-time orders of execution that could
yield the same results, or whether there is a cycle in an attempt
to graph such an order.  "Cycles in Apparent Order of Execution"
would be accurate, but it's kinda long, and possibly too arcane.
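
As a rough illustration of the "cycle" idea in the paragraph above, here is a minimal Python sketch. It assumes a hypothetical dependency graph in which an edge A -> B means transaction A must appear to have run before transaction B; the transaction names and the `has_cycle` helper are illustrative only and are not taken from PostgreSQL.

```python
from typing import Dict, List


def has_cycle(must_precede: Dict[str, List[str]]) -> bool:
    """Return True if the 'appears to run before' graph contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    nodes = set(must_precede)
    for targets in must_precede.values():
        nodes.update(targets)
    color = {n: WHITE for n in nodes}

    def visit(node: str) -> bool:
        color[node] = GRAY
        for nxt in must_precede.get(node, []):
            if color[nxt] == GRAY:  # back edge: we returned to the current path
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in sorted(nodes))


# Hypothetical write skew: T1 read data that T2 later changed, and vice versa,
# so each transaction must appear to run before the other.
write_skew = {"T1": ["T2"], "T2": ["T1"]}
# Hypothetical serializable case: T1, then T2, then T3 is a valid order.
serial_ok = {"T1": ["T2"], "T2": ["T3"]}

print(has_cycle(write_skew))  # True
print(has_cycle(serial_ok))   # False
```

If the graph has no cycle, any topological ordering of it is a one-at-a-time execution order that yields the same results.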

>>>> Pondering whether something like: "Possible (not in PG)" and
>>>> avoiding the additional rows would make reading the table
>>>> easier.
>>>
>>> Uh, that's an idea.  I thought visually having two separate
>>> lines was cleaner.
>>
>> I think one row per transaction isolation level, with three
>> possible values per cell, would be the cleanest.  I have been
>> trying to think of alternatives for the three values, but have
>> not come up with anything better than David's suggestion.
>
> Well, then "Possible" would refer to the SQL standard behavior,
> which seems kind of an odd thing to emphasize there.  The field
> really needs to be "SQL-standard possible, PostgreSQL not
> possible", but that is too long.  This is why I split it into
> separate lines.  We could try "Possible (SQL standard), Not
> possible (PostgreSQL)".

Yeah, I was searching for some wording that conveyed that the
standard *allowed* an implementation to present such phenomena at
the isolation level versus whether the PostgreSQL implementation
could *actually* present such phenomena.  In struggling to come up
with an analogy, the best I can do is that it's like each person
fishing for rainbow trout in Wisconsin is *allowed* to keep it if
it is at least 26 inches long; some people will do so, and some
catch and release.  Regulations say that it is possible to keep it
(and not be in violation of the rules), but you are not required to
keep it.  For REPEATABLE READ, the SQL standard says that any
product would be *allowed* to have phantom reads, but is not
*required* to; we, as a community, choose not to.

Maybe something like "Prohibited", "Allowed but not Possible", and
"Possible"?  That would take a little explaining above, since our
documentation's table would be deviating from the standard's table
in its word choice.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add a new table for Transaction Isolation?

From

Bruce Momjian

Date:

April 25, 2015, 23:57:14

On Sat, Apr 25, 2015 at 08:47:47PM +0000, Kevin Grittner wrote:
> Maybe something like "Prohibited", "Allowed but not Possible", and
> "Possible"?  That would take a little explaining above, since our
> documentation's table would be deviating from the standard's table
> in its word choice.

I can't even process that.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Re: Add a new table for Transaction Isolation?

From

"David G. Johnston"

Date:

April 26, 2015, 00:09:49

On Saturday, April 25, 2015, Kevin Grittner <kgrittn@ymail.com> wrote:

> Bruce Momjian <bruce@momjian.us> wrote:
> > On Sat, Apr 25, 2015 at 07:45:35PM +0000, Kevin Grittner wrote:
>
> >> We could perhaps have the column header say "Non-Serializable
> >> Behavior" or some such; but I think we need to define whatever
> >> term we use for the new column header.
> >
> > I don't think we can define the column as a negative, e.g.
> > "Non-".
>
> Yeah, that would tend to add to confusion. The basic issue is
> whether there are any one-at-a-time orders of execution that could
> yield the same results, or whether there is a cycle in an attempt
> to graph such an order. "Cycles in Apparent Order of Execution"
> would be accurate, but it's kinda long, and possibly too arcane.

"Monitored"?

Are multiple transactions, that do not write to the same rows, monitored so that read dependencies between them are detected and a serialization error raised?

> >>>> Pondering whether something like: "Possible (not in PG)" and
> >>>> avoiding the additional rows would make reading the table
> >>>> easier.
> >>>
> >>> Uh, that's an idea. I thought visually having two separate
> >>> lines was cleaner.
> >>
> >> I think one row per transaction isolation level, with three
> >> possible values per cell, would be the cleanest. I have been
> >> trying to think of alternatives for the three values, but have
> >> not come up with anything better than David's suggestion.
> >
> > Well, then "Possible" would refer to the SQL standard behavior,
> > which seems kind of an odd thing to emphasize there. The field
> > really needs to be "SQL-standard possible, PostgreSQL not
> > possible", but that is too long. This is why I split it into
> > separate lines. We could try "Possible (SQL standard), Not
> > possible (PostgreSQL)".
>
> Yeah, I was searching for some wording that conveyed that the
> standard *allowed* an implementation to present such phenomena at
> the isolation level versus whether the PostgreSQL implementation
> could *actually* present such phenomena. In struggling to come up
> with an analogy, the best I can do is that it's like each person
> fishing for rainbow trout in Wisconsin is *allowed* to keep it if
> it is at least 26 inches long; some people will do so, and some
> catch and release. Regulations say that it is possible to keep it
> (and not be in violation of the rules), but you are not required to
> keep it. For REPEATABLE READ, the SQL standard says that any
> product would be *allowed* to have phantom reads, but is not
> *required* to; we, as a community, choose not to.
>
> Maybe something like "Prohibited", "Allowed but not Possible", and
> "Possible"? That would take a little explaining above, since our
> documentation's table would be deviating from the standard's table
> in its word choice.

Paraphrasing here...

Table # presents the postgresql implementation of the sql standard isolation levels and notes the additional impermissible behaviors by including "(contra-SQL)" in the cell. "Contrary to the SQL standard" - the imprecision in the term seems acceptable.

Not Possible (contra-SQL)

Re: Add a new table for Transaction Isolation?

From

"David G. Johnston"

Date:

April 26, 2015, 00:14:42

On Saturday, April 25, 2015, Bruce Momjian <bruce@momjian.us> wrote:

> On Sat, Apr 25, 2015 at 08:47:47PM +0000, Kevin Grittner wrote:
> > Maybe something like "Prohibited", "Allowed but not Possible", and
> > "Possible"? That would take a little explaining above, since our
> > documentation's table would be deviating from the standard's table
> > in its word choice.
>
> I can't even process that.

After writing out my thoughts, this makes sense now. "Prohibited" means that both tables would say not possible. "Possible" means both tables would say possible. "Allowed but not Possible" means our implementation says not possible while the standard says it is possible. The fourth combination, not allowed but possible, would mean we are not standard-conforming; since we are, it never appears.

I would probably choose "not possible (contra-SQL)" and emphasize our implementation and footnote the two differences.

David J.

Re: Add a new table for Transaction Isolation?

From

"David G. Johnston"

Date:

April 26, 2015, 00:54:04

On Saturday, April 25, 2015, David G. Johnston <david.g.johnston@gmail.com> wrote:

On Saturday, April 25, 2015, Kevin Grittner <kgrittn@ymail.com> wrote:
Bruce Momjian <bruce@momjian.us> wrote:
> On Sat, Apr 25, 2015 at 07:45:35PM +0000, Kevin Grittner wrote:

>> We could perhaps have the column header say "Non-Serializable
>> Behavior" or some such; but I think we need to define whatever
>> term we use for the new column header.
>
> I don't think we can define the column as a negative, e.g.
> "Non-".

Yeah, that would tend to add to confusion. The basic issue is
whether there are any one-at-a-time orders of execution that could
yield the same results, or whether there is a cycle in an attempt
to graph such an order. "Cycles in Apparent Order of Execution"
would be accurate, but it's kinda long, and possibly too arcane.

"Monitored"?

Are multiple transactions, that do not write to the same rows, monitored so that read dependencies between them are detected and a serialization error raised?

>>>> Pondering whether something like: "Possible (not in PG)" and
>>>> avoiding the additional rows would make reading the table
>>>> easier.
>>>
>>> Uh, that's an idea. I thought visually having two separate
>>> lines was cleaner.
>>
>> I think one row per transaction isolation level, with three
>> possible values per cell, would be the cleanest. I have been
>> trying to think of alternatives for the three values, but have
>> not come up with anything better than David's suggestion.
>
> Well, then "Possible" would refer to the SQL standard behavior,
> which seems kind of an odd thing to emphasize there. The field
> really needs to be "SQL-standard possible, PostgreSQL not
> possible", but that is too long. This is why I split it into
> separate lines. We could try "Possible (SQL standard), Not
> possible (PostgreSQL)".

Yeah, I was searching for some wording that conveyed that the
standard *allowed* an implementation to present such phenomena at
the isolation level versus whether the PostgreSQL implementation
could *actually* present such phenomena. In struggling to come up
with an analogy, the best I can do is that it's like each person
fishing for rainbow trout in Wisconsin is *allowed* to keep it if
it is at least 26 inches long; some people will do so, and some
catch and release. Regulations say that it is possible to keep it
(and not be in violation of the rules), but you are not required to
keep it. For REPEATABLE READ, the SQL standard says that any
product would be *allowed* to have phantom reads, but is not
*required* to; we, as a community, choose not to.

Maybe something like "Prohibited", "Allowed but not Possible", and
"Possible"? That would take a little explaining above, since our
documentation's table would be deviating from the standard's table
in its word choice.

Paraphrasing here...

Table # presents the postgresql implementation of the sql standard isolation levels and notes the additional impermissible behaviors by including "(contra-SQL)" in the cell. "Contrary to the SQL standard" - the imprecision in the term seems acceptable.

Not Possible (contra-SQL)

I'd also consider a 5th column to denote whether a serialization failure is possible in the first place and then the monitor column would distinguish between repeatable read and serializable.

David J.

Re: Add a new table for Transaction Isolation?

From

Bruce Momjian

Date:

April 29, 2015, 01:25:19

On Sat, Apr 25, 2015 at 02:54:00PM -0700, David G. Johnston wrote:
>     Paraphrasing here...
>
>     Table # presents the postgresql implementation of the sql standard
>     isolation levels and notes the additional impermissible behaviors by
>     including "(contra-SQL)" in the cell.  "Contrary to the SQL standard" - the
>     imprecision in the term seems acceptable.
>
>     Not Possible (contra-SQL)
>
>
>   I'd also consider a 5th column to denote whether a serialization failure is
> possible in the first place and then the monitor column would distinguish
> between repeatable read and serializable.

I think showing a serialization failure column is too much to add to
the table.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Re: Add a new table for Transaction Isolation?

From

Bruce Momjian

Date:

April 29, 2015, 01:26:09

On Sat, Apr 25, 2015 at 02:14:37PM -0700, David G. Johnston wrote:
> On Saturday, April 25, 2015, Bruce Momjian <bruce@momjian.us> wrote:
>
>     On Sat, Apr 25, 2015 at 08:47:47PM +0000, Kevin Grittner wrote:
>     > Maybe something like "Prohibited", "Allowed but not Possible", and
>     > "Possible"?  That would take a little explaining above, since our
>     > documentation's table would be deviating from the standard's table
>     > in its word choice.
>
>     I can't even process that.
>
>
>
> After writing my thoughts this makes sense now.  Prohibited means that both
> tables would say not possible.  Possible means both tables would say possible. 
> Allowed but not possible means our implementation says not possible and the
> standard says it is possible.  The fourth possibility, not allowed but
> possible, would mean we are not standard conforming and since we are it never
> appears.
>
> I would probably choose "not possible (contra-SQL)" and emphasize our
> implementation and footnote the two differences.

I went with "Allowed, but not in PG" for those two fields, and removed
the extra rows I had added.  You can see the output here:

    http://momjian.us/expire/transaction-iso.html
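
For readers who cannot open the rendered page, the three-value scheme settled on in this thread can be sketched in Python. The per-level sets below are an assumption transcribed from the isolation-level table as it appears in the published PostgreSQL documentation; the names `STANDARD_ALLOWS`, `PG_EXHIBITS`, and `cell` are illustrative and not part of any PostgreSQL API.

```python
PHENOMENA = ["dirty read", "nonrepeatable read", "phantom read", "serialization anomaly"]
ALL = set(PHENOMENA)

# Phenomena an implementation is allowed to show at each level (assumed,
# following the "Possible" cells of the documentation table).
STANDARD_ALLOWS = {
    "read uncommitted": ALL,
    "read committed": ALL - {"dirty read"},
    "repeatable read": {"phantom read", "serialization anomaly"},
    "serializable": set(),
}

# Phenomena PostgreSQL can actually show at each level (assumed).
PG_EXHIBITS = {
    "read uncommitted": ALL - {"dirty read"},
    "read committed": ALL - {"dirty read"},
    "repeatable read": {"serialization anomaly"},
    "serializable": set(),
}


def cell(level: str, phenomenon: str) -> str:
    """Derive the table cell text from the two sets above."""
    allowed = phenomenon in STANDARD_ALLOWS[level]
    exhibited = phenomenon in PG_EXHIBITS[level]
    if exhibited and not allowed:
        # The "not allowed but possible" case would mean non-conformance.
        raise ValueError(f"{phenomenon!r} at {level!r} would violate the standard")
    if exhibited:
        return "Possible"
    if allowed:
        return "Allowed, but not in PG"
    return "Not possible"


for level in STANDARD_ALLOWS:
    print(f"{level:17}", " | ".join(cell(level, p) for p in PHENOMENA))
```

Under these assumptions, exactly two cells come out as "Allowed, but not in PG": dirty read at read uncommitted and phantom read at repeatable read.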

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Re: Add a new table for Transaction Isolation?

From

Kevin Grittner

Date:

April 29, 2015, 17:15:34

Bruce Momjian <bruce@momjian.us> wrote:

> I went with "Allowed, but not in PG" for those two fields, and
> removed the extra rows I had added.  You can see the output here:
>
>     http://momjian.us/expire/transaction-iso.html

Looks great!

The only suggestion I can think to make to the table itself is to
make the new column header singular, to match the other columns.
I do think we should define the term used in the new column header;
maybe something like this:

serialization anomaly

    The result of successfully committing a group of transactions
    is inconsistent with all possible orderings of running those
    transactions one at a time.
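
To make the proposed definition concrete, here is a small Python sketch of a brute-force check. It models a classic write-skew scenario with two hypothetical on-call doctors: transactions are plain functions over a dict, and the concurrent outcome is hard-coded as an assumption rather than produced by a real database. It also compares final states only, which is a simplification of "the result" in the definition.

```python
from itertools import permutations
from typing import Callable, Dict, List

State = Dict[str, bool]
Txn = Callable[[State], State]


def go_off_call(me: str) -> Txn:
    """Hypothetical transaction: go off call only if someone else is still on call."""
    def txn(state: State) -> State:
        new = dict(state)
        others_on_call = any(on for name, on in state.items() if name != me)
        if others_on_call:
            new[me] = False
        return new
    return txn


def serial_outcomes(initial: State, txns: List[Txn]) -> List[State]:
    """Final states reached by running the transactions one at a time, in every order."""
    results = []
    for order in permutations(txns):
        state = dict(initial)
        for txn in order:
            state = txn(state)
        results.append(state)
    return results


def is_serialization_anomaly(initial: State, txns: List[Txn], committed: State) -> bool:
    """True if the committed state matches no one-at-a-time ordering."""
    return committed not in serial_outcomes(initial, txns)


initial = {"alice": True, "bob": True}
txns = [go_off_call("alice"), go_off_call("bob")]

# Assumed concurrent outcome: both transactions saw the initial snapshot,
# both found a colleague on call, and both committed.
concurrent_result = {"alice": False, "bob": False}

print(is_serialization_anomaly(initial, txns, concurrent_result))              # True
print(is_serialization_anomaly(initial, txns, {"alice": False, "bob": True}))  # False
```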

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add a new table for Transaction Isolation?

From

Bruce Momjian

Date:

April 29, 2015, 23:08:24

On Wed, Apr 29, 2015 at 02:15:20PM +0000, Kevin Grittner wrote:
> Bruce Momjian <bruce@momjian.us> wrote:
>
> > I went with "Allowed, but not in PG" for those two fields, and
> > removed the extra rows I had added.  You can see the output here:
> >
> >     http://momjian.us/expire/transaction-iso.html
>
>
> Looks great!
>
> The only suggestion I can think to make to the table itself is to
> make the new column header singular, to match the other columns.
> I do think we should define the term used in the new column header;
> maybe something like this:
>
>
> serialization anomaly
>
>     The result of successfully committing a group of transactions
>     is inconsistent with all possible orderings of running those
>     transactions one at a time.

OK, output updated:

    http://momjian.us/expire/transaction-iso.html

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Re: Add a new table for Transaction Isolation?

From

Kevin Grittner

Date:

April 29, 2015, 23:13:16

Bruce Momjian <bruce@momjian.us> wrote:

> updated:
>
>     http://momjian.us/expire/transaction-iso.html


I can't think of any way to improve on that.


--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Add a new table for Transaction Isolation?

From

Bruce Momjian

Date:

May 11, 2015, 19:05:24

On Wed, Apr 29, 2015 at 04:08:15PM -0400, Bruce Momjian wrote:
> On Wed, Apr 29, 2015 at 02:15:20PM +0000, Kevin Grittner wrote:
> > Bruce Momjian <bruce@momjian.us> wrote:
> >
> > > I went with "Allowed, but not in PG" for those two fields, and
> > > removed the extra rows I had added.  You can see the output here:
> > >
> > >     http://momjian.us/expire/transaction-iso.html
> >
> >
> > Looks great!
> >
> > The only suggestion I can think to make to the table itself is to
> > make the new column header singular, to match the other columns.
> > I do think we should define the term used in the new column header;
> > maybe something like this:
> >
> >
> > serialization anomaly
> >
> >     The result of successfully committing a group of transactions
> >     is inconsistent with all possible orderings of running those
> >     transactions one at a time.
>
> OK, output updated:
>
>     http://momjian.us/expire/transaction-iso.html

Patch applied.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +
