Обсуждение: GSOC13 proposal - extend RETURNING syntax
Hello, I'm student who want to participate in Google Summer of Code. I want to implement feature which allows to get old values directly from update statement. I mean there should be possibility to choose the value immedietly before or after update in RETURNING statement. The syntax may be realized as "aliases". That means: OLD keywordswould be alias to row before update and NEW to row after update. The conclusion of syntax is: UPDATE foo SET bar=bar+1 RETURNING OLD.bar AS old_bar, NEW.bar AS new_bar; UPDATE foo SET ... RETURNING NEW.* will be equivalent to UPDATE foo SET ... RETURNING foo.* It may be possible to add similar syntax to DELETE and INSERT statements but I'm not sure if it makes a sense (OLD for DELETE will be alias to row before delete, NEW for INSERT will be alias to row after insert and all triggers - however what about NEW for delete and OLD for INSERT?). Additionally NEW and OLD values will be reserved keywords (it might be some capability problem since in new PostgreSQL it isn't reserved - however standard says it is and in old PgSQL it was). I'd like to hear (read) yours feedback about syntax and/or implement issues related to this proposal. Regards, Karol Trzcionka
On Thu, May 02, 2013 at 11:04:15AM +0200, Karol Trzcionka wrote: > Hello, > I'm student who want to participate in Google Summer of Code. I want to > implement feature which allows to get old values directly from update > statement. I mean there should be possibility to choose the value > immedietly before or after update in RETURNING statement. The syntax may > be realized as "aliases". That means: OLD keywordswould be alias to row > before update and NEW to row after update. The conclusion of syntax is: > UPDATE foo SET bar=bar+1 RETURNING OLD.bar AS old_bar, NEW.bar AS new_bar; > UPDATE foo SET ... RETURNING NEW.* will be equivalent to UPDATE foo SET > ... RETURNING foo.* > It may be possible to add similar syntax to DELETE and INSERT statements > but I'm not sure if it makes a sense (OLD for DELETE will be alias to > row before delete, NEW for INSERT will be alias to row after insert and > all triggers - however what about NEW for delete and OLD for INSERT?). > Additionally NEW and OLD values will be reserved keywords (it might be > some capability problem since in new PostgreSQL it isn't reserved - > however standard says it is and in old PgSQL it was). > I'd like to hear (read) yours feedback about syntax and/or implement > issues related to this proposal. > Regards, > Karol Trzcionka I would like to include the proposal as we've hammered it out together on IRC and on GSoC site below. Cheers, David. 1. As the SQL standard mandates that OLD and NEW be reserved words, we'll re-reserve them. 2. Let's make OLD and NEW have the same meaning that INSERT/UPDATE/DELETE have when returning rows from the changed table. In particular INSERT INTO foo (...) RETURNING NEW.* will be equivalent to INSERT INTO foo(...) RETURNING foo.* Similarly for UPDATE and DELETE: UPDATE foo SET ... RETURNING NEW.* will be equivalent to UPDATE foo SET ... RETURNING foo.* and DELETE FROM foo ... RETURNING OLD.* will be equivalent to DELETE FROM foo ... RETURNING foo.* As RETURNING clauses have access to everything in the FROM/USING clause, it is important to limit the NEW/OLD rows as beingonly those in the table being written to in the statement. 3. Let's add an option to UPDATE so that it can RETURN OLD with the same characteristics as above, namely that it refersonly to constants and columns in the updated table and not to everything available from the USING clause if included. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > 1. As the SQL standard mandates that OLD and NEW be reserved words, we'll re-reserve them. As I mentioned yesterday, I'm not exactly thrilled with re-reserving those, and especially not NEW as it is utterly unnecessary (since the default would already be to return the NEW column). It should in any case be possible to do this without converting them to reserved words; rather the implementation could be that those table aliases are made available when parsing the UPDATE RETURNING expressions. (This is, in fact, the way that rules use these names now.) Probably it should work something like "add these aliases if they don't already exist in the query", so as to avoid breaking existing applications. I don't really see a lot of value in hacking the behavior of either INSERT RETURNING or DELETE RETURNING. regards, tom lane
W dniu 02.05.2013 17:17, Tom Lane pisze: > It should in any case be possible to do this without converting them > to reserved words; rather the implementation could be that those table > aliases are made available when parsing the UPDATE RETURNING > expressions. (This is, in fact, the way that rules use these names > now.) Probably it should work something like "add these aliases if > they don't already exist in the query", so as to avoid breaking > existing applications. I don't really see a lot of value in hacking > the behavior of either INSERT RETURNING or DELETE RETURNING. regards, > tom lane I'm not sure about it. If it is not reserved keyword how can user get old value from table named "old". The new value is not a problem (doesn't conflict) but what should happened in statement: UPDATE old SET foo=foo+1 RETURNING old.foo; If it returns old value, it'll break capability. If it returns new value (as now), there is no possibility to get old value (and user cries because of broken feature). Regards, Karol Trzcionka
On 2013-05-02 17:32, Karol Trzcionka wrote: > W dniu 02.05.2013 17:17, Tom Lane pisze: >> It should in any case be possible to do this without converting them >> to reserved words; rather the implementation could be that those table >> aliases are made available when parsing the UPDATE RETURNING >> expressions. (This is, in fact, the way that rules use these names >> now.) Probably it should work something like "add these aliases if >> they don't already exist in the query", so as to avoid breaking >> existing applications. I don't really see a lot of value in hacking >> the behavior of either INSERT RETURNING or DELETE RETURNING. > > what should happened in statement: > UPDATE old SET foo=foo+1 RETURNING old.foo; > If it returns old value, it'll break capability. If it returns new value > (as now), there is no possibility to get old value (and user cries > because of broken feature). In Tom's proposal that would give you the "new" value. Personally I would maybe prefer reserving NEW/OLD, but if we go through with Tom's idea, this should work: UPDATE old myold SET foo=foo+1 RETURNING myold.foo, old.foo; What I'm more interested in is: how can we make this feature work in PL/PgSQL where OLD means something different? Regards, Marko Tiikkaja
Marko Tiikkaja <marko@joh.to> writes: > What I'm more interested in is: how can we make this feature work in > PL/PgSQL where OLD means something different? That's a really good point: if you follow this approach then you're creating fundamental conflicts for use of the feature in trigger functions or rules, which will necessarily have conflicting uses of those names. Yeah, we could define scoping rules such that there's an unambiguous interpretation, but then the user is just out of luck if he wants to reference the other definition. (This problem is probably actually worse if you implement with reserved words rather than aliases.) I'm thinking it would be better to invent some other notation for access to old-row values. regards, tom lane
On 2013-05-02 12:23:19 -0400, Tom Lane wrote: > Marko Tiikkaja <marko@joh.to> writes: > > What I'm more interested in is: how can we make this feature work in > > PL/PgSQL where OLD means something different? > > That's a really good point: if you follow this approach then you're > creating fundamental conflicts for use of the feature in trigger > functions or rules, which will necessarily have conflicting uses of > those names. Yeah, we could define scoping rules such that there's > an unambiguous interpretation, but then the user is just out of luck > if he wants to reference the other definition. (This problem is > probably actually worse if you implement with reserved words rather > than aliases.) > > I'm thinking it would be better to invent some other notation for access > to old-row values. prior/after? Both are unreserved keywords atm and it seems far less likely to have conflicts than new/old. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, May 02, 2013 at 06:28:53PM +0200, Andres Freund wrote: > On 2013-05-02 12:23:19 -0400, Tom Lane wrote: > > Marko Tiikkaja <marko@joh.to> writes: > > > What I'm more interested in is: how can we make this feature work in > > > PL/PgSQL where OLD means something different? > > > > That's a really good point: if you follow this approach then you're > > creating fundamental conflicts for use of the feature in trigger > > functions or rules, which will necessarily have conflicting uses of > > those names. Yeah, we could define scoping rules such that there's > > an unambiguous interpretation, but then the user is just out of luck > > if he wants to reference the other definition. (This problem is > > probably actually worse if you implement with reserved words rather > > than aliases.) > > > > I'm thinking it would be better to invent some other notation for access > > to old-row values. > > prior/after? Both are unreserved keywords atm and it seems far less > likely to have conflicts than new/old. BEFORE/AFTER seems more logical to me. Yes, those words both have meaning in, for example, a trigger definition, but they're clearly separable by context. Yay, bike-shedding! Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
2013/5/2 Tom Lane <tgl@sss.pgh.pa.us>
Marko Tiikkaja <marko@joh.to> writes:That's a really good point: if you follow this approach then you're
> What I'm more interested in is: how can we make this feature work in
> PL/PgSQL where OLD means something different?
creating fundamental conflicts for use of the feature in trigger
functions or rules, which will necessarily have conflicting uses of
those names. Yeah, we could define scoping rules such that there's
an unambiguous interpretation, but then the user is just out of luck
if he wants to reference the other definition. (This problem is
probably actually worse if you implement with reserved words rather
than aliases.)
I'm thinking it would be better to invent some other notation for access
to old-row values.
I am not sure, but I am thinking so NEW and OLD are used in some statements and features ANSI SQL 2012, so probably we should to do keywords from these words if we would to support modern ANSI SQL
Regards
Pavel
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
David Fetter <david@fetter.org> writes: > On Thu, May 02, 2013 at 06:28:53PM +0200, Andres Freund wrote: >> prior/after? Both are unreserved keywords atm and it seems far less >> likely to have conflicts than new/old. > BEFORE/AFTER seems more logical to me. Works for me. regards, tom lane
W dniu 02.05.2013 19:40, Tom Lane pisze: >> BEFORE/AFTER seems more logical to me. > Works for me. > What do you think about function- or cast-like syntax. I mean: RETURNING OLD(foo.bar) or RETURNING OLD(foo).bar or RETURNING (foo::OLD).bar ? I think none of them should conflict with any other statements. Regards, Karol Trzcionka
On Thu, May 02, 2013 at 01:40:59PM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > On Thu, May 02, 2013 at 06:28:53PM +0200, Andres Freund wrote: > >> prior/after? Both are unreserved keywords atm and it seems far less > >> likely to have conflicts than new/old. > > > BEFORE/AFTER seems more logical to me. > > Works for me. > > regards, tom lane Maybe we can make BEFORE and AFTER implied aliases rather than keywords. What say? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Sent from my iPad On 03-May-2013, at 0:07, David Fetter <david@fetter.org> wrote: > On Thu, May 02, 2013 at 01:40:59PM -0400, Tom Lane wrote: >> David Fetter <david@fetter.org> writes: >>> On Thu, May 02, 2013 at 06:28:53PM +0200, Andres Freund wrote: >>>> prior/after? Both are unreserved keywords atm and it seems far less >>>> likely to have conflicts than new/old. >> >>> BEFORE/AFTER seems more logical to me. >> >> Works for me. >> >> regards, tom lane > > Maybe we can make BEFORE and AFTER implied aliases rather than > keywords. What say? > > I agree.Overall,I like the concept. Regards, Atri
David Fetter <david@fetter.org> writes: > Maybe we can make BEFORE and AFTER implied aliases rather than > keywords. What say? Well, they still have to be unreserved keywords for their use in trigger-related commands, but as far as this feature is concerned I think they're best off being handled as automatically-supplied aliases. IOW the patch needn't touch gram.y at all. regards, tom lane
Karol Trzcionka <karlikt@gmail.com> writes: > What do you think about function- or cast-like syntax. I mean: > RETURNING OLD(foo.bar) > or RETURNING OLD(foo).bar > or RETURNING (foo::OLD).bar ? > I think none of them should conflict with any other statements. I think you'll find those alternatives are at least as ugly to implement as they are to look at ... regards, tom lane
W dniu 02.05.2013 20:45, Tom Lane pisze: > David Fetter <david@fetter.org> writes: >> Maybe we can make BEFORE and AFTER implied aliases rather than >> keywords. What say? > Well, they still have to be unreserved keywords for their use in > trigger-related commands, but as far as this feature is concerned > I think they're best off being handled as automatically-supplied > aliases. IOW the patch needn't touch gram.y at all. Well... the syntax which wouldn't conflict with Pl/PgSQL is (draft): INSERT INTO foo ... RETURNING AFTER.* UPDATE foo SET ... RETURNING AFTER.*, BEFORE.* DELETE FROM foo ... RETURNING BEFORE.* It will not solve the problems: 1. How to access to old rows if the table is named "BEFORE"? 2. Should AFTER for DELETE and BEFORE for INSERT be allowed? If yes what should they return? I mean: what should be result of: INSERT INTO foo ... RETURNING BEFORE.* and DELETE FROM foo ... RETURNING AFTER.* ? The best summarize of dropping NEW/OLD keywords for this proposal was proposed while IRC meeting: create or replace function ft1(integer) returns integer language plpgsql as $f$ <<x>> declare r rt1; begin select 1 as a into r; update rt1 r set a = a + 1 returning XXX into r; raise notice 'r = %', r; return null; end; $f$; Regards, Karol Trzcionka
<div class="moz-cite-prefix">On 03/05/13 04:52, David Fetter wrote:<br /></div><blockquote cite="mid:20130502165238.GB12887@fetter.org"type="cite"><pre wrap="">On Thu, May 02, 2013 at 06:28:53PM +0200, Andres Freundwrote: </pre><blockquote type="cite"><pre wrap="">On 2013-05-02 12:23:19 -0400, Tom Lane wrote: </pre><blockquote type="cite"><pre wrap="">Marko Tiikkaja <a class="moz-txt-link-rfc2396E" href="mailto:marko@joh.to"><marko@joh.to></a>writes: </pre><blockquote type="cite"><pre wrap="">What I'm more interested in is: how can we make this feature work in PL/PgSQL where OLD means something different? </pre></blockquote><pre wrap=""> That's a really good point: if you follow this approach then you're creating fundamental conflicts for use of the feature in trigger functions or rules, which will necessarily have conflicting uses of those names. Yeah, we could define scoping rules such that there's an unambiguous interpretation, but then the user is just out of luck if he wants to reference the other definition. (This problem is probably actually worse if you implement with reserved words rather than aliases.) I'm thinking it would be better to invent some other notation for access to old-row values. </pre></blockquote><pre wrap=""> prior/after? Both are unreserved keywords atm and it seems far less likely to have conflicts than new/old. </pre></blockquote><pre wrap=""> BEFORE/AFTER seems more logical to me. Yes, those words both have meaning in, for example, a trigger definition, but they're clearly separable by context. Yay, bike-shedding! Cheers, David. </pre></blockquote><font size="-1">I prefer 'PRIOR & 'AFTER'</font> as the both have the same length <br /> - but perhapsthat is just me! :-)<br /><br /><br /> Cheers,<br /> Gavin<br /><br /><br /><br />
<div class="moz-cite-prefix">W dniu 02.05.2013 23:34, Gavin Flower pisze:<br /></div><blockquote cite="mid:5182DBD9.5050501@archidevsys.co.nz"type="cite"><font size="-1">I prefer 'PRIOR & 'AFTER'</font> as the bothhave the same length <br /> - but perhaps that is just me! :-)<br /></blockquote> I think it doesn't matter at now becausePRIOR has the same problems as BEFORE ;)<br /> Regards,<br /> Karol<br />
Karol Trzcionka <karlikt@gmail.com> writes: > It will not solve the problems: > 1. How to access to old rows if the table is named "BEFORE"? The user can simply choose to use a different table alias, as Andres explained upthread. If any table's active alias is "before" (or "after"), we just don't activate the corresponding implicit alias. > 2. Should AFTER for DELETE and BEFORE for INSERT be allowed? If yes what > should they return? These cases should certainly fail. Now, IMO there's no very good reason to alter the behavior at all for INSERT/DELETE; only UPDATE has an issue here. But if we were going to support the extra aliases in those commands, only the ones that actually make sense should be allowed. regards, tom lane
Hello, according to my mentor's suggestion, I send first PoC patch of "RETURNING AFTER/BEFORE" statement. Some info: - it is early version - more hack PoC than RC patch - AFTER in this version works as decribed before but it returns row after update but before post-update before (to be fixed or switch with current "*" behaviour - I don't know yet) - patch passes all regression tests - the code is uncommented already, to be fixed I'm not sure what may fail so you use it on your risk ;) Regards, Karol
Вложения
This is a proposal to implement functionalities for the handling of National Characters. [Introduction] The aim of this proposal is to eventually have a way to represent 'National Characters' in a uniform way, even in non-UTF8 encoded databases. Many of our customers in the Asian region who are now, as part of their platform modernization, are moving away from mainframes where they have used National Characters representation in COBOL and other databases. Having stronger support for national characters representation will also make it easier for these customers to look at PostgreSQL more favourably when migrating from other well known RDBMSs who all have varying degrees of NCHAR/NVARCHAR support. [Specifications] Broadly speaking, the national characters implementation ideally will include the following - Support for NCHAR/NVARCHAR data types - Representing NCHAR and NVARCHAR columns in UTF-8 encoding in non-UTF8 databases - Support for UTF16 column encoding and representing NCHAR and NVARCHAR columns in UTF16 encoding in all databases. - Support for NATIONAL_CHARACTER_SET GUC variable that will determine the encoding that will be used in NCHAR/NVARCHAR columns. The above points are at the moment a 'wishlist' only. Our aim is to tackle them one-by-one as we progress. I will send a detailed proposal later with more technical details. The main aim at the moment is to get some feedback on the above to know if this feature is something that would benefit PostgreSQL in general, and if users maintaining DBs in non-English speaking regions will find this beneficial. Rgds, Arul Shaji P.S.: It has been quite some time since I send a correspondence to this list. Our mail server adds a standard legal disclaimer to all outgoing mails, which I know that this list is not a huge fan of. I used to have an exemption for the mails I send to this list. If the disclaimer appears, apologies in advance. I will rectify that on the next one.
Arul Shaji, NCHAR support is on our TODO list for some time and I would like to welcome efforts trying to implement it. However I have a few questions: > This is a proposal to implement functionalities for the handling of > National Characters. > > [Introduction] > > The aim of this proposal is to eventually have a way to represent > 'National Characters' in a uniform way, even in non-UTF8 encoded > databases. Many of our customers in the Asian region who are now, as > part of their platform modernization, are moving away from mainframes > where they have used National Characters representation in COBOL and > other databases. Having stronger support for national characters > representation will also make it easier for these customers to look at > PostgreSQL more favourably when migrating from other well known RDBMSs > who all have varying degrees of NCHAR/NVARCHAR support. > > [Specifications] > > Broadly speaking, the national characters implementation ideally will > include the following > - Support for NCHAR/NVARCHAR data types > - Representing NCHAR and NVARCHAR columns in UTF-8 encoding in non-UTF8 > databases I think this is not a trivial work because we do not have framework to allow mixed encodings in a database. I'm interested in how you are going to solve the problem. > - Support for UTF16 column encoding and representing NCHAR and NVARCHAR > columns in UTF16 encoding in all databases. Why do yo need UTF-16 as the database encoding? UTF-8 is already supported, and any UTF-16 character can be represented in UTF-8 as far as I know. > - Support for NATIONAL_CHARACTER_SET GUC variable that will determine > the encoding that will be used in NCHAR/NVARCHAR columns. You said NCHAR's encoding is UTF-8. Why do you need the GUC if NCHAR's encoding is fixed to UTF-8? > The above points are at the moment a 'wishlist' only. Our aim is to > tackle them one-by-one as we progress. I will send a detailed proposal > later with more technical details. > > The main aim at the moment is to get some feedback on the above to know > if this feature is something that would benefit PostgreSQL in general, > and if users maintaining DBs in non-English speaking regions will find > this beneficial. > > Rgds, > Arul Shaji > > > > P.S.: It has been quite some time since I send a correspondence to this > list. Our mail server adds a standard legal disclaimer to all outgoing > mails, which I know that this list is not a huge fan of. I used to have > an exemption for the mails I send to this list. If the disclaimer > appears, apologies in advance. I will rectify that on the next one. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
On Fri, Jul 5, 2013 at 2:02 AM, Tatsuo Ishii <ishii@postgresql.org> wrote: >> - Support for NATIONAL_CHARACTER_SET GUC variable that will determine >> the encoding that will be used in NCHAR/NVARCHAR columns. > > You said NCHAR's encoding is UTF-8. Why do you need the GUC if NCHAR's > encoding is fixed to UTF-8? Not only that, but I don't think it can be a GUC. Maybe a compile-time switch, but if it were a GUC, how do you handle an existing database in UTF-8 when the setting is switched to UTF-16? Re-encode everything? Store the encoding along each value? It's a mess. Either fix it at UTF-8, or make it a compile-time thing, I'd say.
Ishii san, Thank you for your positive and early response. > -----Original Message----- > From: Tatsuo Ishii [mailto:ishii@postgresql.org] > Sent: Friday, 5 July 2013 3:02 PM > To: Arulappan, Arul Shaji > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Proposal - Support for National Characters > functionality > > Arul Shaji, > > NCHAR support is on our TODO list for some time and I would like to welcome > efforts trying to implement it. However I have a few > questions: > > > This is a proposal to implement functionalities for the handling of > > National Characters. > > > > [Introduction] > > > > The aim of this proposal is to eventually have a way to represent > > 'National Characters' in a uniform way, even in non-UTF8 encoded > > databases. Many of our customers in the Asian region who are now, as > > part of their platform modernization, are moving away from mainframes > > where they have used National Characters representation in COBOL and > > other databases. Having stronger support for national characters > > representation will also make it easier for these customers to look at > > PostgreSQL more favourably when migrating from other well known RDBMSs > > who all have varying degrees of NCHAR/NVARCHAR support. > > > > [Specifications] > > > > Broadly speaking, the national characters implementation ideally will > > include the following > > - Support for NCHAR/NVARCHAR data types > > - Representing NCHAR and NVARCHAR columns in UTF-8 encoding in > > non-UTF8 databases > > I think this is not a trivial work because we do not have framework to allow > mixed encodings in a database. I'm interested in how you are going to solve > the problem. > I would be lying if I said I have the design already speced out. I will be working on this in the coming weeks and hope to design a working solution consulting with the community. > > - Support for UTF16 column encoding and representing NCHAR and > > NVARCHAR columns in UTF16 encoding in all databases. > > Why do yo need UTF-16 as the database encoding? UTF-8 is already supported, > and any UTF-16 character can be represented in UTF-8 as far as I know. > Yes, that's correct. However there are advantages in using UTF-16 encoding for those characters that are always going to take atleast two-bytes to represent. Having said that, my intention is to use UTF-8 for NCHAR as well. Supporting UTF-16 will be even more complicated as it is not supported natively in some Linux platforms. I only included it to give an option. > > - Support for NATIONAL_CHARACTER_SET GUC variable that will determine > > the encoding that will be used in NCHAR/NVARCHAR columns. > > You said NCHAR's encoding is UTF-8. Why do you need the GUC if NCHAR's > encoding is fixed to UTF-8? > If we are going to only support UTF-8 for NCHAR, then we don't need the GUC variable obviously. Rgds, Arul Shaji > > The above points are at the moment a 'wishlist' only. Our aim is to > > tackle them one-by-one as we progress. I will send a detailed proposal > > later with more technical details. > > > > The main aim at the moment is to get some feedback on the above to > > know if this feature is something that would benefit PostgreSQL in > > general, and if users maintaining DBs in non-English speaking regions > > will find this beneficial. > > > > Rgds, > > Arul Shaji > > > > > > > > P.S.: It has been quite some time since I send a correspondence to > > this list. Our mail server adds a standard legal disclaimer to all > > outgoing mails, which I know that this list is not a huge fan of. I > > used to have an exemption for the mails I send to this list. If the > > disclaimer appears, apologies in advance. I will rectify that on the next > one. > -- > Tatsuo Ishii > SRA OSS, Inc. Japan > English: http://www.sraoss.co.jp/index_en.php > Japanese: http://www.sraoss.co.jp
> -----Original Message----- > From: Claudio Freire [mailto:klaussfreire@gmail.com] > Sent: Friday, 5 July 2013 3:41 PM > To: Tatsuo Ishii > Cc: Arulappan, Arul Shaji; PostgreSQL-Dev > Subject: Re: [HACKERS] Proposal - Support for National Characters > functionality > > On Fri, Jul 5, 2013 at 2:02 AM, Tatsuo Ishii <ishii@postgresql.org> wrote: > >> - Support for NATIONAL_CHARACTER_SET GUC variable that will determine > >> the encoding that will be used in NCHAR/NVARCHAR columns. > > > > You said NCHAR's encoding is UTF-8. Why do you need the GUC if NCHAR's > > encoding is fixed to UTF-8? > > > Not only that, but I don't think it can be a GUC. Maybe a compile-time switch, > but if it were a GUC, how do you handle an existing database in UTF-8 when the > setting is switched to UTF-16? Re-encode everything? > Store the encoding along each value? It's a mess. > > Either fix it at UTF-8, or make it a compile-time thing, I'd say. Agreed, that to begin with we only support UTF-8 encoding for NCHAR columns. If that is the case, do we still need a compile time option to turn on/off NCHAR functionality ? ? Rgds, Arul Shaji
On 07/05/2013 02:12 AM, Arulappan, Arul Shaji wrote: > - Support for UTF16 column encoding and representing NCHAR and > NVARCHAR columns in UTF16 encoding in all databases. >> Why do yo need UTF-16 as the database encoding? UTF-8 is already > supported, >> and any UTF-16 character can be represented in UTF-8 as far as I know. >> > Yes, that's correct. However there are advantages in using UTF-16 > encoding for those characters that are always going to take atleast > two-bytes to represent. > Any suggestion to store data in utf-16 is likely to be a complete non-starter. I suggest you research our previously stated requirements for server side encodings. cheers andrew
On Fri, Jul 05, 2013 at 03:22:33AM +0200, Karol Trzcionka wrote: > Hello, > according to my mentor's suggestion, I send first PoC patch of > "RETURNING AFTER/BEFORE" statement. Some info: > - it is early version - more hack PoC than RC patch > - AFTER in this version works as decribed before but it returns row > after update but before post-update before (to be fixed or switch with > current "*" behaviour - I don't know yet) > - patch passes all regression tests > - the code is uncommented already, to be fixed > I'm not sure what may fail so you use it on your risk ;) > Regards, > Karol Karol, Per discussion in IRC, please follow up with your patch that includes such documentation, new regression tests, etc., you've written for the feature. Thanks! :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 7/4/13 10:11 PM, Arulappan, Arul Shaji wrote: > The main aim at the moment is to get some feedback on the above to know > if this feature is something that would benefit PostgreSQL in general, > and if users maintaining DBs in non-English speaking regions will find > this beneficial. For European languages, I think everyone has moved to using Unicode, so the demand for supporting multiple encodings is approaching zero. The CJK realm might have difference requirements.
<p dir="ltr">Yes, what I know almost all use utf8 without problems. Long time I didn't see any request for multi encodingsupport. <div class="gmail_quote">Dne 5.7.2013 20:28 "Peter Eisentraut" <<a href="mailto:peter_e@gmx.net">peter_e@gmx.net</a>>napsal(a):<br type="attribution" /><blockquote class="gmail_quote" style="margin:00 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> On 7/4/13 10:11 PM, Arulappan, Arul Shaji wrote:<br/> > The main aim at the moment is to get some feedback on the above to know<br /> > if this feature is somethingthat would benefit PostgreSQL in general,<br /> > and if users maintaining DBs in non-English speaking regionswill find<br /> > this beneficial.<br /><br /> For European languages, I think everyone has moved to using Unicode,so<br /> the demand for supporting multiple encodings is approaching zero. The<br /> CJK realm might have differencerequirements.<br /><br /><br /><br /> --<br /> Sent via pgsql-hackers mailing list (<a href="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br/> To make changes to your subscription:<br/><a href="http://www.postgresql.org/mailpref/pgsql-hackers" target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/></blockquote></div>
Updated patch: - include sgml - fix all compiler warnings - some cleanup - fix correctness of feature Regards, Karol
Вложения
Next version: - cleanup - regression test - fix issue reported by johto (invalid values in parallel transactions) I would like more feedback and comments about the patch, as some parts may be too hacky. In particular, is it a problem that I update a pointer to planSlot? In my patch, it points to tuple after all updates done between planner and executor (in fact it is not planSlot right now). I don't know whether the tuple could be deleted in the intervening time and if the pointer doesn't point to "unreserved" memory (I mean - memory which may be overwritten by something meanwhile). Regards, Karol
Вложения
On Sat, Jul 13, 2013 at 12:49:45AM +0200, Karol Trzcionka wrote: > Next version: > - cleanup > - regression test > - fix issue reported by johto (invalid values in parallel transactions) > I would like more feedback and comments about the patch, as some parts > may be too hacky. > In particular, is it a problem that I update a pointer to planSlot? In > my patch, it points to tuple after all updates done between planner and > executor (in fact it is not planSlot right now). I don't know whether > the tuple could be deleted in the intervening time and if the pointer > doesn't point to "unreserved" memory (I mean - memory which may be > overwritten by something meanwhile). Thanks for the updated patch! Anybody care to look this over for vulnerabilities as described above? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Fri, Jul 5, 2013 at 2:35 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > Yes, what I know almost all use utf8 without problems. Long time I didn't > see any request for multi encoding support. Well, not *everything* can be represented as UTF-8; I think this is particularly an issue with Asian languages. If we chose to do it, I think that per-column encoding support would end up looking a lot like per-column collation support: it would be yet another per-column property along with typoid, typmod, and typcollation. I'm not entirely sure it's worth it, although FWIW I do believe Oracle has something like this. At any rate, it seems like quite a lot of work. Another idea would be to do something like what we do for range types - i.e. allow a user to declare a type that is a differently-encoded version of some base type. But even that seems pretty hard. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> > On Fri, Jul 5, 2013 at 2:35 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > > Yes, what I know almost all use utf8 without problems. Long time I > > didn't see any request for multi encoding support. > > Well, not *everything* can be represented as UTF-8; I think this is > particularly an issue with Asian languages. > > If we chose to do it, I think that per-column encoding support would end up > looking a lot like per-column collation support: it would be yet another per- > column property along with typoid, typmod, and typcollation. I'm not entirely > sure it's worth it, although FWIW I do believe Oracle has something like this. Yes, the idea is that users will be able to declare columns of type NCHAR or NVARCHAR which will use the pre-determined encoding type. If we say that NCHAR is UTF-8 then the NCHAR column will be of UTF-8 encoding irrespective of the database encoding. It will be up to us to restrict what Unicode encodings we want to support for NCHAR/NVARCHAR columns. This is based on my interpretation of the SQL standard. As you allude to above, Oracle has a similar behaviour (they support UTF-16 as well). Support for UTF-16 will be difficult without linking with some external libraries such as ICU. > At any rate, it seems like quite a lot of work. Thanks for putting my mind at ease ;-) Rgds, Arul Shaji > > Another idea would be to do something like what we do for range types > - i.e. allow a user to declare a type that is a differently-encoded version of > some base type. But even that seems pretty hard. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jul 15, 2013 at 4:37 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Jul 5, 2013 at 2:35 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> Yes, what I know almost all use utf8 without problems. Long time I didn't >> see any request for multi encoding support. > > Well, not *everything* can be represented as UTF-8; I think this is > particularly an issue with Asian languages. What cannot be represented as UTF-8? UTF-8 can represent every character in the Unicode character set, whereas UTF-16 can encode characters 0 to 0x10FFFF. Does support for alternative multi-byte encodings have something to do with the Han unification controversy? I don't know terribly much about this, so apologies if that's just wrong. -- Peter Geoghegan
>> On Fri, Jul 5, 2013 at 2:35 PM, Pavel Stehule > <pavel.stehule@gmail.com> wrote: >> > Yes, what I know almost all use utf8 without problems. Long time I >> > didn't see any request for multi encoding support. >> >> Well, not *everything* can be represented as UTF-8; I think this is >> particularly an issue with Asian languages. >> >> If we chose to do it, I think that per-column encoding support would > end up >> looking a lot like per-column collation support: it would be yet > another per- >> column property along with typoid, typmod, and typcollation. I'm not > entirely >> sure it's worth it, although FWIW I do believe Oracle has something > like this. > > Yes, the idea is that users will be able to declare columns of type > NCHAR or NVARCHAR which will use the pre-determined encoding type. If we > say that NCHAR is UTF-8 then the NCHAR column will be of UTF-8 encoding > irrespective of the database encoding. It will be up to us to restrict > what Unicode encodings we want to support for NCHAR/NVARCHAR columns. > This is based on my interpretation of the SQL standard. As you allude to > above, Oracle has a similar behaviour (they support UTF-16 as well). > > Support for UTF-16 will be difficult without linking with some external > libraries such as ICU. Can you please elaborate more on this? Why do you exactly need ICU? Also I don't understand why you need UTF-16 support as a database encoding because UTF-8 and UTF-16 are logically equivalent, they are just different represention (encoding) of Unicode. That means if we already support UTF-8 (I'm sure we already do), there's no particular reason we need to add UTF-16 support. Maybe you just want to support UTF-16 as a client encoding? -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
On Mon, Jul 15, 2013 at 8:58 AM, Tatsuo Ishii <ishii@postgresql.org> wrote: > Also I don't understand why you need UTF-16 support as a database > encoding because UTF-8 and UTF-16 are logically equivalent, they are > just different represention (encoding) of Unicode. That means if we > already support UTF-8 (I'm sure we already do), there's no particular > reason we need to add UTF-16 support. To be fair, there is a small reason to support UTF-16 even with UTF-8 available. I personally do not find it compelling, but perhaps I am not best placed to judge such things. As Wikipedia says on the the English UTF-8 article: "Characters U+0800 through U+FFFF use three bytes in UTF-8, but only two in UTF-16. As a result, text in (for example) Chinese, Japanese or Hindi could take more space in UTF-8 if there are more of these characters than there are ASCII characters. This happens for pure text but rarely for HTML documents. For example, both the Japanese UTF-8 and the Hindi Unicode articles on Wikipedia take more space in UTF-16 than in UTF-8." This is the only advantage of UTF-16 over UTF-8 as a server encoding. I'm inclined to take the fact that there has been so few (no?) complaints from PostgreSQL's large Japanese user-base about the lack of UTF-16 support as suggesting that that isn't considered to be a compelling feature in the CJK realm. -- Peter Geoghegan
> On Mon, Jul 15, 2013 at 4:37 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Fri, Jul 5, 2013 at 2:35 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> Yes, what I know almost all use utf8 without problems. Long time I didn't >>> see any request for multi encoding support. >> >> Well, not *everything* can be represented as UTF-8; I think this is >> particularly an issue with Asian languages. > > What cannot be represented as UTF-8? UTF-8 can represent every > character in the Unicode character set, whereas UTF-16 can encode > characters 0 to 0x10FFFF. > > Does support for alternative multi-byte encodings have something to do > with the Han unification controversy? I don't know terribly much about > this, so apologies if that's just wrong. There's a famous problem regarding conversion between Unicode and other encodings, such as Shift Jis. There are lots of discussion on this. Here is the one from Microsoft: http://support.microsoft.com/kb/170559/EN-US -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
On 7/15/13 1:26 AM, Arulappan, Arul Shaji wrote: > Yes, the idea is that users will be able to declare columns of type > NCHAR or NVARCHAR which will use the pre-determined encoding type. If we > say that NCHAR is UTF-8 then the NCHAR column will be of UTF-8 encoding > irrespective of the database encoding. It will be up to us to restrict > what Unicode encodings we want to support for NCHAR/NVARCHAR columns. I would try implementing this as an extension at first, with a new data type that is internally encoded differently. We have citext as precedent for successfully implementing text-like data types in user space.
On Mon, Jul 15, 2013 at 05:11:40PM +0900, Tatsuo Ishii wrote: > > Does support for alternative multi-byte encodings have something to do > > with the Han unification controversy? I don't know terribly much about > > this, so apologies if that's just wrong. > > There's a famous problem regarding conversion between Unicode and other > encodings, such as Shift Jis. > > There are lots of discussion on this. Here is the one from Microsoft: > > http://support.microsoft.com/kb/170559/EN-US Apart from Shift-JIS not being a well defined (it's more a family of encodings) it has the unusual feature of providing multiple ways to encode the same character. This is not even a Han unification issue, they have largely been addressed. For example, the square-root symbol exists twice (0x8795 and 0x81E3) and many other mathmatical symbols also. Here's the code page which you can browse online: http://msdn.microsoft.com/en-us/goglobal/cc305152 Which means to be round-trippable Unicode would have to double those characters, but this would make it hard/impossible to round-trip with any other character set that had those characters. No easy solution here. Something that has been done before [1] is to map the doubles to the custom area of the unicode space (0xe000-0xffff). It gives you round-trip support at the expense of having to handle those characters yourself. But since postgres doesn't do anything meaningful with unicode characters this might be acceptable. [1] Python does a similar trick to handle filenames coming from disk in an unknown encoding: http://docs.python.org/3/howto/unicode.html#files-in-an-unknown-encoding Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer
New version: - fix returning "after" values if there are not "before" - add more regression tests I'd like to hear/read any feedback ;) Regards, Karol
Вложения
I've noticed problem with "UPDATE ... FROM" statement. Fix in new version. Regards, Karol
Вложения
On Mon, Jul 22, 2013 at 09:52:14PM +0200, Karol Trzcionka wrote: > I've noticed problem with "UPDATE ... FROM" statement. Fix in new version. > Regards, > Karol What problem or problems did you notice, and what did you change to fix them? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
W dniu 23.07.2013 06:22, David Fetter pisze: > What problem or problems did you notice, and what did you change to > fix them? "UPDATE ... FROM" generated "ERROR: variable not found in subplan target lists". I've added some workaround in add_vars_to_targetlist: - if it is an "after" - simple change var->varno to base RTE (it should always exists, the value is temporary, it will change to OUTER_VAR) - if it is a "before" - add to targetlist new var independently from rel->attr_needed[attno]. Additionally I've change build_joinrel_tlist to ignore any "BEFORE" RTEs. The regression tests are updated. Regards, Karol
> -----Original Message----- > From: Tatsuo Ishii [mailto:ishii@postgresql.org] > > > Also I don't understand why you need UTF-16 support as a database encoding > because UTF-8 and UTF-16 are logically equivalent, they are just different > represention (encoding) of Unicode. That means if we already support UTF-8 > (I'm sure we already do), there's no particular reason we need to add UTF-16 > support. > > Maybe you just want to support UTF-16 as a client encoding? Given below is a design draft for this functionality: Core new functionality (new code): 1)Create and register independent NCHAR/NVARCHAR/NTEXT data types. 2)Provide support for the new GUC nchar_collation to provide the database with information about the default collation that needs to be used for the new data types. 3)Create encoding conversion subroutines to convert strings between the database encoding and UTF8 (from national strings to regular strings and back). PostgreSQL already have all required support (used for conversion between the database encoding and client_encoding), so amount of the new code will be minimal there. 4)Because all symbols from non-UTF8 encodings could be represented as UTF8 (but the reverse is not true) comparison between N* types and the regular string types inside database will be performed in UTF8 form. To achieve this feature the new IMPLICIT casts may need to be created: NCHAR -> CHAR NVARCHAR -> VARCHAR NTEXT -> TEXT. Casting in the reverse direction will be available too but only as EXPLICIT. However, these casts could fail if national strings could not be represented in the used database encoding. All these casts will use subroutines created in 3). Casting/conversion between N* types will follow the same rules/mechanics as used for casting/conversion between usual (CHAR(N)/VARCHAR(N)/TEXT) string types. 5)Comparison between NATIONAL string values will be performed via specialized UTF8 optimized functions (with respect of the nchar_collation setting). 6)Client input/output of NATIONAL strings - NATIONAL strings will respect the client_encoding setting, and their values will be transparently converted to the requested client_encoding before sending(receiving) to client (the same mechanics as used for usual string types). So no mixed encoding in client input/output will be supported/available. 7)Create set of the regression tests for these new data types. Additional changes: 1)ECPG support for these new types 2) Support in the database drivers for the data types. Rgds, Arul Shaji > -- > Tatsuo Ishii > SRA OSS, Inc. Japan > English: http://www.sraoss.co.jp/index_en.php > Japanese: http://www.sraoss.co.jp
"Arulappan, Arul Shaji" <arul@fast.au.fujitsu.com> writes: > Given below is a design draft for this functionality: > Core new functionality (new code): > 1)Create and register independent NCHAR/NVARCHAR/NTEXT data types. > 2)Provide support for the new GUC nchar_collation to provide the > database with information about the default collation that needs to be > used for the new data types. A GUC seems like completely the wrong tack to be taking. In the first place, that would mandate just one value (at a time anyway) of collation, which is surely not much of an advance over what's already possible. In the second place, what happens if you change the value? All your indexes on nchar columns are corrupt, that's what. Actually the data itself would be corrupt, if you intend that this setting determines the encoding and not just the collation. If you really are speaking only of collation, it's not clear to me exactly what this proposal offers that can't be achieved today (with greater security, functionality and spec compliance) by using COLLATE clauses on plain text columns. Actually, you really haven't answered at all what it is you want to do that COLLATE can't do. > 4)Because all symbols from non-UTF8 encodings could be represented as > UTF8 (but the reverse is not true) comparison between N* types and the > regular string types inside database will be performed in UTF8 form. I believe that in some Far Eastern character sets there are some characters that map to the same Unicode glyph, but that some people would prefer to keep separate. So transcoding to UTF8 isn't necessarily lossless. This is one of the reasons why we've resisted adopting ICU or standardizing on UTF8 as the One True Database Encoding. Now this may or may not matter for comparison to strings that were in some other encoding to start with --- but as soon as you base your design on the premise that UTF8 is a universal encoding, you are sliding down a slippery slope to a design that will meet resistance. > 6)Client input/output of NATIONAL strings - NATIONAL strings will > respect the client_encoding setting, and their values will be > transparently converted to the requested client_encoding before > sending(receiving) to client (the same mechanics as used for usual > string types). > So no mixed encoding in client input/output will be supported/available. If you have this restriction, then I'm really failing to see what benefit there is over what can be done today with COLLATE. regards, tom lane
Hi everyone, I will try answer on all questions related to proposed National Characters support. >> 2)Provide support for the new GUC nchar_collation to provide the >> database with information about the default collation that needs to be >> used for the new data types. >A GUC seems like completely the wrong tack to be taking. In the first place, that would mandate just one value (at a time anyway) of collation, which is surely not much of an advance over what's already possible. In the second place, what happens if you change the value? >All your indexes on nchar columns are corrupt, that's what. Actually the data itself would be corrupt, if you intend that this setting determines the encoding and not just the collation. If you really are speaking only of collation, it's not clear to me exactly what this proposal offers that can't be achieved today (with greater security, >functionality and spec compliance) by using COLLATE clauses on plain text columns. >Actually, you really haven't answered at all what it is you want to do that COLLATE can't do. I think I give a wrong description there... it will be not GUC but GUC-type value which will be initialized during CREATE DATABASE and will be read only after, very similar to the lc_collate. So I think name national_lc_collate will be better. Function of this value - provide information about the default collation for the NATIONAL CHARACTERS inside the database. That's not limits user ability of use an alternative collation for NATIONAL CHARACTERS during create table via COLLATE keyword. E.g. if we have second encoding inside the database - we should have information about used collation somewhere. >> 4)Because all symbols from non-UTF8 encodings could be represented as >> UTF8 (but the reverse is not true) comparison between N* types and the >> regular string types inside database will be performed in UTF8 form. >I believe that in some Far Eastern character sets there are some characters that map to the same Unicode glyph, but that some people would prefer to keep separate. So transcoding to UTF8 isn't necessarily lossless. This is one of the reasons why we've resisted adopting ICU or standardizing on UTF8 as the One True Database Encoding. >Now this may or may not matter for comparison to strings that were in some other encoding to start with --- but as soon as you base your design on the premise that UTF8 is a universal encoding, you are sliding down a slippery slope to a design that will meet resistance. Will the conversion of both sides to the pg_wchar before comparison fix this problem? Anyway, if the database going to use more than one encoding, a some universal encoding should be used to allow comparison between them. After some analyse I think pg_wchar is better candidate to this role than UTF8. >> 6)Client input/output of NATIONAL strings - NATIONAL strings will >> respect the client_encoding setting, and their values will be >> transparently converted to the requested client_encoding before >> sending(receiving) to client (the same mechanics as used for usual >> string types). >> So no mixed encoding in client input/output will be supported/available. >If you have this restriction, then I'm really failing to see what benefit there is over what can be done today with COLLATE. There are two targets for this project: 1. Legacy database with non-utf8 encoding, which should support old non-utf8 applications and new UTF8 applications. In that case the old applications will use the legacy database encoding (and because these applications are legacy they doesn't work with new NATIONAL CHARACTERS data/tables). And the new applications will use client-side UTF8 encoding and they will be able store international texts in NATIONAL CHARACTER columns. Dump/restore of the whole database to change the database encoding to UTF8 not always possible, so there necessity of the some easy to use workaround. 2.Better compatibility with the ANSI SQL standard. Kind Regards, Maksym
Boguk, Maksym escribió: > I think I give a wrong description there... it will be not GUC but > GUC-type value which will be initialized during CREATE DATABASE and will > be read only after, very similar to the lc_collate. > So I think name national_lc_collate will be better. > Function of this value - provide information about the default collation > for the NATIONAL CHARACTERS inside the database. > That's not limits user ability of use an alternative collation for > NATIONAL CHARACTERS during create table via COLLATE keyword. This seems a bit odd. I mean, if I want the option for differing encodings, surely I need to be able to set them for each column, not at the database level. Also, as far as I understand what we want to control here is the encoding that the strings are in (the mapping of bytes to characters), not the collation (the way a set of strings are ordered). So it doesn't make sense to set the NATIONAL CHARACTER option using the COLLATE keyword. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Also, as far as I understand what we want to control here is the > encoding that the strings are in (the mapping of bytes to characters), > not the collation (the way a set of strings are ordered). So it doesn't > make sense to set the NATIONAL CHARACTER option using the COLLATE > keyword. My thought is that we should simply ignore the NATIONAL CHARACTER syntax, which is not the first nor the last brain-damaged feature design in the SQL standard. It's basically useless for what we want because there's noplace to specify which encoding you mean. Instead, let's consider that COLLATE can define not only the collation but also the encoding of a string datum. Contrary to what I think you meant above, that seems perfectly sensible to me, because after all a collation is necessarily a bunch of rules about how to order a particular set of characters. If the data representation you use is unable to represent that set of characters, it's not a very meaningful combination is it? There's still the problem of how do you get a string of a nondefault encoding into the database in the first place. If you have to convert to DB encoding to get it in there, then what's the use of a further conversion? This consideration may well kill the whole concept. (It certainly kills NATIONAL CHARACTER syntax just as much.) regards, tom lane
> From: Alvaro Herrera [mailto:alvherre@2ndquadrant.com] > > Boguk, Maksym escribió: > > > I think I give a wrong description there... it will be not GUC but > > GUC-type value which will be initialized during CREATE DATABASE and > > will be read only after, very similar to the lc_collate. > > So I think name national_lc_collate will be better. > > Function of this value - provide information about the default > > collation for the NATIONAL CHARACTERS inside the database. > > That's not limits user ability of use an alternative collation for > > NATIONAL CHARACTERS during create table via COLLATE keyword. > > This seems a bit odd. I mean, if I want the option for differing encodings, > surely I need to be able to set them for each column, not at the database > level. > > Also, as far as I understand what we want to control here is the encoding that > the strings are in (the mapping of bytes to characters), not the collation Yes, that is our idea too. For the sql syntax Create table tbl1 (col1 nchar); What should be the encoding and collation for col1? Because the idea is to have them in separate encoding and collation (ifneeded) from that of the rest of the table. We have options of a) Having guc variables that will determine the default encoding and collation for nchar/nvarchar columns. Note that thecollate variable is default only. Users can still override them per column. b) Having the encoding name and collation as part of the syntax. For ex., (col1 nchar encoding UTF-8 COLLATE "C"). Ugly,but..... c) Be rigid and say nchar/nvarchar columns are by default UTF-8 (or something else). One cannot change the default. But theycan override it when declaring the column by having a syntax similar to (b) Rgds, Arul Shaji > (the way a set of strings are ordered). So it doesn't make sense to set the > NATIONAL CHARACTER option using the COLLATE keyword. > > -- > Álvaro Herrera http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Also, as far as I understand what we want to control here is the > > encoding that the strings are in (the mapping of bytes to characters), > > not the collation (the way a set of strings are ordered). So it > > doesn't make sense to set the NATIONAL CHARACTER option using the > > COLLATE keyword. > > My thought is that we should simply ignore the NATIONAL CHARACTER syntax, > which is not the first nor the last brain-damaged feature design in the SQL > standard. It's basically useless for what we want because there's noplace to > specify which encoding you mean. Instead, let's consider that COLLATE can > define not only the collation but also the encoding of a string datum. Yes, don't have a problem with this. If I understand you correctly, this will be simpler syntax wise, but still get nchar/nvarchar data types into a table, in different encoding from the rest of the table. > > There's still the problem of how do you get a string of a nondefault encoding > into the database in the first place. Yes, that is the bulk of the work. Will need change in a whole lot of places. Is a step-by-step approach worth exploring ? Something similar to: Step 1: Support nchar/nvarchar data types. Restrict them only to UTF-8 databases to begin with. Step 2: Support multiple encodings in a database. Remove the restriction imposed in step1. Rgds, Arul Shaji
Hi, mini-review follows. 2013-07-22 21:52 keltezéssel, Karol Trzcionka írta: > I've noticed problem with "UPDATE ... FROM" statement. Fix in new version. > Regards, > Karol * Does it apply cleanly to the current git master? No. There's a reject in src/backend/optimizer/plan/initsplan.c * Does it include reasonable tests? Yes but the test fails after trying to fix the rejected chunk of the patch. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
W dniu 19.08.2013 19:56, Boszormenyi Zoltan pisze: > * Does it apply cleanly to the current git master? > > No. There's a reject in src/backend/optimizer/plan/initsplan.c Thank you, merged in attached version. > > * Does it include reasonable tests? > > Yes but the test fails after trying to fix the rejected chunk of the > patch. It fails because the "HINT" was changed, fixed. That version merges some nested "ifs" left over from earlier work. Regards, Karol Trzcionka
Вложения
2013-08-19 21:21 keltezéssel, Karol Trzcionka írta: > W dniu 19.08.2013 19:56, Boszormenyi Zoltan pisze: >> * Does it apply cleanly to the current git master? >> >> No. There's a reject in src/backend/optimizer/plan/initsplan.c > Thank you, merged in attached version. >> * Does it include reasonable tests? >> >> Yes but the test fails after trying to fix the rejected chunk of the >> patch. > It fails because the "HINT" was changed, fixed. > That version merges some nested "ifs" left over from earlier work. I tried to compile your v5 patch and I got: initsplan.c: In function ‘add_vars_to_targetlist’: initsplan.c:216:26: warning: ‘rel’ may be used uninitialized in this function [-Wmaybe-uninitialized] rel->reltargetlist = lappend(rel->reltargetlist, ^ You shouldn't change the assignment at declaration: - RelOptInfo *rel = find_base_rel(root, var->varno); + RelOptInfo *rel; ... + if (root->parse->commandType == CMD_UPDATE) + { ... (code using rel) + } + rel = find_base_rel(root, varno); Best regards, Zoltán Böszörményi > Regards, > Karol Trzcionka -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Hi, 2013-08-19 21:52 keltezéssel, Boszormenyi Zoltan írta: > 2013-08-19 21:21 keltezéssel, Karol Trzcionka írta: >> W dniu 19.08.2013 19:56, Boszormenyi Zoltan pisze: >>> * Does it apply cleanly to the current git master? >>> >>> No. There's a reject in src/backend/optimizer/plan/initsplan.c >> Thank you, merged in attached version. >>> * Does it include reasonable tests? >>> >>> Yes but the test fails after trying to fix the rejected chunk of the >>> patch. >> It fails because the "HINT" was changed, fixed. >> That version merges some nested "ifs" left over from earlier work. > > I tried to compile your v5 patch and I got: > > initsplan.c: In function ‘add_vars_to_targetlist’: > initsplan.c:216:26: warning: ‘rel’ may be used uninitialized in this function > [-Wmaybe-uninitialized] > rel->reltargetlist = lappend(rel->reltargetlist, > ^ > > You shouldn't change the assignment at declaration: > > - RelOptInfo *rel = find_base_rel(root, var->varno); > + RelOptInfo *rel; > ... > + if (root->parse->commandType == CMD_UPDATE) > + { > ... (code using rel) > + } > + rel = find_base_rel(root, varno); Let me say it again: the new code in initsplan.c::add_vars_to_targetlist() is fishy. The compiler says that "rel" used on line 216 may be uninitialized. Keeping it that way passes "make check", perhaps "rel" was initialized in a previous iteration of "foreach(temp, vars)", possibly in the else if (IsA(node, PlaceHolderVar)) branch, which means that "PlaceHolderInfo *phinfo" may be interpreted as RelOptInfo *, stomping on memory. Moving the assignment back to the declaration makes "make check" fail with the attached regression.diffs file. Initializing it as "RelOptInfo *rel = NULL;" makes the regression check die with a segfault, obviously. Change the code to avoid the warning and still produce the wanted effect. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Вложения
Thank you for the review and tests. New version introduce a lot of improvements: - Fix regression test for view (wrong table_name) - Add regression test for inheritance - Delete hack in initsplan.c (now we ignore all RTE_BEFORE) - the uninitialized issue - Revert changing varno in add_vars_to_targetlist - Add all "before" variables to targetlist - Avoid adding variables to slot for AFTER. - Treat varnoold like a flag - prevent from adjustment if RTE_BEFORE - All before/after are now set on OUTER_VAR - Rename fix_varno_varattno to bind_returning_variables - Add comment about bind_returning_variables - Remove unneeded code in fix_join_expr_mutator (it was changing varno of RTE_BEFORE - now there is not any var with varno assigned to it) Regards, Karol Trzcionka
Вложения
W dniu 20.08.2013 16:47, Karol Trzcionka pisze: > Thank you for the review and tests. New version introduce a lot of > improvements: > - Fix regression test for view (wrong table_name) > - Add regression test for inheritance > - Delete hack in initsplan.c (now we ignore all RTE_BEFORE) - the > uninitialized issue > - Revert changing varno in add_vars_to_targetlist > - Add all "before" variables to targetlist > - Avoid adding variables to slot for AFTER. > - Treat varnoold like a flag - prevent from adjustment if RTE_BEFORE > - All before/after are now set on OUTER_VAR > - Rename fix_varno_varattno to bind_returning_variables > - Add comment about bind_returning_variables > - Remove unneeded code in fix_join_expr_mutator (it was changing varno > of RTE_BEFORE - now there is not any var with varno assigned to it) I've just realized the prepare_returning_before() is unneeded right now so I've removed it. Version 7, hopefully the last. ;) Regards, Karol Trzcionka
Вложения
<div class="moz-cite-prefix">2013-08-20 17:30 keltezéssel, Karol Trzcionka írta:<br /></div><blockquote cite="mid:52138BA4.1060600@gmail.com"type="cite"><pre wrap="">W dniu 20.08.2013 16:47, Karol Trzcionka pisze: </pre><blockquote type="cite"><pre wrap="">Thank you for the review and tests. New version introduce a lot of improvements: - Fix regression test for view (wrong table_name) - Add regression test for inheritance - Delete hack in initsplan.c (now we ignore all RTE_BEFORE) - the uninitialized issue - Revert changing varno in add_vars_to_targetlist - Add all "before" variables to targetlist - Avoid adding variables to slot for AFTER. - Treat varnoold like a flag - prevent from adjustment if RTE_BEFORE - All before/after are now set on OUTER_VAR - Rename fix_varno_varattno to bind_returning_variables - Add comment about bind_returning_variables - Remove unneeded code in fix_join_expr_mutator (it was changing varno of RTE_BEFORE - now there is not any var withvarno assigned to it) </pre></blockquote><pre wrap="">I've just realized the prepare_returning_before() is unneeded right now so I've removed it. Version 7, hopefully the last. ;)</pre></blockquote><br /> Here's a new one, for v7:<br /><br /> setrefs.c:In function ‘set_plan_refs’:<br /> setrefs.c:2001:26: warning: ‘before_index’ may be used uninitialized in thisfunction [-Wmaybe-uninitialized]<br /> bind_returning_variables(rlist, before_index, after_index);<br /> ^<br /> setrefs.c:1957:21: note: ‘before_index’ was declared here<br /> int after_index=0, before_index;<br/> ^<br /><br /> Best regards,<br /> Zoltán Böszörményi<br /><br /><blockquote cite="mid:52138BA4.1060600@gmail.com"type="cite"><pre wrap=""> Regards, Karol Trzcionka </pre><br /><fieldset class="mimeAttachmentHeader"></fieldset><br /><pre wrap=""> </pre></blockquote><br /><br /><pre class="moz-signature" cols="90">-- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a> <aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a> </pre>
W dniu 20.08.2013 20:55, Boszormenyi Zoltan pisze:
Here's a new one, for v7:Right, my mistake. Sorry and thanks. Fixed.
setrefs.c: In function ‘set_plan_refs’:
setrefs.c:2001:26: warning: ‘before_index’ may be used uninitialized in this function [-Wmaybe-uninitialized]
bind_returning_variables(rlist, before_index, after_index);
^
setrefs.c:1957:21: note: ‘before_index’ was declared here
int after_index=0, before_index;
^
Regards,
Karol Trzcionka
Вложения
<div class="moz-cite-prefix">Hi,<br /><br /> 2013-08-20 21:06 keltezéssel, Karol Trzcionka írta:<br /></div><blockquote cite="mid:5213BE1D.1000705@gmail.com"type="cite"><div class="moz-cite-prefix">W dniu 20.08.2013 20:55, Boszormenyi Zoltanpisze:<br /></div><blockquote cite="mid:5213BB9D.4020204@cybertec.at" type="cite"> Here's a new one, for v7:<br /><br/> setrefs.c: In function ‘set_plan_refs’:<br /> setrefs.c:2001:26: warning: ‘before_index’ may be used uninitializedin this function [-Wmaybe-uninitialized]<br /> bind_returning_variables(rlist, before_index, after_index);<br/> ^<br /> setrefs.c:1957:21: note: ‘before_index’ was declared here<br /> intafter_index=0, before_index;<br /> ^<br /></blockquote> Right, my mistake. Sorry and thanks. Fixed.<br/> Regards,<br /> Karol Trzcionka<br /></blockquote><br /> With this fixed, a more complete review:<br /><br />* Is the patch in a patch format which has context? (eg: context diff format)<br /><br /> Yes.<br /><br /> * Does it applycleanly to the current git master?<br /><br /> Yes.<br /><br /> * Does it include reasonable tests, necessary doc patches,etc? <br /><br /> There is a new regression test (returning_before_after.sql) covering<br /> this feature. However,I think it should be added to the group<br /> where "returning.sql" resides currently. There is a value in runningit<br /> in parallel to other tests. Sometimes a subtle bug is uncovered<br /> because of this and your v2 patch fixedsuch a bug already.<br /><br /> doc/src/sgml/ref/update.sgml describes this feature.<br /><br /> doc/src/sgml/dml.sgmlshould also be touched to cover this feature.<br /><br /> * Does the patch actually implement what it'ssupposed to do?<br /><br /> Yes.<br /><br /> * Do we want that?<br /><br /> Yes.<br /><br /> * Do we already have it?<br/><br /> No.<br /><br /> * Does it follow SQL spec, or the community-agreed behavior?<br /><br /> RETURNING is a PostgreSQLextension, so the SQL-spec part<br /> of the question isn't applicable.<br /><br /> It implements the community-agreedbehaviour, according to<br /> the new regression test coverage.<br /><br /> * Does it include pg_dump support(if applicable)?<br /><br /> n/a<br /><br /> * Are there dangers?<br /><br /> I don't think so.<br /><br /> * Haveall the bases been covered?<br /><br /> It seems so. I have also tried mixing before/after columns in<br /> differentorders and the query didn't fail:<br /><br /> zozo=# create table t1 (id serial primary key, i1 int4, i2 int4, t1text, t2 text);<br /> CREATE TABLE<br /> zozo=# insert into t1 (i1, i2, t1, t2) values (1, 1, 'a', 'a');<br /> INSERT 01<br /> zozo=# insert into t1 (i1, i2, t1, t2) values (2, 2, 'b', 'b');<br /> INSERT 0 1<br /> zozo=# insert into t1 (i1,i2, t1, t2) values (3, 3, 'c', 'c');<br /> INSERT 0 1<br /> zozo=# select * from t1;<br /> id | i1 | i2 | t1 | t2 <br/> ----+----+----+----+----<br /> 1 | 1 | 1 | a | a<br /> 2 | 2 | 2 | b | b<br /> 3 | 3 | 3 | c | c<br/> (3 rows)<br /><br /> zozo=# begin;<br /> BEGIN<br /> zozo=# update t1 set i2 = i2*2, t2 = t2 || 'x2' where id = 2returning before.i1, after.i1, before.i2, after.i2, before.t1, after.t1, before.t2, after.t2;<br /> i1 | i1 | i2 | i2 |t1 | t1 | t2 | t2 <br /> ----+----+----+----+----+----+----+-----<br /> 2 | 2 | 2 | 4 | b | b | b | bx2<br /> (1row)<br /><br /> UPDATE 1<br /> zozo=# update t1 set i1 = i1 * 3, i2 = i2*2, t1 = t1 || 'x3', t2 = t2 || 'x2' where id= 3 returning before.i1, before.i2, after.i1, after.i2, before.t1, before.t2, after.t1, after.t2; i1 | i2 | i1 | i2 | t1| t2 | t1 | t2 <br /> ----+----+----+----+----+----+-----+-----<br /> 3 | 3 | 9 | 6 | c | c | cx3 | cx2<br />(1 row)<br /><br /> UPDATE 1<br /><br /><br /><br /> * Does the feature work as advertised?<br /><br /> Yes.<br /><br />* Are there corner cases the author has failed to consider?<br /><br /> I don't know.<br /><br /> * Are there any assertionfailures or crashes?<br /><br /> No.<br /><br /> * Does the patch slow down simple tests?<br /><br /> No.<br /><br/> * If it claims to improve performance, does it?<br /><br /> n/a<br /><br /> * Does it slow down other things? <br/><br /> No.<br /><br /> * Does it follow the project coding guidelines?<br /><br /> Mostly.<br /><br /> In the src/test/regress/parallel_schedulecontains an extra<br /> new line at the end, it shouldn't.<br /><br /> In b/src/backend/optimizer/plan/setrefs.c:<br/><br /> +static void bind_returning_variables(List *rlist, int bef, int aft);<br/><br /> but later it becomes not public:<br /><br /> + */<br /> +void bind_returning_variables(List *rlist, intbef, int aft)<br /> +{<br /><br /> Strange, but GCC 4.8.1 -Wall doesn't catch it. But the forward<br /> declaration isnot needed, the function is called only from<br /> later functions.<br /><br /> Similar for parse_clause.c:<br /><br />+extern void addAliases(ParseState *pstate);<br /> <br /> +void addAliases(ParseState *pstate)<br /><br /> This externaldeclaration is not needed since it is already<br /> in src/include/parser/parse_clause.h<br /><br /> In setrefs.c,bind_returning_variables() I would also rename<br /> the function arguments, so "before" and "after" are spelledout.<br /> These are not C keywords.<br /><br /> Some assignments, like:<br /><br /> + var=(Var*)tle;<br/> and<br /> + int index_rel=1;<br /><br /> in setrefs.c need spaces.<br /><br /> "if()" statementsneed a space before the "(" and not after.<br /><br /> Add spaces in the {} list in addAliases():<br /> + char *aliases[] = {"before","after"};<br /> like this: { "before", "after" }<br /><br /> Spaces are needed here,too:<br /> + for(i=0 ; i<noal; i++)<br /><br /> This "noal" should be "naliases" or "n_aliases" if you reallywant<br /> a variable. I would simply use the constant "2" for the two for()<br /> loops in addAliases() instead, itspurpose is obvious enough.<br /><br /> In setrefs.c, bind_returning_variables():<br /> + Var *var = NULL;<br />+ foreach(temp, rlist){<br /> Add an empty line after the declaration block.<br /><br /><br /> * Are there portabilityissues?<br /><br /> No.<br /><br /> * Will it work on Windows/BSD etc?<br /><br /> Yes.<br /><br /> * Are thecomments sufficient and accurate?<br /><br /><br /><br /> * Does it do what it says, correctly?<br /><br /> Yes.<br /><br/> * Does it produce compiler warnings?<br /><br /> No.<br /><br /> * Can you make it crash?<br /><br /> No.<br /><br/> * Is everything done in a way that fits together coherently with other features/modules?<br /><br /> I think so,mostly. Comments below.<br /><br /> * Are there interdependencies that can cause problems?<br /><br /> I don't think so.<br/><br /> Other comments:<br /><br /> This #define in pg_class:<br /><br /> diff --git a/src/include/catalog/pg_class.hb/src/include/catalog/pg_class.h<br /> index 49c4f6f..1b09994 100644<br /> --- a/src/include/catalog/pg_class.h<br/> +++ b/src/include/catalog/pg_class.h<br /> @@ -154,6 +154,7 @@ DESCR("");<br /> #define RELKIND_COMPOSITE_TYPE 'c' /* composite type */<br /> #define RELKIND_FOREIGN_TABLE 'f' /* foreign table */<br /> #define RELKIND_MATVIEW 'm' /* materialized view */<br /> +#define RELKIND_BEFORE 'b' /* virtual table for before/after statements */<br /> <br /> #define RELPERSISTENCE_PERMANENT 'p' /* regular table */<br /> #define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */<br /><br /> The "RELKIND_*"values all show up in the pg_class table except<br /> this new one. I don't think pg_class.h should be modifiedat all.<br /> addAliases() should use RELKIND_RELATION together with<br /> RTE_BEFORE. Then checks like:<br /><br/> + if (rte->relkind == RELKIND_BEFORE)<br /> + continue;<br /><br /> shouldbecome<br /><br /> + if (rte->relkind == RELKIND_RELATION && rte->rtekind == RTE_BEFORE)<br/> + continue;<br /><br /> Speaking of which, RTE_BEFORE is more properly named<br />RTE_RETURNING_ALIAS or something like that because it<br /> covers both "before" and "after". Someone may have a better<br/> idea for naming this symbol.<br /><br /> I feel like I understand what the code does and it looks sane to me.<br/><br /> One question, though, about this part:<br /><br /> ----------------------------------------<br /> @@ -1900,7+1954,27 @@ set_returning_clause_references(PlannerInfo *root,<br /> int rtoffset)<br /> {<br /> indexed_tlist *itlist;<br/> + int after_index=0, before_index=0;<br /> + Query *parse = root->parse;<br /> <br />+ ListCell *rt;<br /> + RangeTblEntry *bef;<br /> +<br /> + int index_rel=1;<br /> +<br /> + foreach(rt,parse->rtable)<br /> + {<br /> + bef = (RangeTblEntry *)lfirst(rt);<br /> + if(strcmp(bef->eref->aliasname,"after") == 0 && bef->rtekind == RTE_BEFORE )<br /> + {<br /> + after_index = index_rel;<br /> + }<br /> + if(strcmp(bef->eref->aliasname,"before")== 0 && bef->rtekind == RTE_BEFORE )<br /> + {<br/> + before_index = index_rel;<br /> + }<br /> + index_rel++;<br />+ }<br /> /*<br /> * We can perform the desired Var fixup by abusing the fix_join_expr<br /> * machinery that formerly handled inner indexscan fixup. We search the<br /> @@ -1924,6 +1998,7 @@ set_returning_clause_references(PlannerInfo*root,<br /> resultRelation,<br/> rtoffset);<br /> <br /> + bind_returning_variables(rlist,before_index, after_index);<br /> pfree(itlist);<br /> <br /> return rlist;<br/> ----------------------------------------<br /><br /> Why is it enough to keep the last before_index and after_indexvalues?<br /> What if there are more than one matching RangeTblEntry for "before"<br /> and/or for "after"? Isit an error condition or of them should be fixed?<br /><br /> I think that's all for now.<br /><br /> Best regards,<br/> Zoltán Böszörményi<br /><br /><pre class="moz-signature" cols="90">-- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a> <aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a> </pre>
<div class="moz-cite-prefix">2013-08-21 19:17 keltezéssel, Boszormenyi Zoltan írta:<br /></div><blockquote cite="mid:5214F63F.5050608@cybertec.at"type="cite"><div class="moz-cite-prefix">Hi,<br /><br /> 2013-08-20 21:06 keltezéssel,Karol Trzcionka írta:<br /></div><blockquote cite="mid:5213BE1D.1000705@gmail.com" type="cite"><div class="moz-cite-prefix">Wdniu 20.08.2013 20:55, Boszormenyi Zoltan pisze:<br /></div><blockquote cite="mid:5213BB9D.4020204@cybertec.at"type="cite"> Here's a new one, for v7:<br /><br /> setrefs.c: In function ‘set_plan_refs’:<br/> setrefs.c:2001:26: warning: ‘before_index’ may be used uninitialized in this function [-Wmaybe-uninitialized]<br/> bind_returning_variables(rlist, before_index, after_index);<br /> ^<br /> setrefs.c:1957:21: note: ‘before_index’ was declared here<br /> int after_index=0, before_index;<br/> ^<br /></blockquote> Right, my mistake. Sorry and thanks. Fixed.<br /> Regards,<br/> Karol Trzcionka<br /></blockquote><br /> With this fixed, a more complete review:<br /><br /> * Is the patchin a patch format which has context? (eg: context diff format)<br /><br /> Yes.<br /><br /> * Does it apply cleanlyto the current git master?<br /><br /> Yes.<br /><br /> * Does it include reasonable tests, necessary doc patches,etc? <br /><br /> There is a new regression test (returning_before_after.sql) covering<br /> this feature. However,I think it should be added to the group<br /> where "returning.sql" resides currently. There is a value in runningit<br /> in parallel to other tests. Sometimes a subtle bug is uncovered<br /> because of this and your v2 patch fixedsuch a bug already.<br /><br /> doc/src/sgml/ref/update.sgml describes this feature.<br /><br /> doc/src/sgml/dml.sgmlshould also be touched to cover this feature.<br /><br /> * Does the patch actually implement what it'ssupposed to do?<br /><br /> Yes.<br /><br /> * Do we want that?<br /><br /> Yes.<br /><br /> * Do we already have it?<br/><br /> No.<br /><br /> * Does it follow SQL spec, or the community-agreed behavior?<br /><br /> RETURNING is a PostgreSQLextension, so the SQL-spec part<br /> of the question isn't applicable.<br /><br /> It implements the community-agreedbehaviour, according to<br /> the new regression test coverage.<br /><br /> * Does it include pg_dump support(if applicable)?<br /><br /> n/a<br /><br /> * Are there dangers?<br /><br /> I don't think so.<br /><br /> * Haveall the bases been covered?<br /><br /> It seems so. I have also tried mixing before/after columns in<br /> differentorders and the query didn't fail:<br /><br /> zozo=# create table t1 (id serial primary key, i1 int4, i2 int4, t1text, t2 text);<br /> CREATE TABLE<br /> zozo=# insert into t1 (i1, i2, t1, t2) values (1, 1, 'a', 'a');<br /> INSERT 01<br /> zozo=# insert into t1 (i1, i2, t1, t2) values (2, 2, 'b', 'b');<br /> INSERT 0 1<br /> zozo=# insert into t1 (i1,i2, t1, t2) values (3, 3, 'c', 'c');<br /> INSERT 0 1<br /> zozo=# select * from t1;<br /> id | i1 | i2 | t1 | t2 <br/> ----+----+----+----+----<br /> 1 | 1 | 1 | a | a<br /> 2 | 2 | 2 | b | b<br /> 3 | 3 | 3 | c | c<br/> (3 rows)<br /><br /> zozo=# begin;<br /> BEGIN<br /> zozo=# update t1 set i2 = i2*2, t2 = t2 || 'x2' where id = 2returning before.i1, after.i1, before.i2, after.i2, before.t1, after.t1, before.t2, after.t2;<br /> i1 | i1 | i2 | i2 |t1 | t1 | t2 | t2 <br /> ----+----+----+----+----+----+----+-----<br /> 2 | 2 | 2 | 4 | b | b | b | bx2<br /> (1row)<br /><br /> UPDATE 1<br /> zozo=# update t1 set i1 = i1 * 3, i2 = i2*2, t1 = t1 || 'x3', t2 = t2 || 'x2' where id= 3 returning before.i1, before.i2, after.i1, after.i2, before.t1, before.t2, after.t1, after.t2; i1 | i2 | i1 | i2 | t1| t2 | t1 | t2 <br /> ----+----+----+----+----+----+-----+-----<br /> 3 | 3 | 9 | 6 | c | c | cx3 | cx2<br />(1 row)<br /><br /> UPDATE 1<br /><br /><br /><br /> * Does the feature work as advertised?<br /><br /> Yes.<br /><br />* Are there corner cases the author has failed to consider?<br /><br /> I don't know.<br /><br /> * Are there any assertionfailures or crashes?<br /><br /> No.<br /><br /> * Does the patch slow down simple tests?<br /><br /> No.<br /><br/> * If it claims to improve performance, does it?<br /><br /> n/a<br /><br /> * Does it slow down other things? <br/><br /> No.<br /><br /> * Does it follow the project coding guidelines?<br /><br /> Mostly.<br /><br /> In the src/test/regress/parallel_schedulecontains an extra<br /> new line at the end, it shouldn't.<br /><br /> In b/src/backend/optimizer/plan/setrefs.c:<br/><br /> +static void bind_returning_variables(List *rlist, int bef, int aft);<br/><br /> but later it becomes not public:<br /><br /> + */<br /> +void bind_returning_variables(List *rlist, intbef, int aft)<br /> +{<br /><br /> Strange, but GCC 4.8.1 -Wall doesn't catch it. But the forward<br /> declaration isnot needed, the function is called only from<br /> later functions.<br /><br /> Similar for parse_clause.c:<br /><br />+extern void addAliases(ParseState *pstate);<br /> <br /> +void addAliases(ParseState *pstate)<br /><br /> This externaldeclaration is not needed since it is already<br /> in src/include/parser/parse_clause.h<br /><br /> In setrefs.c,bind_returning_variables() I would also rename<br /> the function arguments, so "before" and "after" are spelledout.<br /> These are not C keywords.<br /><br /> Some assignments, like:<br /><br /> + var=(Var*)tle;<br/> and<br /> + int index_rel=1;<br /><br /> in setrefs.c need spaces.<br /><br /> "if()" statementsneed a space before the "(" and not after.<br /><br /> Add spaces in the {} list in addAliases():<br /> + char *aliases[] = {"before","after"};<br /> like this: { "before", "after" }<br /><br /> Spaces are needed here,too:<br /> + for(i=0 ; i<noal; i++)<br /><br /> This "noal" should be "naliases" or "n_aliases" if you reallywant<br /> a variable. I would simply use the constant "2" for the two for()<br /> loops in addAliases() instead, itspurpose is obvious enough.<br /><br /> In setrefs.c, bind_returning_variables():<br /> + Var *var = NULL;<br />+ foreach(temp, rlist){<br /> Add an empty line after the declaration block.<br /><br /><br /> * Are there portabilityissues?<br /><br /> No.<br /><br /> * Will it work on Windows/BSD etc?<br /><br /> Yes.<br /><br /> * Are thecomments sufficient and accurate?<br /></blockquote><br /> There should be more comments, especially regarding<br /> myquestion at the end.<br /><br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"><br /><br /><br /> * Doesit do what it says, correctly?<br /><br /> Yes.<br /><br /> * Does it produce compiler warnings?<br /><br /> No.<br /><br/> * Can you make it crash?<br /><br /> No.<br /><br /> * Is everything done in a way that fits together coherentlywith other features/modules?<br /><br /> I think so, mostly. Comments below.<br /><br /> * Are there interdependenciesthat can cause problems?<br /><br /> I don't think so.<br /><br /> Other comments:<br /><br /> This #definein pg_class:<br /><br /> diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h<br /> index49c4f6f..1b09994 100644<br /> --- a/src/include/catalog/pg_class.h<br /> +++ b/src/include/catalog/pg_class.h<br />@@ -154,6 +154,7 @@ DESCR("");<br /> #define RELKIND_COMPOSITE_TYPE 'c' /* composite type*/<br /> #define RELKIND_FOREIGN_TABLE 'f' /* foreign table */<br /> #define RELKIND_MATVIEW 'm' /* materialized view */<br /> +#define RELKIND_BEFORE 'b' /* virtual table for before/after statements */<br/> <br /> #define RELPERSISTENCE_PERMANENT 'p' /* regular table */<br /> #define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */<br /><br /> The "RELKIND_*"values all show up in the pg_class table except<br /> this new one. I don't think pg_class.h should be modifiedat all.<br /> addAliases() should use RELKIND_RELATION together with<br /> RTE_BEFORE. Then checks like:<br /><br/> + if (rte->relkind == RELKIND_BEFORE)<br /> + continue;<br /><br /> shouldbecome<br /><br /> + if (rte->relkind == RELKIND_RELATION && rte->rtekind == RTE_BEFORE)<br/> + continue;<br /></blockquote><br /> Thinking about it more,<br /> if (rte->rtekind == RTE_BEFORE)<br /> would be enough, as no other kinds of rte's can have rtekind== RTE_BEFORE.<br /><br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"><br /> Speaking of which,RTE_BEFORE is more properly named<br /> RTE_RETURNING_ALIAS or something like that because it<br /> covers both "before"and "after". Someone may have a better<br /> idea for naming this symbol.<br /><br /> I feel like I understand whatthe code does and it looks sane to me.<br /><br /> One question, though, about this part:<br /><br /> ----------------------------------------<br/> @@ -1900,7 +1954,27 @@ set_returning_clause_references(PlannerInfo *root,<br/> int rtoffset)<br /> {<br /> indexed_tlist*itlist;<br /> + int after_index=0, before_index=0;<br /> + Query *parse = root->parse;<br/> <br /> + ListCell *rt;<br /> + RangeTblEntry *bef;<br /> +<br /> + int index_rel=1;<br/> +<br /> + foreach(rt,parse->rtable)<br /> + {<br /> + bef = (RangeTblEntry*)lfirst(rt);<br /> + if(strcmp(bef->eref->aliasname,"after") == 0 && bef->rtekind== RTE_BEFORE )<br /> + {<br /> + after_index = index_rel;<br /> + }<br /> + if(strcmp(bef->eref->aliasname,"before") == 0 && bef->rtekind ==RTE_BEFORE )<br /> + {<br /> + before_index = index_rel;<br /> + }<br/> + index_rel++;<br /> + }<br /> /*<br /> * We can perform the desired Var fixupby abusing the fix_join_expr<br /> * machinery that formerly handled inner indexscan fixup. We search the<br/> @@ -1924,6 +1998,7 @@ set_returning_clause_references(PlannerInfo *root,<br /> resultRelation,<br /> rtoffset);<br /> <br /> + bind_returning_variables(rlist, before_index,after_index);<br /> pfree(itlist);<br /> <br /> return rlist;<br /> ----------------------------------------<br/><br /> Why is it enough to keep the last before_index and after_index values?<br/> What if there are more than one matching RangeTblEntry for "before"<br /> and/or for "after"? Is it an errorcondition or of them should be fixed?<br /><br /> I think that's all for now.<br /><br /> Best regards,<br /> ZoltánBöszörményi<br /><br /><pre class="moz-signature" cols="90">-- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de" moz-do-not-send="true">http://www.postgresql-support.de</a> <a class="moz-txt-link-freetext" href="http://www.postgresql.at/"moz-do-not-send="true">http://www.postgresql.at/</a> </pre></blockquote><br /><br /><pre class="moz-signature" cols="90">-- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a> <aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a> </pre>
<div class="moz-cite-prefix">2013-08-21 20:00 keltezéssel, Boszormenyi Zoltan írta:<br /></div><blockquote cite="mid:52150052.3010104@cybertec.at"type="cite"><div class="moz-cite-prefix">2013-08-21 19:17 keltezéssel, BoszormenyiZoltan írta:<br /></div><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"><div class="moz-cite-prefix">Hi,<br/><br /> 2013-08-20 21:06 keltezéssel, Karol Trzcionka írta:<br /></div><blockquote cite="mid:5213BE1D.1000705@gmail.com"type="cite"><div class="moz-cite-prefix">W dniu 20.08.2013 20:55, Boszormenyi Zoltanpisze:<br /></div><blockquote cite="mid:5213BB9D.4020204@cybertec.at" type="cite"> Here's a new one, for v7:<br /><br/> setrefs.c: In function ‘set_plan_refs’:<br /> setrefs.c:2001:26: warning: ‘before_index’ may be used uninitializedin this function [-Wmaybe-uninitialized]<br /> bind_returning_variables(rlist, before_index, after_index);<br/> ^<br /> setrefs.c:1957:21: note: ‘before_index’ was declared here<br /> intafter_index=0, before_index;<br /> ^<br /></blockquote> Right, my mistake. Sorry and thanks. Fixed.<br/> Regards,<br /> Karol Trzcionka<br /></blockquote><br /> With this fixed, a more complete review:<br /><br />* Is the patch in a patch format which has context? (eg: context diff format)<br /><br /> Yes.<br /><br /> * Does it applycleanly to the current git master?<br /><br /> Yes.<br /><br /> * Does it include reasonable tests, necessary doc patches,etc? <br /><br /> There is a new regression test (returning_before_after.sql) covering<br /> this feature. However,I think it should be added to the group<br /> where "returning.sql" resides currently. There is a value in runningit<br /> in parallel to other tests. Sometimes a subtle bug is uncovered<br /> because of this and your v2 patch fixedsuch a bug already.<br /><br /> doc/src/sgml/ref/update.sgml describes this feature.<br /><br /> doc/src/sgml/dml.sgmlshould also be touched to cover this feature.<br /><br /> * Does the patch actually implement what it'ssupposed to do?<br /><br /> Yes.<br /><br /> * Do we want that?<br /><br /> Yes.<br /><br /> * Do we already have it?<br/><br /> No.<br /><br /> * Does it follow SQL spec, or the community-agreed behavior?<br /><br /> RETURNING is a PostgreSQLextension, so the SQL-spec part<br /> of the question isn't applicable.<br /><br /> It implements the community-agreedbehaviour, according to<br /> the new regression test coverage.<br /><br /> * Does it include pg_dump support(if applicable)?<br /><br /> n/a<br /><br /> * Are there dangers?<br /><br /> I don't think so.<br /><br /> * Haveall the bases been covered?<br /><br /> It seems so. I have also tried mixing before/after columns in<br /> differentorders and the query didn't fail:<br /><br /> zozo=# create table t1 (id serial primary key, i1 int4, i2 int4, t1text, t2 text);<br /> CREATE TABLE<br /> zozo=# insert into t1 (i1, i2, t1, t2) values (1, 1, 'a', 'a');<br /> INSERT 01<br /> zozo=# insert into t1 (i1, i2, t1, t2) values (2, 2, 'b', 'b');<br /> INSERT 0 1<br /> zozo=# insert into t1 (i1,i2, t1, t2) values (3, 3, 'c', 'c');<br /> INSERT 0 1<br /> zozo=# select * from t1;<br /> id | i1 | i2 | t1 | t2 <br/> ----+----+----+----+----<br /> 1 | 1 | 1 | a | a<br /> 2 | 2 | 2 | b | b<br /> 3 | 3 | 3 | c | c<br/> (3 rows)<br /><br /> zozo=# begin;<br /> BEGIN<br /> zozo=# update t1 set i2 = i2*2, t2 = t2 || 'x2' where id = 2returning before.i1, after.i1, before.i2, after.i2, before.t1, after.t1, before.t2, after.t2;<br /> i1 | i1 | i2 | i2 |t1 | t1 | t2 | t2 <br /> ----+----+----+----+----+----+----+-----<br /> 2 | 2 | 2 | 4 | b | b | b | bx2<br /> (1row)<br /><br /> UPDATE 1<br /> zozo=# update t1 set i1 = i1 * 3, i2 = i2*2, t1 = t1 || 'x3', t2 = t2 || 'x2' where id= 3 returning before.i1, before.i2, after.i1, after.i2, before.t1, before.t2, after.t1, after.t2; i1 | i2 | i1 | i2 | t1| t2 | t1 | t2 <br /> ----+----+----+----+----+----+-----+-----<br /> 3 | 3 | 9 | 6 | c | c | cx3 | cx2<br />(1 row)<br /><br /> UPDATE 1<br /><br /><br /><br /> * Does the feature work as advertised?<br /><br /> Yes.<br /><br />* Are there corner cases the author has failed to consider?<br /><br /> I don't know.<br /><br /> * Are there any assertionfailures or crashes?<br /><br /> No.<br /><br /> * Does the patch slow down simple tests?<br /><br /> No.<br /><br/> * If it claims to improve performance, does it?<br /><br /> n/a<br /><br /> * Does it slow down other things? <br/><br /> No.<br /><br /> * Does it follow the project coding guidelines?<br /><br /> Mostly.<br /><br /> In the src/test/regress/parallel_schedulecontains an extra<br /> new line at the end, it shouldn't.<br /><br /> In b/src/backend/optimizer/plan/setrefs.c:<br/><br /> +static void bind_returning_variables(List *rlist, int bef, int aft);<br/><br /> but later it becomes not public:<br /><br /> + */<br /> +void bind_returning_variables(List *rlist, intbef, int aft)<br /> +{<br /><br /> Strange, but GCC 4.8.1 -Wall doesn't catch it. But the forward<br /> declaration isnot needed, the function is called only from<br /> later functions.<br /><br /> Similar for parse_clause.c:<br /><br />+extern void addAliases(ParseState *pstate);<br /> <br /> +void addAliases(ParseState *pstate)<br /><br /> This externaldeclaration is not needed since it is already<br /> in src/include/parser/parse_clause.h<br /><br /> In setrefs.c,bind_returning_variables() I would also rename<br /> the function arguments, so "before" and "after" are spelledout.<br /> These are not C keywords.<br /><br /> Some assignments, like:<br /><br /> + var=(Var*)tle;<br/> and<br /> + int index_rel=1;<br /><br /> in setrefs.c need spaces.<br /><br /> "if()" statementsneed a space before the "(" and not after.<br /><br /> Add spaces in the {} list in addAliases():<br /> + char *aliases[] = {"before","after"};<br /> like this: { "before", "after" }<br /><br /> Spaces are needed here,too:<br /> + for(i=0 ; i<noal; i++)<br /><br /> This "noal" should be "naliases" or "n_aliases" if you reallywant<br /> a variable. I would simply use the constant "2" for the two for()<br /> loops in addAliases() instead, itspurpose is obvious enough.<br /><br /> In setrefs.c, bind_returning_variables():<br /> + Var *var = NULL;<br />+ foreach(temp, rlist){<br /> Add an empty line after the declaration block.<br /><br /><br /> * Are there portabilityissues?<br /><br /> No.<br /><br /> * Will it work on Windows/BSD etc?<br /><br /> Yes.<br /><br /> * Are thecomments sufficient and accurate?<br /></blockquote><br /> There should be more comments, especially regarding<br /> myquestion at the end.<br /><br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"><br /><br /><br /> * Doesit do what it says, correctly?<br /><br /> Yes.<br /><br /> * Does it produce compiler warnings?<br /><br /> No.<br /><br/> * Can you make it crash?<br /><br /> No.<br /><br /> * Is everything done in a way that fits together coherentlywith other features/modules?<br /><br /> I think so, mostly. Comments below.<br /><br /> * Are there interdependenciesthat can cause problems?<br /><br /> I don't think so.<br /><br /> Other comments:<br /><br /> This #definein pg_class:<br /><br /> diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h<br /> index49c4f6f..1b09994 100644<br /> --- a/src/include/catalog/pg_class.h<br /> +++ b/src/include/catalog/pg_class.h<br />@@ -154,6 +154,7 @@ DESCR("");<br /> #define RELKIND_COMPOSITE_TYPE 'c' /* composite type*/<br /> #define RELKIND_FOREIGN_TABLE 'f' /* foreign table */<br /> #define RELKIND_MATVIEW 'm' /* materialized view */<br /> +#define RELKIND_BEFORE 'b' /* virtual table for before/after statements */<br/> <br /> #define RELPERSISTENCE_PERMANENT 'p' /* regular table */<br /> #define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */<br /><br /> The "RELKIND_*"values all show up in the pg_class table except<br /> this new one. I don't think pg_class.h should be modifiedat all.<br /> addAliases() should use RELKIND_RELATION together with<br /> RTE_BEFORE. Then checks like:<br /><br/> + if (rte->relkind == RELKIND_BEFORE)<br /> + continue;<br /><br /> shouldbecome<br /><br /> + if (rte->relkind == RELKIND_RELATION && rte->rtekind == RTE_BEFORE)<br/> + continue;<br /></blockquote><br /> Thinking about it more,<br /> if (rte->rtekind == RTE_BEFORE)<br /> would be enough, as no other kinds of rte's can have rtekind== RTE_BEFORE.<br /><br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"><br /> Speaking of which,RTE_BEFORE is more properly named<br /> RTE_RETURNING_ALIAS or something like that because it<br /> covers both "before"and "after". Someone may have a better<br /> idea for naming this symbol.<br /><br /> I feel like I understand whatthe code does and it looks sane to me.<br /><br /> One question, though, about this part:<br /><br /> ----------------------------------------<br/> @@ -1900,7 +1954,27 @@ set_returning_clause_references(PlannerInfo *root,<br/> int rtoffset)<br /> {<br /> indexed_tlist*itlist;<br /> + int after_index=0, before_index=0;<br /> + Query *parse = root->parse;<br/> <br /> + ListCell *rt;<br /> + RangeTblEntry *bef;<br /> +<br /> + int index_rel=1;<br/> +<br /> + foreach(rt,parse->rtable)<br /> + {<br /> + bef = (RangeTblEntry*)lfirst(rt);<br /> + if(strcmp(bef->eref->aliasname,"after") == 0 && bef->rtekind== RTE_BEFORE )<br /> + {<br /> + after_index = index_rel;<br /> + }<br /> + if(strcmp(bef->eref->aliasname,"before") == 0 && bef->rtekind ==RTE_BEFORE )<br /> + {<br /> + before_index = index_rel;<br /> + }<br/> + index_rel++;<br /> + }<br /> /*<br /> * We can perform the desired Var fixupby abusing the fix_join_expr<br /> * machinery that formerly handled inner indexscan fixup. We search the<br/> @@ -1924,6 +1998,7 @@ set_returning_clause_references(PlannerInfo *root,<br /> resultRelation,<br /> rtoffset);<br /> <br /> + bind_returning_variables(rlist, before_index,after_index);<br /> pfree(itlist);<br /> <br /> return rlist;<br /> ----------------------------------------<br/><br /> Why is it enough to keep the last before_index and after_index values?<br/> What if there are more than one matching RangeTblEntry for "before"<br /> and/or for "after"? Is it an errorcondition or of them should be fixed?<br /></blockquote></blockquote><br /> Since addAliases() only adds a single onefor each and only for an<br /> UPDATE ... RETURNING query, it is okay. Also, because<br /> set_returning_clause_references()is called separately for each<br /> query with RETURNING. Anyway, a comment before the new<br/> foreach() loop in set_returning_clause_references() should explain<br /> the fact that only one of each ("before"and "after") can occur for<br /> such a query.<br /><br /> I have just tried this:<br /><br /> before as (updatet1 set i1 = i1 * 2, i2 = i2 * 3, t1 = t1 || 'x2', t2 = t2 || 'x3'<br /> where id = 1 returning before.i1,after.i1, before.i2, after.i2),<br /> after as (update t1 set i1 = i1 * 2, i2 = i2 * 3, t1 = t1 || 'x2', t2 = t2|| 'x3'<br /> where id = 2 returning before.i1, after.i1, before.i2, after.i2)<br /> select * from (select * frombefore union all select * from after) as x;<br /><br /> and it gave me the proper results, no crash.<br /><br /> AboutaddAliases():<br /> - it can be moved to parser/analyze.c so it can be static.<br /> - addReturningAliases() may bea better name for the function.<br /><br /><blockquote cite="mid:52150052.3010104@cybertec.at" type="cite"><blockquotecite="mid:5214F63F.5050608@cybertec.at" type="cite"><br /> I think that's all for now.<br /><br />Best regards,<br /> Zoltán Böszörményi<br /><br /><pre class="moz-signature" cols="90">-- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de" moz-do-not-send="true">http://www.postgresql-support.de</a> <a class="moz-txt-link-freetext" href="http://www.postgresql.at/"moz-do-not-send="true">http://www.postgresql.at/</a> </pre></blockquote><br /><br /><pre class="moz-signature" cols="90">-- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de" moz-do-not-send="true">http://www.postgresql-support.de</a> <a class="moz-txt-link-freetext" href="http://www.postgresql.at/"moz-do-not-send="true">http://www.postgresql.at/</a> </pre></blockquote><br /><br /><pre class="moz-signature" cols="90">-- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a> <aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a> </pre>
W dniu 21.08.2013 19:17, Boszormenyi Zoltan pisze:
With this fixed, a more complete review:Thanks.
There is a new regression test (returning_before_after.sql) coveringModified to work correct in parallel testing
this feature. However, I think it should be added to the group
where "returning.sql" resides currently. There is a value in running it
in parallel to other tests. Sometimes a subtle bug is uncovered
because of this and your v2 patch fixed such a bug already.
doc/src/sgml/ref/update.sgml describes this feature.I don't think so, there is not any info about returning feature, I think it shouldn't be part of my patch.
doc/src/sgml/dml.sgml should also be touched to cover this feature.
In the src/test/regress/parallel_schedule contains an extraFixed
new line at the end, it shouldn't.
I've change to static in the definition.
In b/src/backend/optimizer/plan/setrefs.c:
+static void bind_returning_variables(List *rlist, int bef, int aft);
but later it becomes not public:
+ */
+void bind_returning_variables(List *rlist, int bef, int aft)
+{
Removed.
+extern void addAliases(ParseState *pstate);
+void addAliases(ParseState *pstate)
This external declaration is not needed since it is already
in src/include/parser/parse_clause.h
Changed.
In setrefs.c, bind_returning_variables() I would also rename
the function arguments, so "before" and "after" are spelled out.
These are not C keywords.
Some assignments, like:Added some whitespaces.
+ var=(Var*)tle;
and
+ int index_rel=1;
in setrefs.c need spaces.
"if()" statements need a space before the "(" and not after.
Add spaces in the {} list in addAliases():
+ char *aliases[] = {"before","after"};
like this: { "before", "after" }
Spaces are needed here, too:
+ for(i=0 ; i<noal; i++)
This "noal" should be "naliases" or "n_aliases" if you really want
a variable. I would simply use the constant "2" for the two for()
loops in addAliases() instead, its purpose is obvious enough.
In setrefs.c, bind_returning_variables():Added.
+ Var *var = NULL;
+ foreach(temp, rlist){
Add an empty line after the declaration block.
Other comments:RELKIND_BEFORE removed - it probably left over previous work and/or I needed it because RTE_BEFORE wasn't available (not included?).
This #define in pg_class:
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 49c4f6f..1b09994 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -154,6 +154,7 @@ DESCR("");
#define RELKIND_COMPOSITE_TYPE 'c' /* composite type */
#define RELKIND_FOREIGN_TABLE 'f' /* foreign table */
#define RELKIND_MATVIEW 'm' /* materialized view */
+#define RELKIND_BEFORE 'b' /* virtual table for before/after statements */
#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */
Speaking of which, RTE_BEFORE is more properly namedRenamed to RTE_ALIAS - similar to addAliases (not addReturningAliases)
RTE_RETURNING_ALIAS or something like that because it
covers both "before" and "after". Someone may have a better
idea for naming this symbol.
One question, though, about this part:I think it is safe, it is the first and the last index. On each level of statement there can be (at most) the only one "before" and one "after" alias.
----------------------------------------
@@ -1900,7 +1954,27 @@ set_returning_clause_references(PlannerInfo *root,
int rtoffset)
{
indexed_tlist *itlist;
+ int after_index=0, before_index=0;
+ Query *parse = root->parse;
+ ListCell *rt;
+ RangeTblEntry *bef;
+
+ int index_rel=1;
+
+ foreach(rt,parse->rtable)
+ {
+ bef = (RangeTblEntry *)lfirst(rt);
+ if(strcmp(bef->eref->aliasname,"after") == 0 && bef->rtekind == RTE_BEFORE )
+ {
+ after_index = index_rel;
+ }
+ if(strcmp(bef->eref->aliasname,"before") == 0 && bef->rtekind == RTE_BEFORE )
+ {
+ before_index = index_rel;
+ }
+ index_rel++;
+ }
/*
* We can perform the desired Var fixup by abusing the fix_join_expr
* machinery that formerly handled inner indexscan fixup. We search the
@@ -1924,6 +1998,7 @@ set_returning_clause_references(PlannerInfo *root,
resultRelation,
rtoffset);
+ bind_returning_variables(rlist, before_index, after_index);
pfree(itlist);
return rlist;
----------------------------------------
Why is it enough to keep the last before_index and after_index values?
What if there are more than one matching RangeTblEntry for "before"
and/or for "after"? Is it an error condition or of them should be fixed?
Regards,
Karol Trzcionka
Вложения
<div class="moz-cite-prefix">Hi,<br /><br /> 2013-08-21 20:52 keltezéssel, Karol Trzcionka írta:<br /></div><blockquote cite="mid:52150C69.8030608@gmail.com"type="cite"><div class="moz-cite-prefix">W dniu 21.08.2013 19:17, Boszormenyi Zoltanpisze:<br /></div><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"> With this fixed, a more completereview:<br /></blockquote> Thanks.<br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"> There isa new regression test (returning_before_after.sql) covering<br /> this feature. However, I think it should be added tothe group<br /> where "returning.sql" resides currently. There is a value in running it<br /> in parallel to other tests.Sometimes a subtle bug is uncovered<br /> because of this and your v2 patch fixed such a bug already.<br /></blockquote>Modified to work correct in parallel testing<br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite">doc/src/sgml/ref/update.sgml describes this feature.<br /><br /> doc/src/sgml/dml.sgml should also be touchedto cover this feature.<br /></blockquote> I don't think so, there is not any info about returning feature, I thinkit shouldn't be part of my patch.<br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"> In the src/test/regress/parallel_schedulecontains an extra<br /> new line at the end, it shouldn't.<br /></blockquote> Fixed<br/><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"><br /> In b/src/backend/optimizer/plan/setrefs.c:<br/><br /> +static void bind_returning_variables(List *rlist, int bef, int aft);<br/><br /> but later it becomes not public:<br /><br /> + */<br /> +void bind_returning_variables(List *rlist, intbef, int aft)<br /> +{<br /></blockquote> I've change to static in the definition.<br /><blockquote cite="mid:5214F63F.5050608@cybertec.at"type="cite"><br /> +extern void addAliases(ParseState *pstate);<br /> <br /> +voidaddAliases(ParseState *pstate)<br /><br /> This external declaration is not needed since it is already<br /> in src/include/parser/parse_clause.h<br/></blockquote> Removed.<br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"><br/> In setrefs.c, bind_returning_variables() I would also rename<br /> the function arguments, so "before"and "after" are spelled out.<br /> These are not C keywords.<br /></blockquote> Changed.<br /><blockquote cite="mid:5214F63F.5050608@cybertec.at"type="cite"> Some assignments, like:<br /><br /> + var=(Var*)tle;<br/> and<br /> + int index_rel=1;<br /><br /> in setrefs.c need spaces.<br /><br /> "if()" statementsneed a space before the "(" and not after.<br /><br /> Add spaces in the {} list in addAliases():<br /> + char *aliases[] = {"before","after"};<br /> like this: { "before", "after" }<br /><br /> Spaces are needed here,too:<br /> + for(i=0 ; i<noal; i++)<br /><br /> This "noal" should be "naliases" or "n_aliases" if you reallywant<br /> a variable. I would simply use the constant "2" for the two for()<br /> loops in addAliases() instead, itspurpose is obvious enough.<br /></blockquote> Added some whitespaces.<br /><blockquote cite="mid:5214F63F.5050608@cybertec.at"type="cite"> In setrefs.c, bind_returning_variables():<br /> + Var *var = NULL;<br/> + foreach(temp, rlist){<br /> Add an empty line after the declaration block.<br /></blockquote> Added.<br/><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"> Other comments:<br /><br /> This #define in pg_class:<br/><br /> diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h<br /> index 49c4f6f..1b09994100644<br /> --- a/src/include/catalog/pg_class.h<br /> +++ b/src/include/catalog/pg_class.h<br /> @@ -154,6+154,7 @@ DESCR("");<br /> #define RELKIND_COMPOSITE_TYPE 'c' /* composite type */<br/> #define RELKIND_FOREIGN_TABLE 'f' /* foreign table */<br /> #define RELKIND_MATVIEW 'm' /* materialized view */<br /> +#define RELKIND_BEFORE 'b' /* virtual table for before/after statements */<br/> <br /> #define RELPERSISTENCE_PERMANENT 'p' /* regular table */<br /> #define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent table */<br /><br /></blockquote>RELKIND_BEFORE removed - it probably left over previous work and/or I needed it because RTE_BEFORE wasn'tavailable (not included?).<br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"> Speaking of which,RTE_BEFORE is more properly named<br /> RTE_RETURNING_ALIAS or something like that because it<br /> covers both "before"and "after". Someone may have a better<br /> idea for naming this symbol.<br /></blockquote> Renamed to RTE_ALIAS- similar to addAliases (not addReturningAliases)<br /></blockquote><br /> Others may have also a word in this topic,but consider that<br /> this is *not* a regular alias and for those, RTE_ALIAS is not used,<br /> like in<br /><br/> UPDATE table AS alias SET ...<br /><br /> Maybe RTE_RETURNING_ALIAS takes a little more typing, but<br /> itbecomes obvious when reading and new code won't confuse<br /> it with regular aliases.<br /><br /><blockquote cite="mid:52150C69.8030608@gmail.com"type="cite"><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"> One question,though, about this part:<br /><br /> ----------------------------------------<br /> @@ -1900,7 +1954,27 @@ set_returning_clause_references(PlannerInfo*root,<br /> intrtoffset)<br /> {<br /> indexed_tlist *itlist;<br /> + int after_index=0, before_index=0;<br /> + Query *parse = root->parse;<br /> <br /> + ListCell *rt;<br /> + RangeTblEntry *bef;<br />+<br /> + int index_rel=1;<br /> +<br /> + foreach(rt,parse->rtable)<br /> + {<br /> + bef = (RangeTblEntry *)lfirst(rt);<br /> + if(strcmp(bef->eref->aliasname,"after") ==0 && bef->rtekind == RTE_BEFORE )<br /> + {<br /> + after_index = index_rel;<br/> + }<br /> + if(strcmp(bef->eref->aliasname,"before") == 0 && bef->rtekind== RTE_BEFORE )<br /> + {<br /> + before_index = index_rel;<br /> + }<br /> + index_rel++;<br /> + }<br /> /*<br /> * We can perform thedesired Var fixup by abusing the fix_join_expr<br /> * machinery that formerly handled inner indexscan fixup. We search the<br /> @@ -1924,6 +1998,7 @@ set_returning_clause_references(PlannerInfo *root,<br /> resultRelation,<br /> rtoffset);<br /> <br /> + bind_returning_variables(rlist, before_index,after_index);<br /> pfree(itlist);<br /> <br /> return rlist;<br /> ----------------------------------------<br/><br /> Why is it enough to keep the last before_index and after_index values?<br/> What if there are more than one matching RangeTblEntry for "before"<br /> and/or for "after"? Is it an errorcondition or of them should be fixed?<br /></blockquote> I think it is safe, it is the first and the last index. Oneach level of statement there can be (at most) the only one "before" and one "after" alias.<br /></blockquote><br /> Ideduced it in the meantime. I still think it worth a comment<br /> in setrefs.c.<br /><br /> I think your v9 patch can belooked at by a more seasoned reviewer<br /> or a committer.<br /><br /> Best regards,<br /> Zoltán Böszörményi<br /><br/><blockquote cite="mid:52150C69.8030608@gmail.com" type="cite"> Regards,<br /> Karol Trzcionka<br /><br /><fieldsetclass="mimeAttachmentHeader"></fieldset><br /><pre wrap=""> </pre></blockquote><br /><br /><pre class="moz-signature" cols="90">-- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a> <aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a> </pre>
I took a look at this patch today, and I'm pretty skeptical about whether it's on the right track. It adds a new kind of RTE called RTE_ALIAS, which doesn't seem particularly descriptive and alias is used elsewhere to mean something fairly different. More generally, I'm not convinced that adding a new type of RTE is the right way to handle this. The changes in pull_up_subqueries_recurse, pullup_replace_vars_callback, preprocess_targetlist, and build_joinrel_tlist seem like weird hacks. Those functions aren't directly related to this feature; why do they need to know about it? I wonder if we shouldn't be trying to handle resolution of these names at an earlier processing stage, closer to the processor. I notice that set_returning_clause_references() seems to have already solved the hard part of this problem, which is frobbing target list entries to return values from the new row rather than, as they naturally would, the old row. In fact, we can already get approximately the desired effect already: rhaas=# update foo as after set a = before.a + 1 from foo as before where before.a = after.a returning before.a, after.a;a | a ---+---1 | 2 (1 row) Now this is a hack, because we don't really want to add an extra scan/join just to get the behavior we want. But it seems to me significant that this processing makes Vars that refer to the target table refer to the new values, and if we got rid of it, they'd refer to the old values. Can't we contrive to make AFTER.x parse into the same Var node that x currently does? Then we don't need an RTE for it. And maybe BEFORE.x ought to parse to the same node that just plain x does but with some marking, or some other node wrapped around it (like a TargetEntry with some flag set?) that suppresses this processing. I'm just shooting from the hip here; that might be wrong in detail, or even in broad strokes, but it just looks to me like the additional RTE kind is going to bleed into a lot of places. This patch also has significant style issues. Conforming to PostgreSQL coding style is essential; if neither the author nor the reviewer fixes problems in this area, then that is essentially making it the committer's job, and the committer may not feel like taking time to do that. Here's a selection of issues that I noticed while reading this through: we use spaces around operators; the patch adds two blank lines that shouldn't be there to the middle of the variable declarations section; variables should be declared in the innermost possible scope; single-statement blocks shouldn't have curly braces; there shouldn't be whitespace before a closing parenthesis; there should be a space after if and before the subsequent parenthesis; braces should be uncuddled; code that does non-obvious things isn't commented. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
W dniu 04.10.2013 00:28, Robert Haas pisze: > I wonder if we shouldn't be trying to handle resolution of these names > at an earlier processing stage, closer to the processor. Maybe it can be done in parser (in flex?) but at now it seems to be more isolated feature. > In fact, we can already get approximately the > desired effect already: > > rhaas=# update foo as after set a = before.a + 1 from foo as before > where before.a = after.a returning before.a, after.a; > a | a > ---+--- > 1 | 2 > (1 row) Compare EXPLAIN ANALYZE VERBOSE on your statement and on "patched" workflow. I can see significant difference. And your "after" returns the value after whole the work (after trigger fired) as I know (I don't know if it is needed or not, I only point at the difference). > Now this is a hack, because we don't really want to add an extra > scan/join just to get the behavior we want. But it seems to me > significant that this processing makes Vars that refer to the target > table refer to the new values, and if we got rid of it, they'd refer > to the old values. Can't we contrive to make AFTER.x parse into the > same Var node that x currently does? Then we don't need an RTE for > it. And maybe BEFORE.x ought to parse to the same node that just > plain x does but with some marking, or some other node wrapped around > it (like a TargetEntry with some flag set?) that suppresses this > processing. I'm just shooting from the hip here; that might be wrong > in detail, or even in broad strokes, but it just looks to me like the > additional RTE kind is going to bleed into a lot of places. While planning/analyzing the problem there were many ideas about hot to solve it. I was trying to avoid adding new RTE and referencing to "core" table. However it makes more and more issues. You can see some PoC on the https://github.com/davidfetter/postgresql_projects/compare/returning_before_after (other ideas I revert w/o commit because I couldn't get expected result). The other major reason was that we can avoid touching executor and/or parser's core (flex) this way. One observation: why shouldn't we use the values computed at the moment (it would be computed again if we want to do it later, in executor)? I think we can do it by modify the Var structure (add some kind of flag while generating the vars in parser?) but I'm not sure if it is good idea. The major issue is to know if the Var/TargetEntry references to the real alias "BEFORE" (named with "AS" syntax or even the real table-name - I can see there is no difference in code) or the virtual (from feature patch) "BEFORE". Doing it in parser (more "low-level") would be very awful - we'd need to check in which part of statement BEFORE/AFTER is placed (it is not allowed to use it in the other places than in "RETURNING"). We don't want to make "BEFORE" and "AFTER" restricted keywords. Now most of the code means "don't touch these because they are not real" :) If anyone has the fresh idea to it better, please write it by mail, I don't have more ideas how to solve it. > This patch also has significant style issues. I'll try to fix it soon. Regards, Karol Trzcionka
On Thu, Oct 3, 2013 at 7:54 PM, Karol Trzcionka <karlikt@gmail.com> wrote: > Compare EXPLAIN ANALYZE VERBOSE on your statement and on "patched" > workflow. I can see significant difference. And your "after" returns the > value after whole the work (after trigger fired) as I know (I don't know > if it is needed or not, I only point at the difference). Sure, I'm not saying we should implement it that way. I'm just pointing out that the ability already exists, at the executor level, to return either tuple. So I think the executor itself shouldn't need to be changed; it's just a matter of getting the correct plan tree to pop out. > While planning/analyzing the problem there were many ideas about hot to > solve it. Do you have a link to previous discussion on the mailing list? > I think we can do it by modify the Var structure (add some kind of flag > while generating the vars in parser?) but I'm not sure if it is good > idea. The major issue is to know if the Var/TargetEntry references to > the real alias "BEFORE" (named with "AS" syntax or even the real > table-name - I can see there is no difference in code) or the virtual > (from feature patch) "BEFORE". Doing it in parser (more "low-level") > would be very awful - we'd need to check in which part of statement > BEFORE/AFTER is placed (it is not allowed to use it in the other places > than in "RETURNING"). We don't want to make "BEFORE" and "AFTER" > restricted keywords. You're right, it can't happen actually in the parser. But maybe it can happen during parse analysis. I'd spend some time looking at transformColumnRef(), because that's where we translate things x.y into Var nodes. I'm not positive there's enough information available at that stage, but if p_target_rangetblentry is populated at that point, you should be able to make AFTER.x translate to a Var referencing that range table entry. It's a bit less clear how we know that we're inside the returning-list at that point; I'm not sure how much work it would be to pass that information down. But I think it's worth looking at. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
W dniu 04.10.2013 02:51, Robert Haas pisze: > Do you have a link to previous discussion on the mailing list? Sorry, most of discussion was at IRC channel. > I'm not positive there's enough information available > at that stage, but if p_target_rangetblentry is populated at that > point, you should be able to make AFTER.x translate to a Var > referencing that range table entry. It's not enough. Even if we know "where we are", there are more issues. The main question is: how should we pass information about "hello, I'm specific Var, don't evaluate me like others"? We can add two fields to Var structure (flag - normal/before/after and no. column) - however it needs to modify copyObject for Vars (at now it's done e.g. in flatten_join_alias_vars_mutator for varoattno and varnoold). If copyObject is modified, sure code in flatten_join_alias_vars_mutator/pullup_replace_vars_callback will be useless. I don't know if modifying pg at the low-level (changing structure of Var and behaviour of copyObject) is good idea. Yes if the community really want it but it needs more "votes". There is "medium" solution: changing Var structure and do the "copy" like now (in mutator and callback) but w/o the condition statement (for the new fields). I think it might need to modify more places in code because of "comparing" vars (maybe we'd need to include new fields while comparision). Regards, Karol Trzcionka
On 10/4/13 12:28 AM, Robert Haas wrote: > In fact, we can already get approximately the > desired effect already: > > rhaas=# update foo as after set a = before.a + 1 from foo as before > where before.a = after.a returning before.a, after.a; > a | a > ---+--- > 1 | 2 > (1 row) > > Now this is a hack, because we don't really want to add an extra > scan/join just to get the behavior we want. But it seems to me > significant that this processing makes Vars that refer to the target > table refer to the new values, and if we got rid of it, they'd refer > to the old values. Can't we contrive to make AFTER.x parse into the > same Var node that x currently does? Then we don't need an RTE for > it. This part sounds reasonable from here. > And maybe BEFORE.x ought to parse to the same node that just > plain x does but with some marking, or some other node wrapped around > it (like a TargetEntry with some flag set?) that suppresses this > processing. I'm just shooting from the hip here; that might be wrong > in detail, or even in broad strokes, but it just looks to me like the > additional RTE kind is going to bleed into a lot of places. I might be completely in the woods here, but I believe something like this was attempted by Karol earlier, and it failed if two concurrent transactions did something similar to: UPDATE foo SET a = a + 1 RETURNING BEFORE.a; Both of them would see BEFORE.a = 0, because that's what the "a" evaluated to from the tuple we got before EvalPlanQual. But maybe what you're suggesting doesn't have this problem? Regards, Marko Tiikkaja
On 2013-10-03 20:51:08 -0400, Robert Haas wrote: > On Thu, Oct 3, 2013 at 7:54 PM, Karol Trzcionka <karlikt@gmail.com> wrote: > > Compare EXPLAIN ANALYZE VERBOSE on your statement and on "patched" > > workflow. I can see significant difference. And your "after" returns the > > value after whole the work (after trigger fired) as I know (I don't know > > if it is needed or not, I only point at the difference). > > Sure, I'm not saying we should implement it that way. I'm just > pointing out that the ability already exists, at the executor level, > to return either tuple. So I think the executor itself shouldn't need > to be changed; it's just a matter of getting the correct plan tree to > pop out. Note what pullups ExecDelete is doing to return the old tuple though... So, based on precedent special executor support is not an unlikely thing to be required for a proper implemenation. As Marko mentions, any trivial implementation not doing playing dirty like that will refer to the wrong version of the tuple. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Oct 4, 2013 at 4:42 AM, Karol Trzcionka <karlikt@gmail.com> wrote: > W dniu 04.10.2013 02:51, Robert Haas pisze: >> Do you have a link to previous discussion on the mailing list? > Sorry, most of discussion was at IRC channel. >> I'm not positive there's enough information available >> at that stage, but if p_target_rangetblentry is populated at that >> point, you should be able to make AFTER.x translate to a Var >> referencing that range table entry. > It's not enough. Even if we know "where we are", there are more issues. > The main question is: how should we pass information about "hello, I'm > specific Var, don't evaluate me like others"? My point is that AFTER.x doesn't appear to need any special marking; it means the same thing as target_table.x. BEFORE.x *does* need some kind of special marking, and I admit I'm not sure what that should look like. Maybe an RTE is OK, but letting that RTE get into the join planning machinery does not seem good; that's going to result in funky special cases all over the place. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Oct 4, 2013 at 5:04 AM, Marko Tiikkaja <marko@joh.to> wrote: > I might be completely in the woods here, but I believe something like this > was attempted by Karol earlier, and it failed if two concurrent transactions > did something similar to: > > UPDATE foo SET a = a + 1 RETURNING BEFORE.a; > > Both of them would see BEFORE.a = 0, because that's what the "a" evaluated > to from the tuple we got before EvalPlanQual. > > But maybe what you're suggesting doesn't have this problem? Hmm, it probably does. That explains why there are executor changes here; I guess they need some comments to explain their purpose. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Oct 04, 2013 at 10:42:32AM +0200, Karol Trzcionka wrote: > W dniu 04.10.2013 02:51, Robert Haas pisze: > > Do you have a link to previous discussion on the mailing list? > Sorry, most of discussion was at IRC channel. > > I'm not positive there's enough information available > > at that stage, but if p_target_rangetblentry is populated at that > > point, you should be able to make AFTER.x translate to a Var > > referencing that range table entry. > It's not enough. Even if we know "where we are", there are more issues. > The main question is: how should we pass information about "hello, I'm > specific Var, don't evaluate me like others"? We can add two fields to > Var structure (flag - normal/before/after and no. column) - however it > needs to modify copyObject for Vars (at now it's done e.g. in > flatten_join_alias_vars_mutator for varoattno and varnoold). If > copyObject is modified, sure code in > flatten_join_alias_vars_mutator/pullup_replace_vars_callback will be > useless. I don't know if modifying pg at the low-level (changing > structure of Var and behaviour of copyObject) is good idea. Yes if the > community really want it but it needs more "votes". There is "medium" > solution: changing Var structure and do the "copy" like now (in mutator > and callback) but w/o the condition statement (for the new fields). I > think it might need to modify more places in code because of "comparing" > vars (maybe we'd need to include new fields while comparision). > Regards, > Karol Trzcionka Karol, Do you plan to continue this work for the current commitfest? A lot of people really want the feature :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Wed, Aug 21, 2013 at 08:52:25PM +0200, Karol Trzcionka wrote: > W dniu 21.08.2013 19:17, Boszormenyi Zoltan pisze: > > With this fixed, a more complete review: > Thanks. I've done some syntactic and white space cleanup, here attached. Karol, would you care to help with commenting the sections that need same? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Вложения
On Sun, Feb 02, 2014 at 02:52:42PM -0800, David Fetter wrote: > On Wed, Aug 21, 2013 at 08:52:25PM +0200, Karol Trzcionka wrote: > > W dniu 21.08.2013 19:17, Boszormenyi Zoltan pisze: > > > With this fixed, a more complete review: > > Thanks. > > I've done some syntactic and white space cleanup, here attached. > > Karol, would you care to help with commenting the sections that need > same? Karol, Thanks for the updates :) Other folks, Next version attached. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Вложения
Hi, Some comments about the patch: * Coding Style: * multiline comments have both /* and */ on their own lines. * I think several places indent by two tabs.* Spaces around operators * ... * Many of the new comments would enjoy a bit TLC by a native speaker. * The way RTE_ALIAS creeps in many place where it doesn't seem to belong seems to indicate that the design isn't yet ready.I share Robert's suspicion that this would be better solved by referring to a special range table entry. Based on the lack of activity around this and the fact that this needs a *significant* chunk of work before being committable, I am going to mark this as returned with feedback. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres <at> anarazel.de> writes: > > Hi, > > Some comments about the patch: > * Coding Style: > * multiline comments have both /* and */ on their own lines. > * I think several places indent by two tabs. > * Spaces around operators > * ... > * Many of the new comments would enjoy a bit TLC by a native speaker. > > * The way RTE_ALIAS creeps in many place where it doesn't seem to belong > seems to indicate that the design isn't yet ready. I share Robert's > suspicion that this would be better solved by referring to a special > range table entry. > > Based on the lack of activity around this and the fact that this needs a > *significant* chunk of work before being committable, I am going to mark > this as returned with feedback. I'm actively working towards converting our software at work to use Pg instead of SQL Server and before we switch we'll need this feature to be merged. I'll do what I can to get the verbage and style whipped into shape, though I doubt I can do much with the actual code. Hopefully I can get something in soon.