Обсуждение: BUG #10972: string_agg function incorrectly concatenating varying delimiter
BUG #10972: string_agg function incorrectly concatenating varying delimiter
От
jeff@goaldriven.com
Дата:
The following bug has been logged on the website: Bug reference: 10972 Logged by: Jeff Fischer Email address: jeff@goaldriven.com PostgreSQL version: 9.3.1 Operating system: CentOS Description: Running the query below will show how the delimiter for the current row is actually the subsequent rows delimiter. The sequence goes like this: TestBlah 1 ITTest 2 testfail 3 Yet, the concatened output is this: "testblah 2 ITtest 3 testfail" You can see that the delimiter for the current row is actually getting the subsequent rows delimiter. It's as though the string concat moves to the next row prior to concatenting the delimiter, which is likely the cause of this bug. SELECT string_agg( Field1, ' ' || RowIndex::text || ' ' ORDER BY RowIndex) as VeryLongConcatenatedResultsFieldName FROM ( SELECT 'testblah' as Field1, 1 as RowIndex UNION SELECT 'ITtest' as Field1, 2 as RowIndex UNION SELECT 'testfail' as Field1, 3 as RowIndex ) t;
jeff@goaldriven.com writes: > Running the query below will show how the delimiter for the current row is > actually the subsequent rows delimiter. Hmm, well, the documentation for string_agg doesn't say what happens when the "delimiter" argument varies across rows; but a quick look at the code finds that the first-call delimiter isn't actually used at all, and on subsequent calls the delimiter is appended to the running result before the associated value is. Which seems to me to be at least as reasonable, and certainly a great deal easier to implement, as what you seem to have in mind. Can you offer a principled argument why it should be the other way around? regards, tom lane
Re: BUG #10972: string_agg function incorrectly concatenating varying delimiter
От
Jeff Fischer
Дата:
Hi Tom, I just read the Wikipedia article on you, fun. Glad to make your acquainta= nce. My business partner, Dave Yarnall, went to Carnegie Mellon (CCed). I suppose principles can be relative, but I'll assume you mean good princip= les and give it a shot. Primarily, I'd consider whether another function uses non-deterministic row= s for its evaluation. I could be wrong, but I don't think any other functi= on uses two different rows results within a single function evaluation. Ev= en aggregates, such as string_agg, evaluate one row at a time which is a we= ll-known behavior. A similar paradigm might be in general programming if a compiled program ra= ndomly chose values off of the stack to place as a parameter into a method = call (function 3's parameters are passed into function2). An odd and unexp= ected behavior for the SQL language and really any language, I think. Alth= ough, it is quite creative. It sounds like you've quickly isolated the line within the source. In an i= nterest in learning more about the code, would you mind pointing my partner= and I to the line for this bug? Thanks, Jeff -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us]=20 Sent: Wednesday, July 16, 2014 7:20 AM To: Jeff Fischer Cc: pgsql-bugs@postgresql.org Subject: Re: [BUGS] BUG #10972: string_agg function incorrectly concatenati= ng varying delimiter jeff@goaldriven.com writes: > Running the query below will show how the delimiter for the current=20 > row is actually the subsequent rows delimiter. Hmm, well, the documentation for string_agg doesn't say what happens when t= he "delimiter" argument varies across rows; but a quick look at the code fi= nds that the first-call delimiter isn't actually used at all, and on subseq= uent calls the delimiter is appended to the running result before the assoc= iated value is. Which seems to me to be at least as reasonable, and certai= nly a great deal easier to implement, as what you seem to have in mind. Ca= n you offer a principled argument why it should be the other way around? regards, tom lane
Re: BUG #10972: string_agg function incorrectly concatenating varying delimiter
От
Pavel Stehule
Дата:
Hello 2014-07-21 22:34 GMT+02:00 Jeff Fischer <jeff@goaldriven.com>: > Hi Tom, > > I just read the Wikipedia article on you, fun. Glad to make your > acquaintance. My business partner, Dave Yarnall, went to Carnegie Mellon > (CCed). > > I suppose principles can be relative, but I'll assume you mean good > principles and give it a shot. > > Primarily, I'd consider whether another function uses non-deterministic > rows for its evaluation. I could be wrong, but I don't think any other > function uses two different rows results within a single function > evaluation. Even aggregates, such as string_agg, evaluate one row at a > time which is a well-known behavior. > > A similar paradigm might be in general programming if a compiled program > randomly chose values off of the stack to place as a parameter into a > method call (function 3's parameters are passed into function2). An odd > and unexpected behavior for the SQL language and really any language, I > think. Although, it is quite creative. > > It sounds like you've quickly isolated the line within the source. In an > interest in learning more about the code, would you mind pointing my > partner and I to the line for this bug? > https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/varlena.c search string_agg Regards Pavel > > Thanks, > Jeff > > -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: Wednesday, July 16, 2014 7:20 AM > To: Jeff Fischer > Cc: pgsql-bugs@postgresql.org > Subject: Re: [BUGS] BUG #10972: string_agg function incorrectly > concatenating varying delimiter > > jeff@goaldriven.com writes: > > Running the query below will show how the delimiter for the current > > row is actually the subsequent rows delimiter. > > Hmm, well, the documentation for string_agg doesn't say what happens when > the "delimiter" argument varies across rows; but a quick look at the code > finds that the first-call delimiter isn't actually used at all, and on > subsequent calls the delimiter is appended to the running result before the > associated value is. Which seems to me to be at least as reasonable, and > certainly a great deal easier to implement, as what you seem to have in > mind. Can you offer a principled argument why it should be the other way > around? > > regards, tom lane > > > -- > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-bugs >
Re: BUG #10972: string_agg function incorrectly concatenating varying delimiter
От
Jeff Fischer
Дата:
VGhhbmsgeW91IHZlcnkgbXVjaCBQYXZlbC4NCg0KVGhhbmtzLA0KSmVmZg0KDQpGcm9tOiBQYXZl bCBTdGVodWxlIFttYWlsdG86cGF2ZWwuc3RlaHVsZUBnbWFpbC5jb21dDQpTZW50OiBNb25kYXks IEp1bHkgMjEsIDIwMTQgMTA6NDIgUE0NClRvOiBKZWZmIEZpc2NoZXINCkNjOiBUb20gTGFuZTsg cGdzcWwtYnVnc0Bwb3N0Z3Jlc3FsLm9yZzsgRGF2ZSBZYXJuYWxsDQpTdWJqZWN0OiBSZTogW0JV R1NdIEJVRyAjMTA5NzI6IHN0cmluZ19hZ2cgZnVuY3Rpb24gaW5jb3JyZWN0bHkgY29uY2F0ZW5h dGluZyB2YXJ5aW5nIGRlbGltaXRlcg0KDQpIZWxsbw0KDQoyMDE0LTA3LTIxIDIyOjM0IEdNVCsw MjowMCBKZWZmIEZpc2NoZXIgPGplZmZAZ29hbGRyaXZlbi5jb208bWFpbHRvOmplZmZAZ29hbGRy aXZlbi5jb20+PjoNCkhpIFRvbSwNCg0KSSBqdXN0IHJlYWQgdGhlIFdpa2lwZWRpYSBhcnRpY2xl IG9uIHlvdSwgZnVuLiAgR2xhZCB0byBtYWtlIHlvdXIgYWNxdWFpbnRhbmNlLiAgTXkgYnVzaW5l c3MgcGFydG5lciwgRGF2ZSBZYXJuYWxsLCB3ZW50IHRvIENhcm5lZ2llIE1lbGxvbiAoQ0NlZCku DQoNCkkgc3VwcG9zZSBwcmluY2lwbGVzIGNhbiBiZSByZWxhdGl2ZSwgYnV0IEknbGwgYXNzdW1l IHlvdSBtZWFuIGdvb2QgcHJpbmNpcGxlcyBhbmQgZ2l2ZSBpdCBhIHNob3QuDQoNClByaW1hcmls eSwgSSdkIGNvbnNpZGVyIHdoZXRoZXIgYW5vdGhlciBmdW5jdGlvbiB1c2VzIG5vbi1kZXRlcm1p bmlzdGljIHJvd3MgZm9yIGl0cyBldmFsdWF0aW9uLiAgSSBjb3VsZCBiZSB3cm9uZywgYnV0IEkg ZG9uJ3QgdGhpbmsgYW55IG90aGVyIGZ1bmN0aW9uIHVzZXMgdHdvIGRpZmZlcmVudCByb3dzIHJl c3VsdHMgd2l0aGluIGEgc2luZ2xlIGZ1bmN0aW9uIGV2YWx1YXRpb24uICBFdmVuIGFnZ3JlZ2F0 ZXMsIHN1Y2ggYXMgc3RyaW5nX2FnZywgZXZhbHVhdGUgb25lIHJvdyBhdCBhIHRpbWUgd2hpY2gg aXMgYSB3ZWxsLWtub3duIGJlaGF2aW9yLg0KDQpBIHNpbWlsYXIgcGFyYWRpZ20gbWlnaHQgYmUg aW4gZ2VuZXJhbCBwcm9ncmFtbWluZyBpZiBhIGNvbXBpbGVkIHByb2dyYW0gcmFuZG9tbHkgY2hv c2UgdmFsdWVzIG9mZiBvZiB0aGUgc3RhY2sgdG8gcGxhY2UgYXMgYSBwYXJhbWV0ZXIgaW50byBh IG1ldGhvZCBjYWxsIChmdW5jdGlvbiAzJ3MgcGFyYW1ldGVycyBhcmUgcGFzc2VkIGludG8gZnVu Y3Rpb24yKS4gIEFuIG9kZCBhbmQgdW5leHBlY3RlZCBiZWhhdmlvciBmb3IgdGhlIFNRTCBsYW5n dWFnZSBhbmQgcmVhbGx5IGFueSBsYW5ndWFnZSwgSSB0aGluay4gIEFsdGhvdWdoLCBpdCBpcyBx dWl0ZSBjcmVhdGl2ZS4NCg0KSXQgc291bmRzIGxpa2UgeW91J3ZlIHF1aWNrbHkgaXNvbGF0ZWQg dGhlIGxpbmUgd2l0aGluIHRoZSBzb3VyY2UuICBJbiBhbiBpbnRlcmVzdCBpbiBsZWFybmluZyBt b3JlIGFib3V0IHRoZSBjb2RlLCB3b3VsZCB5b3UgbWluZCBwb2ludGluZyBteSBwYXJ0bmVyIGFu ZCBJIHRvIHRoZSBsaW5lIGZvciB0aGlzIGJ1Zz8NCg0KaHR0cHM6Ly9naXRodWIuY29tL3Bvc3Rn cmVzL3Bvc3RncmVzL2Jsb2IvbWFzdGVyL3NyYy9iYWNrZW5kL3V0aWxzL2FkdC92YXJsZW5hLmMN CnNlYXJjaCBzdHJpbmdfYWdnDQoNClJlZ2FyZHMNClBhdmVsDQoNCg0KVGhhbmtzLA0KSmVmZg0K DQotLS0tLU9yaWdpbmFsIE1lc3NhZ2UtLS0tLQ0KRnJvbTogVG9tIExhbmUgW21haWx0bzp0Z2xA c3NzLnBnaC5wYS51czxtYWlsdG86dGdsQHNzcy5wZ2gucGEudXM+XQ0KU2VudDogV2VkbmVzZGF5 LCBKdWx5IDE2LCAyMDE0IDc6MjAgQU0NClRvOiBKZWZmIEZpc2NoZXINCkNjOiBwZ3NxbC1idWdz QHBvc3RncmVzcWwub3JnPG1haWx0bzpwZ3NxbC1idWdzQHBvc3RncmVzcWwub3JnPg0KU3ViamVj dDogUmU6IFtCVUdTXSBCVUcgIzEwOTcyOiBzdHJpbmdfYWdnIGZ1bmN0aW9uIGluY29ycmVjdGx5 IGNvbmNhdGVuYXRpbmcgdmFyeWluZyBkZWxpbWl0ZXINCg0KamVmZkBnb2FsZHJpdmVuLmNvbTxt YWlsdG86amVmZkBnb2FsZHJpdmVuLmNvbT4gd3JpdGVzOg0KPiBSdW5uaW5nIHRoZSBxdWVyeSBi ZWxvdyB3aWxsIHNob3cgaG93IHRoZSBkZWxpbWl0ZXIgZm9yIHRoZSBjdXJyZW50DQo+IHJvdyBp cyBhY3R1YWxseSB0aGUgc3Vic2VxdWVudCByb3dzIGRlbGltaXRlci4NCg0KSG1tLCB3ZWxsLCB0 aGUgZG9jdW1lbnRhdGlvbiBmb3Igc3RyaW5nX2FnZyBkb2Vzbid0IHNheSB3aGF0IGhhcHBlbnMg d2hlbiB0aGUgImRlbGltaXRlciIgYXJndW1lbnQgdmFyaWVzIGFjcm9zcyByb3dzOyBidXQgYSBx dWljayBsb29rIGF0IHRoZSBjb2RlIGZpbmRzIHRoYXQgdGhlIGZpcnN0LWNhbGwgZGVsaW1pdGVy IGlzbid0IGFjdHVhbGx5IHVzZWQgYXQgYWxsLCBhbmQgb24gc3Vic2VxdWVudCBjYWxscyB0aGUg ZGVsaW1pdGVyIGlzIGFwcGVuZGVkIHRvIHRoZSBydW5uaW5nIHJlc3VsdCBiZWZvcmUgdGhlIGFz c29jaWF0ZWQgdmFsdWUgaXMuICBXaGljaCBzZWVtcyB0byBtZSB0byBiZSBhdCBsZWFzdCBhcyBy ZWFzb25hYmxlLCBhbmQgY2VydGFpbmx5IGEgZ3JlYXQgZGVhbCBlYXNpZXIgdG8gaW1wbGVtZW50 LCBhcyB3aGF0IHlvdSBzZWVtIHRvIGhhdmUgaW4gbWluZC4gIENhbiB5b3Ugb2ZmZXIgYSBwcmlu Y2lwbGVkIGFyZ3VtZW50IHdoeSBpdCBzaG91bGQgYmUgdGhlIG90aGVyIHdheSBhcm91bmQ/DQoN CiAgICAgICAgICAgICAgICAgICAgICAgIHJlZ2FyZHMsIHRvbSBsYW5lDQoNCg0KLS0NClNlbnQg dmlhIHBnc3FsLWJ1Z3MgbWFpbGluZyBsaXN0IChwZ3NxbC1idWdzQHBvc3RncmVzcWwub3JnPG1h aWx0bzpwZ3NxbC1idWdzQHBvc3RncmVzcWwub3JnPikNClRvIG1ha2UgY2hhbmdlcyB0byB5b3Vy IHN1YnNjcmlwdGlvbjoNCmh0dHA6Ly93d3cucG9zdGdyZXNxbC5vcmcvbWFpbHByZWYvcGdzcWwt YnVncw0KDQo=
Re: BUG #10972: string_agg function incorrectly concatenating varying delimiter
От
Marti Raudsepp
Дата:
On Wed, Jul 16, 2014 at 3:16 AM, <jeff@goaldriven.com> wrote: > Running the query below will show how the delimiter for the current row is > actually the subsequent rows delimiter. The sequence goes like this: > > TestBlah 1 > ITTest 2 > testfail 3 > > Yet, the concatened output is this: > > "testblah 2 ITtest 3 testfail" I think you're misunderstanding the point of the 2nd argument. It's not there for just concatenating two arguments together, it's there to be a delimiter between the strings being concatenated. Almost always it should be a constant. For example if you want a result like "TestBlah,ITTest,testfail", that would be much uglier to do without the delimiter argument. The behavior you want is available as simply: string_agg(Field1 || ' ' || RowIndex::text, ' ') Regards, Marti