Re: MD5 aggregate
От | Dean Rasheed |
---|---|
Тема | Re: MD5 aggregate |
Дата | |
Msg-id | CAEZATCWZcij6dpv=q9KjxqpPtdh9pbwDebqd=u5YboAMjZyrDA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: MD5 aggregate (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: MD5 aggregate
|
Список | pgsql-hackers |
On 14 June 2013 14:14, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Marko Kreen <markokr@gmail.com> writes: >> On Thu, Jun 13, 2013 at 12:35 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >>> Attached is a patch implementing a new aggregate function md5_agg() to >>> compute the aggregate MD5 sum across a number of rows. > >> It's more efficient to calculate per-row md5, and then sum() them. >> This avoids the need for ORDER BY. > > Good point. The aggregate md5 function also fails to distinguish the > case where we have 'xyzzy' followed by 'xyz' in two adjacent rows > from the case where they contain 'xyz' followed by 'zyxyz'. > Well, if you aggregated foo.*::text as in my original example, then the textual representation of the row would protect you from that. But yes, if you were just doing it with a single text column that might be a risk. > Now, as against that, you lose any sensitivity to the ordering of the > values. > > Personally I'd be a bit inclined to xor the per-row md5's rather than > sum them, but that's a small matter. > But this would be a much riskier thing to do with a single column, because if you updated multiple rows in the same way (e.g., UPDATE t SET x='foo' WHERE x='bar') then xor'ing the md5's would cancel out if there were an even number of matches. Regards, Dean
В списке pgsql-hackers по дате отправления: