Re: Inexplicable duplicate rows with unique constraint
От | Susan Hurst |
---|---|
Тема | Re: Inexplicable duplicate rows with unique constraint |
Дата | |
Msg-id | b5ba19e12fd1a4c13bd3f79dbbc31768@mail.brookhurstdata.net обсуждение исходный текст |
Ответ на | Re: Inexplicable duplicate rows with unique constraint (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-general |
That's why I created a virtual_string function to squeeze out everything but alpha characters and numbers 0-9 from any varchar or text columns that I want to use as business key columns. For example, if I have a column named job_name, I will have a companion column named v_job_name. The v_ column is to replicate Oracle's virtual column, since postgres doesn't have it. You don't put any values in the v_ column directly. I simply have a trigger on insert or update to put the value in the v_job_name column using the virtual_string(new.job_name) function. It's the v_job_name column that use in my unique constraint so that I avoid any unexpected sorting. Meanwhile, my job_name column is still human readable with whatever characters I want to see, including diacritics. Here is my function, if you want to try it out: create or replace function store.virtual_string(string_in text) returns text as $body$ declare l_return text; begin l_return := regexp_replace (lower(unaccent(string_in)),'[^0-9a-z]','','g'); return l_return; end; $body$ language plpgsql volatile security definer ; Sue --- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Susan E Hurst Principal Consultant Brookhurst Data LLC Email: susan.hurst@brookhurstdata.com Mobile: 314-486-3261 On 2020-01-16 11:48, Tom Lane wrote: > Richard van der Hoff <richard@matrix.org> writes: >> On 16/01/2020 17:12, Magnus Hagander wrote: >>> See https://wiki.postgresql.org/wiki/Locale_data_changes for hints on >>> which linux distros updated when. > >> It seems like a plausible explanation but it's worth noting that all >> the >> indexed data here is (despite being in text columns), plain ascii. I'm >> surprised that a change in collation rules would change the sorting of >> such strings, and hence that it could lead to this problem. Am I >> naive? > > Unfortunately, strings containing punctuation do sort differently > after these changes, even with all-ASCII data. The example given > on that wiki page demonstrates this. > > RHEL6 (old glibc): > > $ ( echo "1-1"; echo "11" ) | LC_COLLATE=en_US.utf8 sort > 11 > 1-1 > > Fedora 30 (new glibc): > > $ ( echo "1-1"; echo "11" ) | LC_COLLATE=en_US.utf8 sort > 1-1 > 11 > > I concur with Daniel's suggestion that maybe "C" locale is > the thing to use for this data. > > regards, tom lane
В списке pgsql-general по дате отправления: