Re: Inexplicable duplicate rows with unique constraint

Поиск

Список

Период

Сортировка

От	Susan Hurst
Тема	Re: Inexplicable duplicate rows with unique constraint
Дата	16 января 2020 г. 18:12:15
Msg-id	b5ba19e12fd1a4c13bd3f79dbbc31768@mail.brookhurstdata.net обсуждение исходный текст
Ответ на	Re: Inexplicable duplicate rows with unique constraint (Tom Lane <tgl@sss.pgh.pa.us>)
Список	pgsql-general

Дерево обсуждения

That's why I created a virtual_string function to squeeze out everything 
but alpha characters and numbers 0-9 from any varchar or text columns 
that I want to use as business key columns.  For example, if I have a 
column named job_name, I will have a companion column named v_job_name.  
The v_ column is to replicate Oracle's virtual column, since postgres 
doesn't have it.  You don't put any values in the v_ column directly. I 
simply have a trigger on insert or update to put the value in the 
v_job_name column using the virtual_string(new.job_name) function.  It's 
the v_job_name column that use in my unique constraint so that I avoid 
any unexpected sorting.  Meanwhile, my job_name column is still human 
readable with whatever characters I want to see, including diacritics.

Here is my function, if you want to try it out:

create or replace function store.virtual_string(string_in text)
returns text as
$body$
    declare
        l_return text;
    begin
        l_return := regexp_replace 
(lower(unaccent(string_in)),'[^0-9a-z]','','g');
        return l_return;
    end;
$body$
   language plpgsql volatile security definer
;

Sue

---
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Susan E Hurst
Principal Consultant
Brookhurst Data LLC
Email: susan.hurst@brookhurstdata.com
Mobile: 314-486-3261

On 2020-01-16 11:48, Tom Lane wrote:
> Richard van der Hoff <richard@matrix.org> writes:
>> On 16/01/2020 17:12, Magnus Hagander wrote:
>>> See https://wiki.postgresql.org/wiki/Locale_data_changes for hints on
>>> which linux distros updated when.
> 
>> It seems like a plausible explanation but it's worth noting that all 
>> the
>> indexed data here is (despite being in text columns), plain ascii. I'm
>> surprised that a change in collation rules would change the sorting of
>> such strings, and hence that it could lead to this problem. Am I 
>> naive?
> 
> Unfortunately, strings containing punctuation do sort differently
> after these changes, even with all-ASCII data.  The example given
> on that wiki page demonstrates this.
> 
> RHEL6 (old glibc):
> 
> $ ( echo "1-1"; echo "11" ) | LC_COLLATE=en_US.utf8 sort
> 11
> 1-1
> 
> Fedora 30 (new glibc):
> 
> $ ( echo "1-1"; echo "11" ) | LC_COLLATE=en_US.utf8 sort
> 1-1
> 11
> 
> I concur with Daniel's suggestion that maybe "C" locale is
> the thing to use for this data.
> 
>             regards, tom lane

В списке pgsql-general по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Inexplicable duplicate rows with unique constraint