Обсуждение: Issues with LC_COLLATE, across 8.3/8.4 on various platforms

Поиск
Список
Период
Сортировка

Issues with LC_COLLATE, across 8.3/8.4 on various platforms

От
Achilleas Mantzios
Дата:
Hi, i have the following problem, text ordering seems to behave incosistently across various lc_collate values, OS'es,
PostgreSQLversions. 
Some behaviour might be expected, some not, thats why i am asking to see where i stand with this.
Test Data
postgres@dynacom=# SELECT * from test_sort_order;
         fooname
-------------------------
 Cylinder head cover No1
 Cylinder Liner No1
 Cylinder head No1
(3 rows)

Now the query
# SELECT * from test_sort_order order by fooname;
in PostgreSQL 8.3.7 on x86_64-unknown-linux-gnu  (lc_collate=en_US.UTF-8) gives
    fooname
-------------------------
 Cylinder head cover No1
 Cylinder head No1
 Cylinder Liner No1

while in all of
PostgreSQL 8.3.3 on i686-pc-linux-gnu (lc_collate=C)
PostgreSQL 8.3.7 (lc_collate=el_GR.UTF-8) on i386-unknown-freebsd6.3
PostgreSQL 8.4.1 on x86_64-unknown-freebsd8.0 (lc_collate=en_US.UTF-8), gives
    fooname
-------------------------
 Cylinder Liner No1
 Cylinder head No1
 Cylinder head cover No1

Database encoding is SQL_ASCII in all four cases.

--
Achilleas Mantzios

Re: Issues with LC_COLLATE, across 8.3/8.4 on various platforms

От
Achilleas Mantzios
Дата:
It seems that libc's behaviour regarding LC_COLLATE in Linux/FreeBSD are two different stories, hence the seen results.
(Switching LC_COLLATE between en_US.UTF-8 and POSIX, in FreeBSD it does not have any impact on orderring while in Linux
does)
I think i have resolved the issue, so the solution boils down to setting LC_COLLATE=C in our linux central production
postgresqlserver. 
What is the shortest way to change the default locale on a production installation running PostgreSQL-8.3.9?
Is there anything less painful than dump, initdb,restore?
I know 8.4 allows for per-db locale settings, i am just asking if there is a way to solve the issue now without waiting
till
the migration to 8.4

Thanx!

Στις Tuesday 02 February 2010 14:26:44 ο/η Achilleas Mantzios έγραψε:
> Hi, i have the following problem, text ordering seems to behave incosistently across various lc_collate values,
OS'es,PostgreSQL versions. 
> Some behaviour might be expected, some not, thats why i am asking to see where i stand with this.
> Test Data
> postgres@dynacom=# SELECT * from test_sort_order;
>          fooname
> -------------------------
>  Cylinder head cover No1
>  Cylinder Liner No1
>  Cylinder head No1
> (3 rows)
>
> Now the query
> # SELECT * from test_sort_order order by fooname;
> in PostgreSQL 8.3.7 on x86_64-unknown-linux-gnu  (lc_collate=en_US.UTF-8) gives
>     fooname
> -------------------------
>  Cylinder head cover No1
>  Cylinder head No1
>  Cylinder Liner No1
>
> while in all of
> PostgreSQL 8.3.3 on i686-pc-linux-gnu (lc_collate=C)
> PostgreSQL 8.3.7 (lc_collate=el_GR.UTF-8) on i386-unknown-freebsd6.3
> PostgreSQL 8.4.1 on x86_64-unknown-freebsd8.0 (lc_collate=en_US.UTF-8), gives
>     fooname
> -------------------------
>  Cylinder Liner No1
>  Cylinder head No1
>  Cylinder head cover No1
>
> Database encoding is SQL_ASCII in all four cases.
>
> --
> Achilleas Mantzios
>



--
Achilleas Mantzios

Re: Issues with LC_COLLATE, across 8.3/8.4 on various platforms

От
Tom Lane
Дата:
Achilleas Mantzios <achill@matrix.gatewaynet.com> writes:
> What is the shortest way to change the default locale on a production installation running PostgreSQL-8.3.9?
> Is there anything less painful than dump, initdb,restore?

No :-(

            regards, tom lane

Re: Issues with LC_COLLATE, across 8.3/8.4 on various platforms

От
Achilleas Mantzios
Дата:
Στις Tuesday 02 February 2010 17:37:05 ο/η Tom Lane έγραψε:
> Achilleas Mantzios <achill@matrix.gatewaynet.com> writes:
> > What is the shortest way to change the default locale on a production installation running PostgreSQL-8.3.9?
> > Is there anything less painful than dump, initdb,restore?
>
> No :-(
>

Thanx Tom,
taking into account that the large and most important DB in our installation (over 100Gb as SQL dump) lives in a
tablespaceof its own, 
is there a way to "cheat" postgresql after initdb in order to see this tablespace and the database in it, avoiding the
wholecycle? 
i think this is not a good idea, i just thought i could ask, just to make sure.

>             regards, tom lane
>



--
Achilleas Mantzios

Re: Issues with LC_COLLATE, across 8.3/8.4 on various platforms

От
Peter Eisentraut
Дата:
On tis, 2010-02-02 at 17:28 +0200, Achilleas Mantzios wrote:
> It seems that libc's behaviour regarding LC_COLLATE in Linux/FreeBSD
> are two different stories, hence the seen results.

The UTF-8 locales on FreeBSD and Mac OS X are pretty much broken.

This is becoming a FAQ ...


Re: Issues with LC_COLLATE, across 8.3/8.4 on various platforms

От
Achilleas Mantzios
Дата:
Στις Wednesday 03 February 2010 16:53:20 ο/η Peter Eisentraut έγραψε:
> On tis, 2010-02-02 at 17:28 +0200, Achilleas Mantzios wrote:
> > It seems that libc's behaviour regarding LC_COLLATE in Linux/FreeBSD
> > are two different stories, hence the seen results.
>
> The UTF-8 locales on FreeBSD and Mac OS X are pretty much broken.
>
> This is becoming a FAQ ...

This is a libc/locale combination issue. Care to elaborate?
AFAICT for my case in FreeBSD, the greek UTF locale el_GR.UTF-8 behaves as expected without problems.

>
>



--
Achilleas Mantzios

Re: Issues with LC_COLLATE, across 8.3/8.4 on various platforms

От
Peter Eisentraut
Дата:
On ons, 2010-02-03 at 17:05 +0200, Achilleas Mantzios wrote:
> Στις Wednesday 03 February 2010 16:53:20 ο/η Peter Eisentraut έγραψε:
> > On tis, 2010-02-02 at 17:28 +0200, Achilleas Mantzios wrote:
> > > It seems that libc's behaviour regarding LC_COLLATE in Linux/FreeBSD
> > > are two different stories, hence the seen results.
> >
> > The UTF-8 locales on FreeBSD and Mac OS X are pretty much broken.
> >
> > This is becoming a FAQ ...
>
> This is a libc/locale combination issue. Care to elaborate?
> AFAICT for my case in FreeBSD, the greek UTF locale el_GR.UTF-8 behaves as expected without problems.

Well,

"""
while in all of
PostgreSQL 8.3.3 on i686-pc-linux-gnu (lc_collate=C)
PostgreSQL 8.3.7 (lc_collate=el_GR.UTF-8) on i386-unknown-freebsd6.3
PostgreSQL 8.4.1 on x86_64-unknown-freebsd8.0 (lc_collate=en_US.UTF-8),
gives
        fooname
-------------------------
 Cylinder Liner No1
 Cylinder head No1
 Cylinder head cover No1
"""

is not a correct sorting result in my book, unless your measure of
correctness is ASCII byte value order.  And then it's left as an
exercise to determine what the results would be for texts involving
non-ASCII letters.



Re: Issues with LC_COLLATE, across 8.3/8.4 on various platforms

От
Achilleas Mantzios
Дата:
Στις Wednesday 03 February 2010 21:16:36 ο/η Peter Eisentraut έγραψε:
> On ons, 2010-02-03 at 17:05 +0200, Achilleas Mantzios wrote:
> > Στις Wednesday 03 February 2010 16:53:20 ο/η Peter Eisentraut έγραψε:
> > > On tis, 2010-02-02 at 17:28 +0200, Achilleas Mantzios wrote:
> > > > It seems that libc's behaviour regarding LC_COLLATE in Linux/FreeBSD
> > > > are two different stories, hence the seen results.
> > >
> > > The UTF-8 locales on FreeBSD and Mac OS X are pretty much broken.
> > >
> > > This is becoming a FAQ ...
^^^^
I guess you are right.  In my new FreeBSD system i get:
postgres@smadevnew:~> uname -sr
FreeBSD 8.0-RELEASE-p2
postgres@smadevnew:~> ls -l /usr/share/locale/en_US.UTF-8/
total 0
lrwxr-xr-x  1 root  wheel  28 Nov 21 16:30 LC_COLLATE -> ../la_LN.US-ASCII/LC_COLLATE
lrwxr-xr-x  1 root  wheel  17 Nov 21 16:30 LC_CTYPE -> ../UTF-8/LC_CTYPE
lrwxr-xr-x  1 root  wheel  30 Nov 21 16:30 LC_MESSAGES -> ../en_US.ISO8859-1/LC_MESSAGES
lrwxr-xr-x  1 root  wheel  30 Nov 21 16:30 LC_MONETARY -> ../en_US.ISO8859-1/LC_MONETARY
lrwxr-xr-x  1 root  wheel  29 Nov 21 16:30 LC_NUMERIC -> ../en_US.ISO8859-1/LC_NUMERIC
lrwxr-xr-x  1 root  wheel  26 Nov 21 16:30 LC_TIME -> ../en_US.ISO8859-1/LC_TIME

>
>



--
Achilleas Mantzios