Обсуждение: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

Поиск
Список
Период
Сортировка

BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      18196
Logged by:          Halil Han Badem
Email address:      halilhanbadem@gmail.com
PostgreSQL version: 15.4
Operating system:   Windows 11 22H2 22621.2715
Description:

Hello,

This morning, I noticed that my computer at the office and my computer at
home had received automatic updates. Subsequently, I received reports of
some errors. Upon investigation, I found that the PostgreSQL service was not
running, and the error indicated that it was due to a problem in the
postgresql.conf file. After extensive research, I discovered that in the
Windows 11 22H2 22621.2715 update, the term "Turkey" had been changed to
"Türkiye." As a result, Turkish_Turkey.1254 is no longer present in the
operating system. Instead of Turkish_Turkey.1254, the Windows system now
uses Turkish_Türkiye.1254.

Databases created with Turkish_Turkey.1254 cannot find the corresponding
locale, resulting in errors and service disruptions. After a fresh
installation, I observed the following expressions in the postgresql.conf
file.

# These settings are initialized by initdb, but they can be changed.
lc_messages = 'Turkish_Türkiye.1254'            # locale for system error message
                    # strings
lc_monetary = 'Turkish_Türkiye.1254'            # locale for monetary formatting
lc_numeric = 'Turkish_Türkiye.1254'            # locale for number formatting
lc_time = 'Turkish_Türkiye.1254'                # locale for time formatting

Therefore, there is a conflict, and I request your assistance in addressing
this matter.

Thank you,
Best regards.


Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Laurenz Albe
Дата:
On Wed, 2023-11-15 at 09:04 +0000, PG Bug reporting form wrote:
> Operating system:   Windows 11 22H2 22621.2715
>
> This morning, I noticed that my computer at the office and my computer at
> home had received automatic updates. Subsequently, I received reports of
> some errors. Upon investigation, I found that the PostgreSQL service was not
> running, and the error indicated that it was due to a problem in the
> postgresql.conf file. After extensive research, I discovered that in the
> Windows 11 22H2 22621.2715 update, the term "Turkey" had been changed to
> "Türkiye." As a result, Turkish_Turkey.1254 is no longer present in the
> operating system. Instead of Turkish_Turkey.1254, the Windows system now
> uses Turkish_Türkiye.1254.
>
> Databases created with Turkish_Turkey.1254 cannot find the corresponding
> locale, resulting in errors and service disruptions. After a fresh
> installation, I observed the following expressions in the postgresql.conf
> file.
>
> # These settings are initialized by initdb, but they can be changed.
> lc_messages = 'Turkish_Türkiye.1254'            # locale for system error message
>                     # strings
> lc_monetary = 'Turkish_Türkiye.1254'            # locale for monetary formatting
> lc_numeric = 'Turkish_Türkiye.1254'            # locale for number formatting
> lc_time = 'Turkish_Türkiye.1254'                # locale for time formatting
>
> Therefore, there is a conflict, and I request your assistance in addressing
> this matter.

This has been reported before [1], but your report explains the problem clearly.

Can you show the exact error message?

This seems to be a really nasty problem that Microsoft has created for us.
Even if the server started, you could not connect to the database if the
collation persisted in pg_database is no longer accepted by the operating
system.

I cannot think of a possible solution except a hard-coded hack specifically
for Turkish locale names...

Yours,
Laurenz Albe


 [1]: https://postgr.es/m/18181-fa5143d18a467fe5%40postgresql.org



Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Halil Han Badem
Дата:
I can convey it to you in more detail. Now I will explain the steps to avoid repeating this error.

1. Obtain a Windows version prior to version 22H2 22621.2715.
2. During the PostgreSQL installation in that version, choose Turkish, Turkey as the locale (during installation).
3. Create a database using PgAdmin or any other tool and select Turkish_Turkey.1254 as the local option.
4. Update your operating system to version 22H2 22621.2715.
5. After updating the operating system and restarting, you will notice that the PostgreSQL service is not running.
6. When you try to start the service, you will encounter an error stating that the Windows service cannot be started.
7. Later, in the Windows Event Logs, you will see the error "2023-11-15 10:47:14.877 +03 [1644] FATAL: configuration file "C:/Program Files/PostgreSQL/15/data/postgresql.conf" contains errors."
8. The cause of the error in this file is the values specified as lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254. This is because, with the new update, Turkish_Turkey.1254 is no longer present in the Windows operating system; the correct locale name is now Turkish_Türkiye.1254.

In this case, the solution I implemented was as follows:
1. I copied the existing "data" folder of the database.
2. I obtained a Windows version with a lower version number than 22H2 22621.2715.
3. I defined the "data" folder for the PostgreSQL server in the relevant version and restarted the services.
4. I accessed the database, took a backup.
5. I reinstalled PostgreSQL on the problematic server.
6. I restored the backup to the server. Before restoring, I marked the local option for the created database as "C". This option worked smoothly and did not cause any data loss.

Note: After updating Windows, if you reinstall PostgreSQL, you will see that all Turkish_Turkey.1254 values in the postgresql.conf file have changed to Turkish_Türkiye.1254. If you manipulate the value in this file (try changing it back to Turkish_Turkey.1254), you will encounter the error `The database was initialized with LC_COLLATE “Turkish_Turkey.1254”, which is not recognized by setlocale()` because Turkish_Turkey.1254 locale no longer exists; its name has been changed.

The decision of Windows to update in this regard will be a bad experience for users in Turkey or anyone using the Turkish language option. I hope the instant solution I found is helpful. I sincerely look forward to any permanent solutions you may find.

Best regards.



Halil Han BADEM
 Yazılım & Sistem Uzmanı
linkedin iconinstagram icon
E-Posta: halilhanbadem@gmail.com | work@halilhanbadem.dev
VD. - VNo.: 1290500711 - HİTİT | Telefon: (850) 309 4875
Balgat Mahallesi, Ceyhun Atuf Kansu Caddesi 36/6 06520, Çankaya, Ankara


Laurenz Albe <laurenz.albe@cybertec.at>, 15 Kas 2023 Çar, 20:48 tarihinde şunu yazdı:
On Wed, 2023-11-15 at 09:04 +0000, PG Bug reporting form wrote:
> Operating system:   Windows 11 22H2 22621.2715
>
> This morning, I noticed that my computer at the office and my computer at
> home had received automatic updates. Subsequently, I received reports of
> some errors. Upon investigation, I found that the PostgreSQL service was not
> running, and the error indicated that it was due to a problem in the
> postgresql.conf file. After extensive research, I discovered that in the
> Windows 11 22H2 22621.2715 update, the term "Turkey" had been changed to
> "Türkiye." As a result, Turkish_Turkey.1254 is no longer present in the
> operating system. Instead of Turkish_Turkey.1254, the Windows system now
> uses Turkish_Türkiye.1254.
>
> Databases created with Turkish_Turkey.1254 cannot find the corresponding
> locale, resulting in errors and service disruptions. After a fresh
> installation, I observed the following expressions in the postgresql.conf
> file.
>
> # These settings are initialized by initdb, but they can be changed.
> lc_messages = 'Turkish_Türkiye.1254'                  # locale for system error message
>                                       # strings
> lc_monetary = 'Turkish_Türkiye.1254'                  # locale for monetary formatting
> lc_numeric = 'Turkish_Türkiye.1254'                   # locale for number formatting
> lc_time = 'Turkish_Türkiye.1254'                              # locale for time formatting
>
> Therefore, there is a conflict, and I request your assistance in addressing
> this matter.

This has been reported before [1], but your report explains the problem clearly.

Can you show the exact error message?

This seems to be a really nasty problem that Microsoft has created for us.
Even if the server started, you could not connect to the database if the
collation persisted in pg_database is no longer accepted by the operating
system.

I cannot think of a possible solution except a hard-coded hack specifically
for Turkish locale names...

Yours,
Laurenz Albe


 [1]: https://postgr.es/m/18181-fa5143d18a467fe5%40postgresql.org
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Wed, 2023-11-15 at 09:04 +0000, PG Bug reporting form wrote:
>> [ Microsoft decided to rename their Turkish locales ]

> This seems to be a really nasty problem that Microsoft has created for us.
> Even if the server started, you could not connect to the database if the
> collation persisted in pg_database is no longer accepted by the operating
> system.

Yeah, that's a mess.  The postgresql.conf entries could be fixed by hand,
but if you've got "Turkish_Turkey.1254" in pg_database.datcollate or
datctype then there's no easy way around that.

I think really the right fix is a whole bunch of WTFs directed at
Microsoft.  I can understand wanting to support the Turkish_Türkiye
spelling, but not having a backward-compatibility entry for the old
spelling is a colossal blunder -- especially in a minor OS release.
Aren't these the same people who prize backward compatibility at
any cost?

            regards, tom lane



Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Laurenz Albe
Дата:
On Wed, 2023-11-15 at 21:11 +0300, Halil Han Badem wrote:
> 7. Later, in the Windows Event Logs, you will see the error
>    "FATAL: configuration file "C:/Program Files/PostgreSQL/15/data/postgresql.conf" contains errors."
> 8. The cause of the error in this file is the values specified as
>    lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254.
>    This is because, with the new update, Turkish_Turkey.1254 is no longer
>    present in the Windows operating system; the correct locale name is
>    now Turkish_Türkiye.1254.

If you edit "postgresql.conf" and change all occurrences of "Turkish_Turkey.1254"
to "Turkish_Türkiye.1254", can you start the service?
If yes, can you connect to a Turkish database?

Yours,
Laurenz Albe



Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Halil Han Badem
Дата:
In this case, the service starts, but you cannot establish a connection to the database. In other words, it gives the following error. This error is expected because the database was created with Turkish_Turkey.1254. Due to the name change to Turkish_Türkiye.1254, it cannot find the local name.

failed: FATAL: database locale is incompatible with operating system DETAIL: The database was initialized with LC_COLLATE "Turkish_Turkey.1254", which is not recognized by setlocale(). HINT: Recreate the database with another locale or install the missing locale.

Regards,
Halil Han.

Laurenz Albe <laurenz.albe@cybertec.at>, 15 Kas 2023 Çar, 21:29 tarihinde şunu yazdı:
On Wed, 2023-11-15 at 21:11 +0300, Halil Han Badem wrote:
> 7. Later, in the Windows Event Logs, you will see the error
>    "FATAL: configuration file "C:/Program Files/PostgreSQL/15/data/postgresql.conf" contains errors."
> 8. The cause of the error in this file is the values specified as
>    lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254.
>    This is because, with the new update, Turkish_Turkey.1254 is no longer
>    present in the Windows operating system; the correct locale name is
>    now Turkish_Türkiye.1254.

If you edit "postgresql.conf" and change all occurrences of "Turkish_Turkey.1254"
to "Turkish_Türkiye.1254", can you start the service?
If yes, can you connect to a Turkish database?

Yours,
Laurenz Albe

Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Pavel Stehule
Дата:
Hi

st 15. 11. 2023 v 19:53 odesílatel Halil Han Badem <halilhanbadem@gmail.com> napsal:
I can convey it to you in more detail. Now I will explain the steps to avoid repeating this error.

1. Obtain a Windows version prior to version 22H2 22621.2715.
2. During the PostgreSQL installation in that version, choose Turkish, Turkey as the locale (during installation).
3. Create a database using PgAdmin or any other tool and select Turkish_Turkey.1254 as the local option.
4. Update your operating system to version 22H2 22621.2715.
5. After updating the operating system and restarting, you will notice that the PostgreSQL service is not running.
6. When you try to start the service, you will encounter an error stating that the Windows service cannot be started.
7. Later, in the Windows Event Logs, you will see the error "2023-11-15 10:47:14.877 +03 [1644] FATAL: configuration file "C:/Program Files/PostgreSQL/15/data/postgresql.conf" contains errors."
8. The cause of the error in this file is the values specified as lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254. This is because, with the new update, Turkish_Turkey.1254 is no longer present in the Windows operating system; the correct locale name is now Turkish_Türkiye.1254.

This is a similar issue to rename the Czech locale "Czech_Czech Republic.1250" to "Czech_Czechia.1250" six years ago.


Regards

Pavel

Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Halil Han Badem
Дата:
The more significant issue here is that PostgreSQL does not allow connections because, during installation, it was created with the Turkish_Turkey.1254 option. I attempted to connect to the PostgreSQL user using psql via the command line to take a backup, but it didn't even allow a connection to the main database (postgres). The error it provided is again as follows:

failed: FATAL: database locale is incompatible with the operating system DETAIL: The database was initialized with LC_COLLATE "Turkish_Turkey.1254", which is not recognized by setlocale(). HINT: Recreate the database with another locale or install the missing locale.

In this section, there are two definite solutions. The first one is the solution I explained [1], and the second one is to uninstall the Windows Update. However, removing this update will not be a permanent solution because, eventually, a Windows Update will be received for security or any dependency reasons.

[1] https://www.postgresql.org/message-id/CA%2BhfYg-rGc-O_LnS5M59HGbYufaGsQ6Sc4yBKucBQZLhoZE_7w%40mail.gmail.com

Regards.


Halil Han Badem <halilhanbadem@gmail.com>, 15 Kas 2023 Çar, 21:47 tarihinde şunu yazdı:
In this case, the service starts, but you cannot establish a connection to the database. In other words, it gives the following error. This error is expected because the database was created with Turkish_Turkey.1254. Due to the name change to Turkish_Türkiye.1254, it cannot find the local name.

failed: FATAL: database locale is incompatible with operating system DETAIL: The database was initialized with LC_COLLATE "Turkish_Turkey.1254", which is not recognized by setlocale(). HINT: Recreate the database with another locale or install the missing locale.

Regards,
Halil Han.

Laurenz Albe <laurenz.albe@cybertec.at>, 15 Kas 2023 Çar, 21:29 tarihinde şunu yazdı:
On Wed, 2023-11-15 at 21:11 +0300, Halil Han Badem wrote:
> 7. Later, in the Windows Event Logs, you will see the error
>    "FATAL: configuration file "C:/Program Files/PostgreSQL/15/data/postgresql.conf" contains errors."
> 8. The cause of the error in this file is the values specified as
>    lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254.
>    This is because, with the new update, Turkish_Turkey.1254 is no longer
>    present in the Windows operating system; the correct locale name is
>    now Turkish_Türkiye.1254.

If you edit "postgresql.conf" and change all occurrences of "Turkish_Turkey.1254"
to "Turkish_Türkiye.1254", can you start the service?
If yes, can you connect to a Turkish database?

Yours,
Laurenz Albe

Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Halil Han Badem
Дата:
Thank you Pavel. This is indeed a good solution if you have a few servers. We have users in hundreds across all major cities in Turkey, and each user is using our database on their dedicated server (we are developing a desktop application). In this case, providing support for all databases will be very challenging. We will document the solution you mentioned and add it to our documentation.

Best regards.


Pavel Stehule <pavel.stehule@gmail.com>, 15 Kas 2023 Çar, 22:03 tarihinde şunu yazdı:
Hi

st 15. 11. 2023 v 19:53 odesílatel Halil Han Badem <halilhanbadem@gmail.com> napsal:
I can convey it to you in more detail. Now I will explain the steps to avoid repeating this error.

1. Obtain a Windows version prior to version 22H2 22621.2715.
2. During the PostgreSQL installation in that version, choose Turkish, Turkey as the locale (during installation).
3. Create a database using PgAdmin or any other tool and select Turkish_Turkey.1254 as the local option.
4. Update your operating system to version 22H2 22621.2715.
5. After updating the operating system and restarting, you will notice that the PostgreSQL service is not running.
6. When you try to start the service, you will encounter an error stating that the Windows service cannot be started.
7. Later, in the Windows Event Logs, you will see the error "2023-11-15 10:47:14.877 +03 [1644] FATAL: configuration file "C:/Program Files/PostgreSQL/15/data/postgresql.conf" contains errors."
8. The cause of the error in this file is the values specified as lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254. This is because, with the new update, Turkish_Turkey.1254 is no longer present in the Windows operating system; the correct locale name is now Turkish_Türkiye.1254.

This is a similar issue to rename the Czech locale "Czech_Czech Republic.1250" to "Czech_Czechia.1250" six years ago.


Regards

Pavel

Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Pavel Stehule
Дата:
Hi

st 15. 11. 2023 v 20:10 odesílatel Halil Han Badem <halilhanbadem@gmail.com> napsal:
Thank you Pavel. This is indeed a good solution if you have a few servers. We have users in hundreds across all major cities in Turkey, and each user is using our database on their dedicated server (we are developing a desktop application). In this case, providing support for all databases will be very challenging. We will document the solution you mentioned and add it to our documentation.

maybe it can be scripted, automatized. If I remember correctly, one of my customers wrote a small application that did this.

Sure - it is a workaround - dirty, but some clients can be fixed tomorrow. 


Best regards.


Pavel Stehule <pavel.stehule@gmail.com>, 15 Kas 2023 Çar, 22:03 tarihinde şunu yazdı:
Hi

st 15. 11. 2023 v 19:53 odesílatel Halil Han Badem <halilhanbadem@gmail.com> napsal:
I can convey it to you in more detail. Now I will explain the steps to avoid repeating this error.

1. Obtain a Windows version prior to version 22H2 22621.2715.
2. During the PostgreSQL installation in that version, choose Turkish, Turkey as the locale (during installation).
3. Create a database using PgAdmin or any other tool and select Turkish_Turkey.1254 as the local option.
4. Update your operating system to version 22H2 22621.2715.
5. After updating the operating system and restarting, you will notice that the PostgreSQL service is not running.
6. When you try to start the service, you will encounter an error stating that the Windows service cannot be started.
7. Later, in the Windows Event Logs, you will see the error "2023-11-15 10:47:14.877 +03 [1644] FATAL: configuration file "C:/Program Files/PostgreSQL/15/data/postgresql.conf" contains errors."
8. The cause of the error in this file is the values specified as lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254. This is because, with the new update, Turkish_Turkey.1254 is no longer present in the Windows operating system; the correct locale name is now Turkish_Türkiye.1254.

This is a similar issue to rename the Czech locale "Czech_Czech Republic.1250" to "Czech_Czechia.1250" six years ago.


Regards

Pavel

Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Laurenz Albe
Дата:
On Wed, 2023-11-15 at 21:47 +0300, Halil Han Badem wrote:
> In this case, the service starts, but you cannot establish a connection to the database.
> In other words, it gives the following error. This error is expected because the
> database was created with Turkish_Turkey.1254. Due to the name change to
> Turkish_Türkiye.1254, it cannot find the local name.
>
> failed: FATAL: database locale is incompatible with operating system
> DETAIL: The database was initialized with LC_COLLATE "Turkish_Turkey.1254", which is not recognized by setlocale().
> HINT: Recreate the database with another locale or install the missing locale.

That was to be expected...

I cannot think of a good solution except to be Microsoft to unbreak their
software.  If you have not performed the Windows update on a system yet,
and you are bold enough, you could try the following hack:

- connect as superuser

- modify the catalog table with

    UPDATE pg_database
    SET datcollate = 'Turkish_Türkiye.1254',
        datctype = 'Turkish_Türkiye.1254'
    WHERE datcollate = 'Turkish_Turkey.1254';

- stop the PostgreSQL service

- edit "postgresql.conf" and change "Turkish_Turkey.1254" to "Turkish_Türkiye.1254" everywhere

- install the Windows update

- start the service

- in case the collation changed, REINDEX all indexes on string expressions

A safer option would be to dump the database before the upgrade
and restore the dump to a newly created cluster after the upgrade.

If you have used "Turkish_Turkey.1254" as explicit column collation
or somewhere else in your code, you will still run into problems.

This should really be fixed by Microsoft.

Yours,
Laurenz Albe



Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Halil Han Badem
Дата:
Thank you, Pavel. With the feature in the Locale Builder application, we can create an MSI build. With your solution, an MSI build can be created and easily configured later.
Thanks for the tip.


Pavel Stehule <pavel.stehule@gmail.com>, 15 Kas 2023 Çar, 22:19 tarihinde şunu yazdı:
Hi

st 15. 11. 2023 v 20:10 odesílatel Halil Han Badem <halilhanbadem@gmail.com> napsal:
Thank you Pavel. This is indeed a good solution if you have a few servers. We have users in hundreds across all major cities in Turkey, and each user is using our database on their dedicated server (we are developing a desktop application). In this case, providing support for all databases will be very challenging. We will document the solution you mentioned and add it to our documentation.

maybe it can be scripted, automatized. If I remember correctly, one of my customers wrote a small application that did this.

Sure - it is a workaround - dirty, but some clients can be fixed tomorrow. 


Best regards.


Pavel Stehule <pavel.stehule@gmail.com>, 15 Kas 2023 Çar, 22:03 tarihinde şunu yazdı:
Hi

st 15. 11. 2023 v 19:53 odesílatel Halil Han Badem <halilhanbadem@gmail.com> napsal:
I can convey it to you in more detail. Now I will explain the steps to avoid repeating this error.

1. Obtain a Windows version prior to version 22H2 22621.2715.
2. During the PostgreSQL installation in that version, choose Turkish, Turkey as the locale (during installation).
3. Create a database using PgAdmin or any other tool and select Turkish_Turkey.1254 as the local option.
4. Update your operating system to version 22H2 22621.2715.
5. After updating the operating system and restarting, you will notice that the PostgreSQL service is not running.
6. When you try to start the service, you will encounter an error stating that the Windows service cannot be started.
7. Later, in the Windows Event Logs, you will see the error "2023-11-15 10:47:14.877 +03 [1644] FATAL: configuration file "C:/Program Files/PostgreSQL/15/data/postgresql.conf" contains errors."
8. The cause of the error in this file is the values specified as lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254. This is because, with the new update, Turkish_Turkey.1254 is no longer present in the Windows operating system; the correct locale name is now Turkish_Türkiye.1254.

This is a similar issue to rename the Czech locale "Czech_Czech Republic.1250" to "Czech_Czechia.1250" six years ago.


Regards

Pavel

Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Halil Han Badem
Дата:
Yes, we're paying the price for Microsoft's mistake. It's really unacceptable. Thank you very much for the tips and solution suggestions. Let's discuss the situation and take action with the solutions as needed.

Thank you so much again. 

Laurenz Albe <laurenz.albe@cybertec.at>, 15 Kas 2023 Çar, 22:41 tarihinde şunu yazdı:
On Wed, 2023-11-15 at 21:47 +0300, Halil Han Badem wrote:
> In this case, the service starts, but you cannot establish a connection to the database.
> In other words, it gives the following error. This error is expected because the
> database was created with Turkish_Turkey.1254. Due to the name change to
> Turkish_Türkiye.1254, it cannot find the local name.
>
> failed: FATAL: database locale is incompatible with operating system
> DETAIL: The database was initialized with LC_COLLATE "Turkish_Turkey.1254", which is not recognized by setlocale().
> HINT: Recreate the database with another locale or install the missing locale.

That was to be expected...

I cannot think of a good solution except to be Microsoft to unbreak their
software.  If you have not performed the Windows update on a system yet,
and you are bold enough, you could try the following hack:

- connect as superuser

- modify the catalog table with

    UPDATE pg_database
    SET datcollate = 'Turkish_Türkiye.1254',
        datctype = 'Turkish_Türkiye.1254'
    WHERE datcollate = 'Turkish_Turkey.1254';

- stop the PostgreSQL service

- edit "postgresql.conf" and change "Turkish_Turkey.1254" to "Turkish_Türkiye.1254" everywhere

- install the Windows update

- start the service

- in case the collation changed, REINDEX all indexes on string expressions

A safer option would be to dump the database before the upgrade
and restore the dump to a newly created cluster after the upgrade.

If you have used "Turkish_Turkey.1254" as explicit column collation
or somewhere else in your code, you will still run into problems.

This should really be fixed by Microsoft.

Yours,
Laurenz Albe

Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Thomas Munro
Дата:
On Thu, Nov 16, 2023 at 7:26 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Laurenz Albe <laurenz.albe@cybertec.at> writes:
> > On Wed, 2023-11-15 at 09:04 +0000, PG Bug reporting form wrote:
> >> [ Microsoft decided to rename their Turkish locales ]
>
> > This seems to be a really nasty problem that Microsoft has created for us.
> > Even if the server started, you could not connect to the database if the
> > collation persisted in pg_database is no longer accepted by the operating
> > system.
>
> Yeah, that's a mess.  The postgresql.conf entries could be fixed by hand,
> but if you've got "Turkish_Turkey.1254" in pg_database.datcollate or
> datctype then there's no easy way around that.
>
> I think really the right fix is a whole bunch of WTFs directed at
> Microsoft.  I can understand wanting to support the Turkish_Türkiye
> spelling, but not having a backward-compatibility entry for the old
> spelling is a colossal blunder -- especially in a minor OS release.
> Aren't these the same people who prize backward compatibility at
> any cost?

For the record, the Windows manual tells us very, very clearly not to
store those "display" style names anywhere, and to use BCP 47 locale
names instead, which are based on ISO standards.  I had patches to
teach initdb to use BCP 47, and I remain convinced that we must do
that, but I have failed to attract the attention of enough people who
use and understand Windows to be able to commit anything.  In
particular, Juan José pointed out some problems that I just couldn't
understand.

https://www.postgresql.org/message-id/flat/CA%2BhUKGJ%3DXThErgAQRoqfCy1bKPxXVuF0%3D2zDbB%2BSxDs59pv7Fw%40mail.gmail.com



Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
"Daniel Verite"
Дата:
    PG Bug reporting form wrote:

> As a result, Turkish_Turkey.1254 is no longer present in the
> operating system. Instead of Turkish_Turkey.1254, the Windows system
> now uses Turkish_Türkiye.1254.
 > Databases created with Turkish_Turkey.1254 cannot find the
> corresponding locale, resulting in errors and service disruptions

Aside from the disappearance of the previous name, this "ü" in the new
name also looks troublesome. It might be encoded as a single byte in
an encoding of the ISO-8859-* family, in which case the byte would be
0xFC it seems, but in an UTF-8 context, it would be represented by the
sequence 0xC3 0xBC.

This locale name ends up in pg_database.datcollate and datctype, and
since pg_database is shared among all databases of the instance, it's
supposed to be encoded in a way that is compatible with all possible
db encodings, which in fact implies not having characters outside
of the US-ASCII character set.

In practice, in a cluster initialized with the locale
"Turkish_Türkiye.1254" and a monobyte encoding, assuming that
an UTF-8 database can be successfully created in it , how
would "select * from pg_database" from inside that database
not fail with an UTF-8 decoding error?


Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite



"Daniel Verite" <daniel@manitou-mail.org> writes:
> Aside from the disappearance of the previous name, this "ü" in the new
> name also looks troublesome. It might be encoded as a single byte in
> an encoding of the ISO-8859-* family, in which case the byte would be
> 0xFC it seems, but in an UTF-8 context, it would be represented by the
> sequence 0xC3 0xBC.

Indeed.  That jogged my memory that we've dealt with this sort of
thing before; search the commit log for mentions of "Norwegian (Bokmål)".

I guess we could add mapping entry(s) similar to what we use for that,
although that won't help anybody who's already stuck with an unstartable
database.

            regards, tom lane



Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Thomas Munro
Дата:
On Thu, Nov 16, 2023 at 9:00 AM Halil Han Badem <halilhanbadem@gmail.com> wrote:
> Yes, we're paying the price for Microsoft's mistake. It's really unacceptable.

I am certainly no Windows apologist, but I think on this one, it feels
like PostgreSQL is at least partially in the wrong here.  Windows
documentation tells everyone that these names might change in every OS
update, and that we shouldn't store them anywhere, and what to do
instead ("tr-TR").  This has happened again and again over the past
decades and win32locale.c is full of bizarro hacks to deal with it.
If something good can come of this round of
"WTF-how-could-they-do-this-to-us", I hope it can be some more
eyeballs on the thread that I posted to fix this problem (see link in
my earlier message).

(Full disclosure: I work for Azure Data, a branch of Microsoft, but I
have nothing to do with the Windows operating system and don't use it,
I'm employed to work on open source stuff.  I'm speaking with my
PostgreSQL hacker hat on, here.)



Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Laurenz Albe
Дата:
On Thu, 2023-11-16 at 11:54 +1300, Thomas Munro wrote:
> On Thu, Nov 16, 2023 at 7:26 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Laurenz Albe <laurenz.albe@cybertec.at> writes:
> > > On Wed, 2023-11-15 at 09:04 +0000, PG Bug reporting form wrote:
> > > > [ Microsoft decided to rename their Turkish locales ]
> >
> > > This seems to be a really nasty problem that Microsoft has created for us.
> > > Even if the server started, you could not connect to the database if the
> > > collation persisted in pg_database is no longer accepted by the operating
> > > system.
> >
> > Yeah, that's a mess.  The postgresql.conf entries could be fixed by hand,
> > but if you've got "Turkish_Turkey.1254" in pg_database.datcollate or
> > datctype then there's no easy way around that.
> >
> > I think really the right fix is a whole bunch of WTFs directed at
> > Microsoft.  I can understand wanting to support the Turkish_Türkiye
> > spelling, but not having a backward-compatibility entry for the old
> > spelling is a colossal blunder -- especially in a minor OS release.
> > Aren't these the same people who prize backward compatibility at
> > any cost?
>
> For the record, the Windows manual tells us very, very clearly not to
> store those "display" style names anywhere, and to use BCP 47 locale
> names instead, which are based on ISO standards.  I had patches to
> teach initdb to use BCP 47, and I remain convinced that we must do
> that, but I have failed to attract the attention of enough people who
> use and understand Windows to be able to commit anything.  In
> particular, Juan José pointed out some problems that I just couldn't
> understand.
>
>
https://www.postgresql.org/message-id/flat/CA%2BhUKGJ%3DXThErgAQRoqfCy1bKPxXVuF0%3D2zDbB%2BSxDs59pv7Fw%40mail.gmail.com

I don't have Windows to test, but wouldn't the problem be avoided if
people created their cluster with "--locale=tr-TR"?  If yes, EDB's
Windows installer should be modified to use the correct locale names.
Is anybody from EDB reading this?

Yours,
Laurenz Albe



Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Thomas Munro
Дата:
On Mon, Nov 20, 2023 at 10:54 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Thu, 2023-11-16 at 11:54 +1300, Thomas Munro wrote:
> >
https://www.postgresql.org/message-id/flat/CA%2BhUKGJ%3DXThErgAQRoqfCy1bKPxXVuF0%3D2zDbB%2BSxDs59pv7Fw%40mail.gmail.com
>
> I don't have Windows to test, but wouldn't the problem be avoided if
> people created their cluster with "--locale=tr-TR"?  If yes, EDB's
> Windows installer should be modified to use the correct locale names.
> Is anybody from EDB reading this?

That would have the same effect as that patch.  I believe that is the
right thing to do, but I am not sure about one detail: the lack of
encoding on the end of the name.  Does the encoding remain the same as
the traditional one for that language, and stable, or when you haven't
explicitly named it, might it depend on the registry/control panel
"ACP", that could in theory change?  If it does, what happens?  Should
we put it on the end to pin it down?  When we tried that it had some
effect, but didn't seem to have the expected [by me] effect on ctype
(eg case conversion of the famous Turkish i); it appears that having
ACP != LC_TYPE gives some Frankenstein behaviour, but I don't
understand it, and I think you'd also want to determine in which cases
strcoll_l() is behaving sensibly with various combinations.  This may
all be pre-existing stuff well understood by people who worked on the
Windows port?  I don't have Windows to test.

(How nice it would be to use ICU by default!)



Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Laurenz Albe
Дата:
On Tue, 2023-11-21 at 09:35 +1300, Thomas Munro wrote:
> (How nice it would be to use ICU by default!)

Amen.

Yours,
Laurenz Albe



>> On Mon, Nov 20, 2023 at 10:54 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>> On Thu, 2023-11-16 at 11:54 +1300, Thomas Munro wrote:
> > https://www.postgresql.org/message-id/flat/CA%2BhUKGJ%3DXThErgAQRoqf
> > Cy1bKPxXVuF0%3D2zDbB%2BSxDs59pv7Fw%40mail.gmail.com
>
>> I don't have Windows to test, but wouldn't the problem be avoided if 
>> people created their cluster with "--locale=tr-TR"?  If yes, EDB's 
>> Windows installer should be modified to use the correct locale names.
>> Is anybody from EDB reading this?

> That would have the same effect as that patch.  I believe that is the right thing to do, but I am not sure about one
detail:the lack of encoding on the end of the name.  Does the encoding remain the same as the traditional one for that
language,and stable, or when you haven't explicitly named it, might it depend on the registry/control panel "ACP", that
couldin theory change?  If it does, what happens?  Should we put it on the end to pin it down?  When we tried that it
hadsome effect, but didn't seem to have the expected [by me] effect on ctype (eg case conversion of the famous Turkish
i);it appears that having ACP != LC_TYPE gives some Frankenstein behaviour, but I don't understand it, and I think
you'dalso want to determine in which cases
 
strcoll_l() is behaving sensibly with various combinations.  This may all be pre-existing stuff well understood by
peoplewho worked on the Windows port?  I don't have Windows to test.
 

> (How nice it would be to use ICU by default!)

My 5 cent on this:
If EDB is willing to make a 'breaking' change in their nice Windows Installer, I suggest changing the default locale to
'C'or its alias 'POSIX'.
 
At least for LC_COLLATE and LC_CTYPE.
That would be in line with the recommendation of https://www.postgresql.org/docs/current/locale.html. I believe this is
thebest setting for most use cases.
 

The breaking change would not be that hard, as with the current versions of PostgreSQL you can specify a different
collationalong CREATE DATABASE. 
 
That means minimal effort on the user side, when you really need a database with language depending collation. 
If someone want to retain the performance and stability benefit of C Collation, he can now even push the collation
requirementto CREATE INDEX of the few indexes who needs a special collation or even further into the ORDER BY clause of
theactual query (the latter would not be that performant, but probably ok in most circumstances).
 

Another Suggestion would be changing the default Data Directory to %ProgramData%\PostgreSQL\<Version>. Haven't
rechecked,but in the past the suggested DataDir was something like %ProgramFiles%\PostgreSQL\<version>. The use of
subdirectoriesfrom %ProgramFiles% for date changing on a daily basis is discouraged from Microsoft.
 

All of this does not "repair" existing databases like the solution Thomas Munro suggested upthread.

Best regards
Wilm. 

Re: BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows

От
Laurenz Albe
Дата:
On Tue, 2023-11-21 at 09:35 +1300, Thomas Munro wrote:
> > I don't have Windows to test, but wouldn't the problem be avoided if
> > people created their cluster with "--locale=tr-TR"?  If yes, EDB's
> > Windows installer should be modified to use the correct locale names.
> > Is anybody from EDB reading this?
>
> That would have the same effect as that patch.  I believe that is the
> right thing to do, but I am not sure about one detail: the lack of
> encoding on the end of the name.  Does the encoding remain the same as
> the traditional one for that language, and stable, or when you haven't
> explicitly named it, might it depend on the registry/control panel
> "ACP", that could in theory change?  If it does, what happens?  Should
> we put it on the end to pin it down?  When we tried that it had some
> effect, but didn't seem to have the expected [by me] effect on ctype
> (eg case conversion of the famous Turkish i); it appears that having
> ACP != LC_TYPE gives some Frankenstein behaviour, but I don't
> understand it, and I think you'd also want to determine in which cases
> strcoll_l() is behaving sensibly with various combinations.  This may
> all be pre-existing stuff well understood by people who worked on the
> Windows port?  I don't have Windows to test.

I see and share the uncertainty.  But isn't tr-TR.1254 the same as
Turkey_Türkiye.1254?  In other words, if we just used the short,
standardized locale names and left the encoding suffixes (with their
potential Frankenstein behavior) as they are now, would the situation
be any worse than it is now?  We could forgo using the mutable locale
names and leave the encoding mystery for a better day.

Yours,
Laurenz Albe