Discussion: Re: PostgreSQL Windows Installer defaults to "English_United States.1252" when choosing locale starting with "English"

Hi,

I'll have a look and revert. 

On Sun, Jul 6, 2025 at 5:09 PM Ben Caspi <benc@aidoc.com> wrote:
Hi,

In the past we reached out to you about the PostgreSQL Windows installer not having an option to modify the locale on installation, resulting in failures during upgrades due to mismatched locale versions.

As a result, you released installer version 15.13, which added a more inclusive locale flag that solves our problem.

However, we recently noticed that when picking a locale starting with "English", the post-installation value is always "English_United States.1252".

For example, we installed using the locale "English, United Kingdom" and expected the post-installation locale to be "English_United Kingdom.1252". Instead, it was installed with the "English_United States.1252" locale.

We then tested multiple different locales and found this issue repeats only when choosing locales starting with "English".

Is this the intended behavior? If not, we would appreciate a fix to this issue as it's blocking us from upgrading our machines holding older PostgreSQL versions.

Thank you!

--

Ben Caspi
DevOps Engineer

www.aidoc.com  |  benc@aidoc.com


--
Sandeep Thakkar


Yes, you are correct. I got the same result. But it was correct when I chose the BCP-47 name like en-uk. Probably something to do with how initdb is handling the long names. 

On Mon, Jul 7, 2025 at 5:50 PM Sandeep Thakkar <sandeep.thakkar@enterprisedb.com> wrote:
> I'll have a look and revert.

--
Sandeep Thakkar


On Thu, Jul 10, 2025 at 12:41 AM Sandeep Thakkar
<sandeep.thakkar@enterprisedb.com> wrote:
> Yes, you are correct. I got the same result. But it was correct when I chose the BCP-47 name like en-uk. Probably something to do with how initdb is handling the long names.

What's the exact initdb command in this case?  I'm a bit confused
about "English, United Kingdom" vs "English_United Kingdom.1252".  I
think maybe the Windows C library is doing this, because that first
form isn't really a supported form, and it only manages to grok the
first word with some best-match scheme?  I don't have Windows but I
just pushed a stupid test program to CI to test that theory:

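#include <locale.h>
#include <stdio.h>
#include <stdlib.h>

/* Return a printable placeholder when setlocale() yields NULL. */
const char *or_null(const char *s)
{
    return s == NULL ? "<null>" : s;
}

int
main(int argc, char *argv[])
{
    /* Form 1: comma-separated display name. */
    if (setlocale(LC_ALL, "English, United Kingdom") == NULL)
        printf("error 1\n");
    /* Passing NULL queries the current locale without changing it. */
    printf("got: %s\n", or_null(setlocale(LC_ALL, NULL)));

    /* Form 2: underscore-separated name without a code page. */
    if (setlocale(LC_ALL, "English_United Kingdom") == NULL)
        printf("error 2\n");
    printf("got: %s\n", or_null(setlocale(LC_ALL, NULL)));

    /* Form 3: underscore-separated name with an explicit code page. */
    if (setlocale(LC_ALL, "English_United Kingdom.1252") == NULL)
        printf("error 3\n");
    printf("got: %s\n", or_null(setlocale(LC_ALL, NULL)));
    return EXIT_SUCCESS;
}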

And lo and behold it printed:

got: English_United States.1252
got: English_United Kingdom.1252
got: English_United Kingdom.1252

Apparently it really needs that underscore.
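
The first result above is a silent substitution: setlocale() did not fail, it just settled on a different locale. A minimal sketch of how a caller might detect that, assuming the resolved name has the "Language_Country.codepage" shape seen in the output above. The helper name locale_request_honored is hypothetical and is not part of PostgreSQL or the installer; it only compares strings, so it can be tried on any platform.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when `resolved` (the name reported back
 * by setlocale) begins with the `requested` name, either as an exact match
 * or followed by a code page suffix such as ".1252".
 * The comparison is case-sensitive, which is a simplification.
 */
bool
locale_request_honored(const char *requested, const char *resolved)
{
    size_t len;

    if (requested == NULL || resolved == NULL)
        return false;

    len = strlen(requested);
    if (len == 0 || strncmp(requested, resolved, len) != 0)
        return false;

    /* Accept an exact match or a trailing ".codepage" part. */
    return resolved[len] == '\0' || resolved[len] == '.';
}
```

With the strings from the test run, the request "English, United Kingdom" against the result "English_United States.1252" would be reported as not honored, while "English_United Kingdom" against "English_United Kingdom.1252" would pass.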





On Thu, Jul 10, 2025 at 7:24 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> What's the exact initdb command in this case?
> [...]
> Apparently it really needs that underscore.

Here is the initdb command:
---

Executing: "C:\Program Files\PostgreSQL\17\bin\initdb.exe" --pgdata="C:\Program Files\PostgreSQL\17\data" --username="postgres" --encoding=UTF8 --pwfile="C:\Users\sandeep\AppData\Local\Temp\postgresql_installer_c27ed92f26\212da2e5.tmp" --auth=scram-sha-256 --locale="English, United Kingdom"

The files belonging to this database system will be owned by user "sandeep".
This user must also own the server process.

The database cluster will be initialized with locale "English_United States.1252".
The default text search configuration will be set to "english".
...
...
--

If initdb needs the underscore, then I guess those names need to be converted in the script:
https://github.com/EnterpriseDB/edb-installers/blob/REL-17/server/scripts/windows/getlocales.ps1


--
Sandeep Thakkar


Hi Ben,

The recent minor release includes the fix for this. The installer script now converts "English, <Country>" to "English_<Country>" before passing the locale to initdb.exe. Please have a look.

Thanks for the report.
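
For illustration only, the kind of name conversion described above can be sketched in C, the language of the test program earlier in the thread. This is not the installer's code (the installer script is PowerShell), and the function name normalize_locale_name is hypothetical. The sketch assumes the separator is exactly a comma followed by one space and rewrites only its first occurrence; it applies to any "Language, Country" name, not just those starting with "English".

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: rewrite "Language, Country" as "Language_Country".
 * Names without a ", " separator are copied unchanged.
 * Returns 0 on success, -1 on NULL arguments or if `out` is too small.
 */
int
normalize_locale_name(const char *in, char *out, size_t outsize)
{
    const char *sep;
    size_t      prefix_len;
    size_t      rest_len;

    if (in == NULL || out == NULL || outsize == 0)
        return -1;

    sep = strstr(in, ", ");
    if (sep == NULL)
    {
        /* Nothing to convert: copy the name as-is, including the NUL. */
        size_t len = strlen(in);

        if (len + 1 > outsize)
            return -1;
        memcpy(out, in, len + 1);
        return 0;
    }

    prefix_len = (size_t) (sep - in);
    rest_len = strlen(sep + 2);

    /* Space needed: prefix + '_' + rest + terminating NUL. */
    if (prefix_len + 1 + rest_len + 1 > outsize)
        return -1;

    memcpy(out, in, prefix_len);
    out[prefix_len] = '_';
    memcpy(out + prefix_len + 1, sep + 2, rest_len + 1);
    return 0;
}
```

Called with "English, United Kingdom" and a sufficiently large buffer, this fills the buffer with "English_United Kingdom", the form that setlocale() resolved correctly in the test run quoted earlier.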

On Sun, Jul 13, 2025 at 4:22 PM Sandeep Thakkar <sandeep.thakkar@enterprisedb.com> wrote:
> If initdb needs the underscore, then I guess those names need to be converted in the script:
> https://github.com/EnterpriseDB/edb-installers/blob/REL-17/server/scripts/windows/getlocales.ps1

--
Sandeep Thakkar