Discussion: Path case sensitivity on windows

Path case sensitivity on windows

From
Magnus Hagander
Date:
Bug #4694
(http://archives.postgresql.org/message-id/200903050848.n258mVgm046178@wwwmaster.postgresql.org)
shows a very strange behaviour on windows when you use a different case PATH.

From what I can tell, this is because dir_strcmp() is case sensitive,
but paths on windows are really case-insensitive.

Attached patch fixes this in my testcase. Can anybody spot something
wrong with it? If not, I'll apply once I've finished my test runs:-)

//Magnus
diff --git a/src/port/path.c b/src/port/path.c
index 708306d..d7bd353 100644
--- a/src/port/path.c
+++ b/src/port/path.c
@@ -427,7 +427,12 @@ dir_strcmp(const char *s1, const char *s2)
 {
     while (*s1 && *s2)
     {
+#ifndef WIN32
         if (*s1 != *s2 &&
+#else
+            /* On windows, paths are case-insensitive */
+        if (tolower(*s1) != tolower(*s2) &&
+#endif
             !(IS_DIR_SEP(*s1) && IS_DIR_SEP(*s2)))
             return (int) *s1 - (int) *s2;
         s1++, s2++;

Re: Path case sensitivity on windows

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> Attached patch fixes this in my testcase. Can anybody spot something
> wrong with it?

It depends on tolower(), which is going to have LC_CTYPE-dependent
behavior, which is surely wrong?
        regards, tom lane


Re: Path case sensitivity on windows

From
Magnus Hagander
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> Attached patch fixes this in my testcase. Can anybody spot something
>> wrong with it?
> 
> It depends on tolower(), which is going to have LC_CTYPE-dependent
> behavior, which is surely wrong?

Not sure, really :) That's the encoding we'd get the paths in in the
first place, is it not?

Or are you just saying we should be using pg_tolower()?  (which I forgot
about yet again)

//Magnus



Re: Path case sensitivity on windows

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> Tom Lane wrote:
>> It depends on tolower(), which is going to have LC_CTYPE-dependent
>> behavior, which is surely wrong?

> Or are you just saying we should be using pg_tolower()?  (which I forgot
> about yet again)

Well, I'd be happier with pg_tolower, because I know what it does.
But the real question here is what does "case insensitivity" on
file names actually mean in Windows --- ie, what happens to non-ASCII
letters?
        regards, tom lane


Re: Path case sensitivity on windows

From
Magnus Hagander
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> Tom Lane wrote:
>>> It depends on tolower(), which is going to have LC_CTYPE-dependent
>>> behavior, which is surely wrong?
> 
>> Or are you just saying we should be using pg_tolower()?  (which I forgot
>> about yet again)
> 
> Well, I'd be happier with pg_tolower, because I know what it does.
> But the real question here is what does "case insensitivity" on
> file names actually mean in Windows --- ie, what happens to non-ASCII
> letters?

The filesystem itself is UTF-16. I would assume the "system default"
locale controls the case insensitivity, but I'm not sure about that.

Reading up some, it seems the collation is actually stored in a hidden
file on the NTFS volume... It seems to differ between different versions
of windows from what I can tell, but since this is written to the fs,
it's ok.

I have not found a way to actually *get* the locale, or even to compare
two filenames. There is a function called GetFullPathName(), but I'm not
sure how to use it for this.

However, I don't think it's really critical that we deal with all corner
cases for this. It's not likely that the user would be using any really
weird locale-specific combinations *differently* in the PATH variable vs
the commandline, or something like that...

And this only shows up when the binary is found in the PATH and not
through a fully specified directory. This is, AFAICT, the only case
where they can differ. This is the reason why we haven't had any reports
of this before - nobody using the installer, or doing even a "normal
style" install would ever end up in this situation.

//Magnus



Re: Path case sensitivity on windows

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> And this only shows up when the binary is found in the PATH and not
> through a fully specified directory. This is, AFAICT, the only case
> where they can differ. This is the reason why we haven't had any reports
> of this before - nobody using the installer, or doing even a "normal
> style" install would ever end up in this situation.

Hmm.  Well, if we use pg_tolower then it will only do the right thing
for ASCII letters, but it seems like non-ASCII in the path leading to
the postgres binaries would be pretty dang unusual.  (And I am not
convinced tolower() would get it right either --- it certainly won't
if the encoding is multibyte.)

On balance I'd suggest just using pg_tolower and figuring it's close
enough.
        regards, tom lane
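
For readers following along, here is a rough, standalone sketch of what the suggestion above could look like: the comparison loop from the patch, but folding case with an ASCII-only helper instead of tolower(). The names ascii_tolower() and dir_strcmp_ci() are hypothetical, IS_DIR_SEP() is redefined locally as a stand-in for the real macro, and the length checks after the loop are an assumption about the part of the function the diff does not show. This is not the committed fix, and ascii_tolower() is not a copy of pg_tolower().

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for IS_DIR_SEP(): on Windows both '/' and '\\' separate directories. */
#define IS_DIR_SEP(ch) ((ch) == '/' || (ch) == '\\')

/*
 * Hypothetical helper: fold only ASCII 'A'..'Z' to lower case.
 * Unlike tolower(), the result never depends on LC_CTYPE, and bytes
 * with the high bit set are returned unchanged.
 */
unsigned char
ascii_tolower(unsigned char ch)
{
    if (ch >= 'A' && ch <= 'Z')
        ch += 'a' - 'A';
    return ch;
}

/*
 * Hypothetical sketch of a case-insensitive directory comparison.
 * Any two directory separators compare equal; ASCII letters compare
 * without regard to case. Returns 0 if the paths match, otherwise a
 * negative or positive value.
 */
int
dir_strcmp_ci(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);

    while (*s1 && *s2)
    {
        unsigned char c1 = ascii_tolower((unsigned char) *s1);
        unsigned char c2 = ascii_tolower((unsigned char) *s2);

        /* Report the folded difference, so the sign agrees with the test */
        if (c1 != c2 && !(IS_DIR_SEP(*s1) && IS_DIR_SEP(*s2)))
            return (int) c1 - (int) c2;
        s1++;
        s2++;
    }
    if (*s1)
        return 1;               /* s1 is longer (assumed tail logic) */
    if (*s2)
        return -1;              /* s2 is longer (assumed tail logic) */
    return 0;
}
```

With this sketch, dir_strcmp_ci("C:\\PgSQL\\bin", "c:/pgsql/BIN") returns 0. Two paths that differ only in the case of a non-ASCII letter still compare as different, which is the "close enough" trade-off being discussed here.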


Re: Path case sensitivity on windows

From
Peter Eisentraut
Date:
On Thursday 02 April 2009 18:29:45 Tom Lane wrote:
> Hmm.  Well, if we use pg_tolower then it will only do the right thing
> for ASCII letters, but it seems like non-ASCII in the path leading to
> the postgres binaries would be pretty dang unusual.

Well, Windows localizes the directory names like C:\Program Files, so it is 
entirely plausible to have non-ASCII path names across the board in certain 
locales.