Re: Locale timings
From | Michael Tiemann |
---|---|
Subject | Re: Locale timings |
Date | |
Msg-id | 3C029AEB.2030902@redhat.com |
In response to | Locale timings (Peter Eisentraut <peter_e@gmx.net>) |
List | pgsql-hackers |
This is a common way of doing things inside glibc, and the happy result
is that if you really want to build a non-locale-aware system, you can
use a compile-time option that replaces the "locale_is_not_C" test with
a constant. It makes for more maintainable code because there's less
chance for bitrot in the usual case.

M

Peter Eisentraut wrote:

> I did some "benchmarks" to check whether --enable-locale with LC_ALL=C is
> just as fast as --disable-locale, to possibly justify making locale
> support the default. This test only covers locale-aware comparisons,
> which seems to be the critical aspect for all intents and purposes.
>
> I loaded a table of a single text column with 454240 rows of English
> words. The table had a size of 21.5 MB. The values were explicitly
> de-sorted, but the order was the same across all test runs. Then I ran
> SELECT * FROM test ORDER BY 1; and timed the wall-clock response time a
> few times. All configuration parameters were left at the default.
>
> The averaged results follow. Some logarithmic buffering cleverness
> appeared to surface, but the results are still distinct enough to be
> useful.
>
> no locale: 58s
> locale=C: 78s (ca. 33% slower)
> locale=en_US: 118s (ca. 100% slower)
>
> This confused me, because in my C library a strcoll() call with locale=C
> is handed to strcmp() quite directly. A look into varlena.c:varstr_cmp()
> shows that the locale-aware path does some extra copying because there is
> no strncoll() function we can use with non-terminated strings.
>
> For testing's sake I replaced the two palloc() calls in that function with
> alloca(), which is presumably the fastest possible memory allocator.
> Result:
>
> locale=C,alloca: 67s (ca. 15% slower)
>
> This shows that we're wasting quite a bit of time allocating memory --
> probably not only in this place. I'm pretty sure that the majority of the
> rest of the gap comes from the memcpy() operations. Not that there's a
> whole lot we can do about either of these things.
>
> However, I feel that we could reasonably cope with this situation by
> replacing
>
> #ifdef USE_LOCALE
> /* locale-aware code */
> #else
> /* non-locale code */
> #endif
>
> with
>
> if (locale_is_not_C)
> {
>     /* locale-aware code */
> }
> else
> {
>     /* non-locale code */
> }
>
> This practice should have minuscule impact, and it's probably the plan for
> the multibyte side of things as well.
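
For illustration only, here is a minimal sketch of the pattern discussed in this message: a run-time "locale_is_not_C" test that a compile-time option can replace with a constant. It is not the PostgreSQL source. The names `FORCE_C_LOCALE` and `counted_cmp()` are hypothetical, `locale_is_not_C` is written as a function rather than the variable shown in the quoted pseudocode, and `malloc()` stands in for `palloc()`. The locale-aware branch copies both arguments because `strcoll()` needs NUL-terminated strings, which is the extra copying the quoted text describes.

```c
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical build switch: with -DFORCE_C_LOCALE the test becomes a
 * constant, so the compiler can drop the locale-aware branch entirely.
 */
#ifdef FORCE_C_LOCALE
#define locale_is_not_C() (false)
#else
static bool
locale_is_not_C(void)
{
    /* Passing NULL only queries the current LC_COLLATE setting. */
    const char *name = setlocale(LC_COLLATE, NULL);

    return name != NULL
        && strcmp(name, "C") != 0
        && strcmp(name, "POSIX") != 0;
}
#endif

/* Compare two counted strings that are not NUL-terminated. */
static int
counted_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
    if (locale_is_not_C())
    {
        /* locale-aware code: strcoll() needs terminated copies */
        char *a0 = malloc(alen + 1);
        char *b0 = malloc(blen + 1);
        int   result;

        if (a0 == NULL || b0 == NULL)
        {
            free(a0);
            free(b0);
            abort();
        }
        memcpy(a0, a, alen);
        a0[alen] = '\0';
        memcpy(b0, b, blen);
        b0[blen] = '\0';

        result = strcoll(a0, b0);

        free(a0);
        free(b0);
        return result;
    }
    else
    {
        /* non-locale code: plain byte comparison, no copying */
        size_t n = (alen < blen) ? alen : blen;
        int    result = memcmp(a, b, n);

        if (result == 0 && alen != blen)
            result = (alen < blen) ? -1 : 1;
        return result;
    }
}
```

Both branches are always compiled in this arrangement, so neither can silently rot, and a build that never wants locale support can define the switch and pay nothing for the test.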