Re: More work on SortSupport for text - strcoll() and strxfrm() caching

From: Peter Geoghegan
Subject: Re: More work on SortSupport for text - strcoll() and strxfrm() caching
Date:
Msg-id: CAM3SWZRWrsGx80kO1f5utWiPhB=rbzE_9ONqCHfgSpUXhy8nJA@mail.gmail.com
In reply to: Re: More work on SortSupport for text - strcoll() and strxfrm() caching  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: More work on SortSupport for text - strcoll() and strxfrm() caching  (Peter Geoghegan <pg@heroku.com>)
List: pgsql-hackers
On Tue, Aug 4, 2015 at 12:41 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Interesting work.

Thanks.

> 1. My biggest gripe with this patch is that the comments are not easy
> to understand.

> Of course everybody may prefer something different here; I'm just
> telling you what I think.

I have struggled to put just the right amount of exposition on the
theory behind a particular optimization into source code comments, and
things like that. Since no one is telling me that I need to write
more, clearly I don't have the balance right yet. To a certain extent,
it is a matter of personal style, but I'll try to be more terse.

> 2. I believe the change to bttextcmp_abbrev() should be pulled out
> into a separate patch and committed separately.  That part seems like
> a slam dunk.

Makes sense.

> 3. What is the worst case for the strxfrm()-reuse stuff?  I suppose
> it's the case where we have many strings that are long, all
> equal-length, and all different, but only in the last few characters.
> Then the memcmp() is as expensive as possible but never works out.
> How does the patch hold up in that case?

I haven't tested it. I'll get around to it at some point in the next
couple of weeks. I imagine that the cost is about the same as for the
memcmp() opportunistic equality pre-check, because of factors like
speculative execution, and the fact that we need both strings in cache
anyway. The one difference is that, unlike the memcmp() pre-check,
this check happens only n times, not n log n times.
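
For reference, the memcmp() opportunistic equality pre-check can be sketched roughly as below. This is only an illustration, not code from the patch: the name cmp_with_precheck() is hypothetical, it assumes NUL-terminated strings whose lengths are already known, and the strcmp() tie-break is an assumption about how ties should be resolved. Because the check lives in the comparator, it runs once per comparison, which is where the n log n comes from.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical comparator with an opportunistic equality pre-check.
 * Both arguments must be NUL-terminated; alen/blen exclude the NUL.
 */
int
cmp_with_precheck(const char *a, size_t alen, const char *b, size_t blen)
{
    int result;

    /* Cheap length test first; memcmp() only runs for equal-length strings. */
    if (alen == blen && memcmp(a, b, alen) == 0)
        return 0;

    /* Not bytewise equal: pay for the locale-aware comparison. */
    result = strcoll(a, b);

    /* Assumption: break strcoll() ties bytewise, so only identical strings compare equal. */
    if (result == 0)
        result = strcmp(a, b);

    return result;
}
```

In the worst case described above (long, equal-length strings that differ only near the end), the length test passes every time and memcmp() scans almost the whole string before failing.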

I'm quite sure that the cost needs to be virtually zero to go ahead
with the idea. I think it probably is. Note that, as with the memcmp()
pre-check, we compare string lengths first, before calling memcmp().
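
To make the length-then-memcmp() ordering concrete, here is a minimal sketch of a one-entry strxfrm() reuse cache. It is not the patch's actual code: the XfrmCache struct, cached_strxfrm(), and the hits counter are hypothetical names for illustration, and it assumes NUL-terminated input and plain malloc-family allocation rather than memory contexts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical one-entry cache of the last string passed to strxfrm(). */
typedef struct XfrmCache
{
    char   *last_str;   /* copy of the most recently transformed string */
    size_t  last_len;   /* its length, excluding the terminating NUL */
    char   *last_blob;  /* strxfrm() output for last_str */
    size_t  blob_cap;   /* allocated size of last_blob */
    long    hits;       /* number of times the cached blob was reused */
} XfrmCache;

/* Length is compared first, so memcmp() runs only for equal-length strings. */
static bool
cache_matches(const XfrmCache *c, const char *s, size_t len)
{
    return c->last_str != NULL &&
           c->last_len == len &&
           memcmp(c->last_str, s, len) == 0;
}

/* Returns the transformed form of s, or NULL on allocation failure. */
const char *
cached_strxfrm(XfrmCache *c, const char *s)
{
    size_t  len;
    size_t  need;
    char   *tmp;

    assert(c != NULL && s != NULL);
    len = strlen(s);

    if (cache_matches(c, s, len))
    {
        c->hits++;
        return c->last_blob;    /* reuse: no strxfrm() call at all */
    }

    /* Ask strxfrm() how much space the transformed string needs. */
    need = strxfrm(NULL, s, 0) + 1;
    if (need > c->blob_cap)
    {
        tmp = realloc(c->last_blob, need);
        if (tmp == NULL)
            return NULL;        /* cache is left unchanged */
        c->last_blob = tmp;
        c->blob_cap = need;
    }
    strxfrm(c->last_blob, s, need);

    /* Remember the source string so the next call can be checked against it. */
    tmp = realloc(c->last_str, len + 1);
    if (tmp == NULL)
    {
        free(c->last_str);      /* invalidate rather than keep a stale key */
        c->last_str = NULL;
        return NULL;
    }
    memcpy(tmp, s, len + 1);
    c->last_str = tmp;
    c->last_len = len;
    return c->last_blob;
}
```

A zero-initialized cache (XfrmCache c = {0};) is a valid starting state. The check in cache_matches() runs once per input string, so even when the lengths match and memcmp() fails late, the extra work is bounded by one pass over each string.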

-- 
Peter Geoghegan


