Collation order for btree-indexable datatypes

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Collation order for btree-indexable datatypes
Дата
Msg-id 16792.988839483@sss.pgh.pa.us
обсуждение исходный текст
Ответы Re: Collation order for btree-indexable datatypes  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Re: Collation order for btree-indexable datatypes  (Bruce Momjian <pgman@candle.pha.pa.us>)
Список pgsql-hackers
To avoid getting into states where a btree index is corrupt (or appears
that way), it is absolutely critical that the datatype provide a unique,
consistent sort order.  In particular, the operators = <> < <= > >= had
better all agree with each other and with the 3-way-comparison support
function about the ordering of any two non-NULL data values.

After tracing some Assert failures in the new planner statistics code
I'm working on, I have realized that several of our existing datatypes
fail to meet this fundamental requirement, and therefore are prone to
serious misbehavior when trying to index "weird" values.  In particular,
type NUMERIC does not return consistent results for comparisons
involving "NaN" values, and several of the date/time types do not return
consistent results for comparisons involving "INVALID" values.
(Example: numeric_cmp will assert that two NaNs are equal, whereas
numeric_eq will assert that they aren't.  Worse, numeric_cmp will assert
that a NaN is equal to any non-NaN, too.  The date/time routines avoid
the latter mistake but make the former one.)

I am planning to fix this by ensuring that all these operations agree
on an (arbitrarily chosen) sort order for the "weird" values of these
types.  What I'm wondering about is whether to insert the fixes into
7.1.1 or wait for 7.2.  In theory changing the sort order might break
existing user indexes, and should therefore be avoided until an initdb
is needed.  But: any indexes that contain these values are likely broken
already, since in fact we don't have a well-defined sort order right now
for these values.

A closely related problem is that the "current time" special value
supported by several of the date/time datatypes is inherently not
compatible with being indexed, since its sort order relative to
ordinary time values keeps changing.  We had discussed removing this
special case, and I think agreed to do so, but it hasn't happened yet.

What I'm inclined to do is force consistency of the comparison operators
now (for 7.1.1) and then remove "current time" for 7.2, but perhaps it'd
be better to leave the whole can of worms alone until 7.2.  Comments
anyone?
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: New Linux xfs/reiser file systems
Следующее
От: Hannu Krosing
Дата:
Сообщение: Re: v7.1 error ... SELECT converted to a COPY?