Re: BRIN minmax multi - incorrect distance for infinite timestamp/date
От | Tomas Vondra |
---|---|
Тема | Re: BRIN minmax multi - incorrect distance for infinite timestamp/date |
Дата | |
Msg-id | 14c205c6-fbf4-d2f2-3ea8-a7d5c5d63180@enterprisedb.com обсуждение исходный текст |
Ответ на | Re: BRIN minmax multi - incorrect distance for infinite timestamp/date (Dean Rasheed <dean.a.rasheed@gmail.com>) |
Ответы |
Re: BRIN minmax multi - incorrect distance for infinite timestamp/date
|
Список | pgsql-hackers |
On 10/13/23 11:21, Dean Rasheed wrote: > On Thu, 12 Oct 2023 at 23:43, Tomas Vondra > <tomas.vondra@enterprisedb.com> wrote: >> >> Ashutosh Bapat reported me off-list a possible issue in how BRIN >> minmax-multi calculate distance for infinite timestamp/date values. >> >> The current code does this: >> >> if (TIMESTAMP_NOT_FINITE(dt1) || TIMESTAMP_NOT_FINITE(dt2)) >> PG_RETURN_FLOAT8(0); >> > > Yes indeed, that looks wrong. I noticed the same thing while reviewing > the infinite interval patch. > >> so means infinite values are "very close" to any other value, and thus >> likely to be merged into a summary range. That's exactly the opposite of >> what we want to do, possibly resulting in inefficient indexes. >> > > Is this only inefficient? Or can it also lead to wrong query results? > I don't think it can produce incorrect results. It only affects which values we "merge" into an interval when building the summaries. >> Attached is a patch fixing this >> > > I wonder if it's actually necessary to give infinity any special > handling at all for dates and timestamps. For those types, "infinity" > is actually just INT_MIN/MAX, which compares correctly with any finite > value, and will be much larger/smaller than any common value, so it > seems like it isn't necessary to give "infinite" values any special > treatment. That would be consistent with date_cmp() and > timestamp_cmp(). > Right, but .... > Something else that looks wrong about that BRIN code is that the > integer subtraction might lead to overflow -- it is subtracting two > integer values, then casting the result to float8. It should cast each > input before subtracting, more like brin_minmax_multi_distance_int8(). > ... it also needs to fix this, otherwise it overflows. Consider delta = dt2 - dt1; and assume dt1 is INT64_MIN, or that dt2 is INT64_MAX. > IOW, I think brin_minmax_multi_distance_date/timestamp could be made > basically identical to brin_minmax_multi_distance_int4/8. > Right. Attached is a patch doing it this way. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Вложения
В списке pgsql-hackers по дате отправления: