Обсуждение: Fix for edge case in date_bin() function

Поиск
Список
Период
Сортировка

Fix for edge case in date_bin() function

От
Moaaz Assali
Дата:
Hello,

The date_bin() function has a bug where it returns an incorrect binned date when both of the following are true:
1) the origin timestamp is before the source timestamp
2) the origin timestamp is exactly equivalent to some valid binned date in the set of binned dates that date_bin() can return given a specific stride and source timestamp.

For example, consider the following function call:
date_bin('30 minutes'::interval, '2024-01-01 15:00:00'::timestamp, '2024-01-01 17:00:00'::timestamp);

This function call will return '2024-01-01 14:30:00' instead of '2024-01-01 15:00:00' despite '2024-01-01 15:00:00' being the valid binned date for the timestamp '2024-01-01 15:00:00'. This commit fixes that by editing the timestamp_bin() function in timestamp.c file.

The reason this happens is that the code in timestamp_bin() that allows for correct date binning when source timestamp < origin timestamp subtracts one stride in all cases.
However, that is not valid for this case when the source timestamp is exactly equivalent to a valid binned date as in the example mentioned above.

To account for this edge, we simply add another condition in the if statement to not perform the subtraction by one stride interval if the time difference is divisible by the stride.

Best regards,
Moaaz Assali
Вложения

Re: Fix for edge case in date_bin() function

От
Daniel Gustafsson
Дата:
> On 27 Feb 2024, at 09:42, Moaaz Assali <ma5679@nyu.edu> wrote:

> To account for this edge, we simply add another condition in the if statement to not perform the subtraction by one
strideinterval if the time difference is divisible by the stride. 

I only skimmed the patch, but I recommend adding a testcase to it to keep the
regression from reappearing.  src/test/regress/sql/timestamp.sql might be a
good candidate testsuite.

--
Daniel Gustafsson




Re: Fix for edge case in date_bin() function

От
Moaaz Assali
Дата:
Hello Daniel,

I have added a test case for this in timestamp.sql and timestamp.out, and tests pass when using the bug fix patch in the first email.
I have attached a new patch in this email below with the new tests only (doesn't include the bug fix).

P.S. I forgot to CC the mailing list in my previous reply. This is just a copy of it.

Best regards,
Moaaz Assali

On Tue, Feb 27, 2024 at 12:48 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> On 27 Feb 2024, at 09:42, Moaaz Assali <ma5679@nyu.edu> wrote:

> To account for this edge, we simply add another condition in the if statement to not perform the subtraction by one stride interval if the time difference is divisible by the stride.

I only skimmed the patch, but I recommend adding a testcase to it to keep the
regression from reappearing.  src/test/regress/sql/timestamp.sql might be a
good candidate testsuite.

--
Daniel Gustafsson

Вложения

Re: Fix for edge case in date_bin() function

От
Tom Lane
Дата:
Moaaz Assali <ma5679@nyu.edu> writes:
> The date_bin() function has a bug where it returns an incorrect binned date
> when both of the following are true:
> 1) the origin timestamp is before the source timestamp
> 2) the origin timestamp is exactly equivalent to some valid binned date in
> the set of binned dates that date_bin() can return given a specific stride
> and source timestamp.

Hmm, yeah.  The "stride_usecs > 1" test seems like it's a partial
attempt to account for this that is probably redundant given the
additional condition.  Also, can we avoid computing tm_diff %
stride_usecs twice?  Maybe the compiler is smart enough to remove the
common subexpression, but it's a mighty expensive computation if not.

I'm also itching a bit over whether there are integer-overflow
hazards here.  Maybe the range of timestamp is constrained enough
that there aren't, but I didn't look hard.

Also, whatever we do here, surely timestamptz_bin() has the
same problem(s).

            regards, tom lane



Re: Fix for edge case in date_bin() function

От
Tom Lane
Дата:
I wrote:
> Hmm, yeah.  The "stride_usecs > 1" test seems like it's a partial
> attempt to account for this that is probably redundant given the
> additional condition.  Also, can we avoid computing tm_diff %
> stride_usecs twice?  Maybe the compiler is smart enough to remove the
> common subexpression, but it's a mighty expensive computation if not.

I think we could do it like this:

    tm_diff = timestamp - origin;
    tm_modulo = tm_diff % stride_usecs;
    tm_delta = tm_diff - tm_modulo;
    /* We want to round towards -infinity when tm_diff is negative */
    if (tm_modulo < 0)
        tm_delta -= stride_usecs;

Excluding tm_modulo == 0 from the correction is what fixes the
problem.

> I'm also itching a bit over whether there are integer-overflow
> hazards here.  Maybe the range of timestamp is constrained enough
> that there aren't, but I didn't look hard.

Hmm, it's not.  For instance this triggers the overflow check in
timestamp_mi:

regression=# select '294000-01-01'::timestamp - '4714-11-24 00:00:00 BC';
ERROR:  interval out of range
regression=# \errverbose 
ERROR:  22008: interval out of range
LOCATION:  timestamp_mi, timestamp.c:2832

So we ought to guard the subtraction that produces tm_diff similarly.
I suspect it's also possible to overflow int64 while computing
stride_usecs.

            regards, tom lane



Re: Fix for edge case in date_bin() function

От
Moaaz Assali
Дата:
Yeah you are right about the integer overflow.

Consider this query: select date_bin('15 minutes'::interval, timestamp '294276-12-30 10:24:00', timestamp '4000-12-20 23:00:00 BC');

It will return 294276-12-30 10:31:49.551616 when it should be 294276-12-30 10:15:00, which happens because source timestamp is close to INT64_MAX and origin timestamp is negative, causing an integer overflow.
So, the subsequent calculations are wrong.

I fixed the integer overflow and original bug and added test cases in my 3 patches in this reply below:

v2-0001: 
- Fixed both the original bug discussed and the integer overflow issues (and used your suggestion to store the modulo result).
- Any timestamp used will output the correct date_bin() result.
- I have used INT64 -> UINT64 mapping in order to ensure no integer overflows are possible.
- The only additional cost is 3 subtractions/addition to do the INT64 -> UINT64 mapping for timestamp, origin and final result.
- It is probably possible to tackle the integer overflow issue without casting but, from my attempts, it seemed much more convoluted and complex.
- Implemented the fix for both timestamp_bin() and timestamptz_bin().

v2-0002: 
- Added multiple test cases in timestamp.sql for integer overflow by testing with timestamps around INT64_MIN and INT64_MAX.
- Added test case for the original bug where source timestamp is a valid binned date already and does not require the additional stride interval subtraction.

v2-0003:
- Exactly the same as v2-0002 but for timestamptz.sql


Also, I would like to create a new patch on the 2024-03 commitfest, but since I just created my account yesterday I get this error:
"The site you are trying to log in to (commitfest.postgresql.org) requires a cool-off period between account creation and logging in. You have not passed the cool off period yet."

How long is the cool off period, so that I can create a new patch in the commitfest before submissions close after tomorrow.
Alternatively, is it possible for someone to open a new patch on my behalf linking this email thread, so it can be added to the 2024-03 commitfest?


Best regards,
Moaaz Assali


On Tue, Feb 27, 2024 at 11:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> Hmm, yeah.  The "stride_usecs > 1" test seems like it's a partial
> attempt to account for this that is probably redundant given the
> additional condition.  Also, can we avoid computing tm_diff %
> stride_usecs twice?  Maybe the compiler is smart enough to remove the
> common subexpression, but it's a mighty expensive computation if not.

I think we could do it like this:

    tm_diff = timestamp - origin;
    tm_modulo = tm_diff % stride_usecs;
    tm_delta = tm_diff - tm_modulo;
    /* We want to round towards -infinity when tm_diff is negative */
    if (tm_modulo < 0)
        tm_delta -= stride_usecs;

Excluding tm_modulo == 0 from the correction is what fixes the
problem.

> I'm also itching a bit over whether there are integer-overflow
> hazards here.  Maybe the range of timestamp is constrained enough
> that there aren't, but I didn't look hard.

Hmm, it's not.  For instance this triggers the overflow check in
timestamp_mi:

regression=# select '294000-01-01'::timestamp - '4714-11-24 00:00:00 BC';
ERROR:  interval out of range
regression=# \errverbose
ERROR:  22008: interval out of range
LOCATION:  timestamp_mi, timestamp.c:2832

So we ought to guard the subtraction that produces tm_diff similarly.
I suspect it's also possible to overflow int64 while computing
stride_usecs.

                        regards, tom lane
Вложения

Re: Fix for edge case in date_bin() function

От
Tom Lane
Дата:
Moaaz Assali <ma5679@nyu.edu> writes:
> - I have used INT64 -> UINT64 mapping in order to ensure no integer
> overflows are possible.

I don't think I trust this idea, and in any case it doesn't remove
all the overflow hazards: the reduction of the stride interval to
microseconds can overflow, and the final subtraction of the stride
can too.  I changed it to just do the straightforward thing
(throwing error if the pg_xxx_s64_overflow routines detect error),
and pushed it.  Thanks for the report and patch!

            regards, tom lane



Re: Fix for edge case in date_bin() function

От
Moaaz Assali
Дата:
Hello Tom,

Thanks for the quick patch!

You're right. The stride_usecs calculation, tm_delta += ustride_usecs, and final result calculations can overflow and need a guard.

However, I don't see the issue with the INT64 -> UINT64 mapping. The current implementation results in integer overflows (errors instead after the recent patch) even for valid timestamps where the result of date_bin() is also another valid timestamp. 

On the other hand, the INT64 -> UINT64 mapping solves this issue and allows the input of any valid source and origin timestamps as long as the stride chosen doesn't output invalid timestamps that cannot be represented by Timestamp(tz) type anyways. Since all INT64 values can be mapped 1-to-1 in UINT64, I don't see where the problem is.

Best regards,
Moaaz Assali

Re: Fix for edge case in date_bin() function

От
Tom Lane
Дата:
Moaaz Assali <ma5679@nyu.edu> writes:
> However, I don't see the issue with the INT64 -> UINT64 mapping. The
> current implementation results in integer overflows (errors instead after
> the recent patch) even for valid timestamps where the result of date_bin()
> is also another valid timestamp.

> On the other hand, the INT64 -> UINT64 mapping solves this issue and allows
> the input of any valid source and origin timestamps as long as the stride
> chosen doesn't output invalid timestamps that cannot be represented by
> Timestamp(tz) type anyways. Since all INT64 values can be mapped 1-to-1 in
> UINT64, I don't see where the problem is.

What I don't like about it is that it's complicated (and you didn't
make any effort whatsoever to make the code intelligible or self-
documenting), and that complication has zero real-world benefit.
The only way to hit an overflow in this subtraction is with dates
well beyond 200000 AD.  If you are actually dealing with such dates
(maybe you're an astronomer or a geologist), then timestamp[tz] isn't
the data type for you, because you probably need orders of magnitude
wider range than it's got.

Now I'll freely admit that the pg_xxx_yyy_overflow() functions are
notationally klugy, but they're well documented and they're something
that people would need to understand anyway for a lot of other places
in Postgres.  So I think there's less cognitive load for readers of
the code in the let's-throw-an-error approach than in writing one-off
magic code that in the end can avoid only one of the three possible
overflow cases in this function.

            regards, tom lane