Обсуждение: Periods

Поиск
Список
Период
Сортировка

Periods

От
Vik Fearing
Дата:
SQL:2011 introduced the concept of a "period". It takes two existing columns and basically does the same thing as our range types except there is no new storage.  I believe if Jeff Davis had given us range types a few years later than he did, it would have been using periods.

Attached is a WIP patch that I have been working on.  The only thing left is completing periods on inherited tables, and finishing up pg_dump.  I'm posting this now just to make sure my basic foundation is sound, and to let people know that I'm working on this.

The patch itself doesn't have any functionality, it just allows periods to be defined.  With that, there are several things that we can do: SYSTEM_TIME periods, which are explicitly not allowed by this patch, will allow us to do SQL standard versioned tables, and also allows some time travel functionality.  Application periods can be used in PRIMARY/UNIQUE keys, foreign keys, and give nice new features to UPDATE and DELETE.  They also allow "period predicates" which are the same kind of operations we already have for range types.  All of that is for future patches that build on the infrastructure presented in this patch.

The SQL standard restricts period columns to dates or timestamps, but I'm allowing anything that has a btree operator class, as is the PostgreSQL way. System periods, once allowed, will only be timestamptz though.  Unfortunately, I had to fully reserve the word PERIOD for this.

I'm looking for comments on everything except the pg_dump stuff, keeping in mind that inheritance is not finished either.

Thanks!



This is patch is based off of 71b349aef4.
Вложения

Re: Periods

От
Pavel Stehule
Дата:


2018-05-26 22:56 GMT+02:00 Vik Fearing <vik.fearing@protonmail.com>:
SQL:2011 introduced the concept of a "period". It takes two existing columns and basically does the same thing as our range types except there is no new storage.  I believe if Jeff Davis had given us range types a few years later than he did, it would have been using periods.

Attached is a WIP patch that I have been working on.  The only thing left is completing periods on inherited tables, and finishing up pg_dump.  I'm posting this now just to make sure my basic foundation is sound, and to let people know that I'm working on this.

The patch itself doesn't have any functionality, it just allows periods to be defined.  With that, there are several things that we can do: SYSTEM_TIME periods, which are explicitly not allowed by this patch, will allow us to do SQL standard versioned tables, and also allows some time travel functionality.  Application periods can be used in PRIMARY/UNIQUE keys, foreign keys, and give nice new features to UPDATE and DELETE.  They also allow "period predicates" which are the same kind of operations we already have for range types.  All of that is for future patches that build on the infrastructure presented in this patch.

The SQL standard restricts period columns to dates or timestamps, but I'm allowing anything that has a btree operator class, as is the PostgreSQL way. System periods, once allowed, will only be timestamptz though.  Unfortunately, I had to fully reserve the word PERIOD for this.

I'm looking for comments on everything except the pg_dump stuff, keeping in mind that inheritance is not finished either.

Thanks!

looks interesting

Regards

Pavel




This is patch is based off of 71b349aef4.

Re: Periods

От
Paul A Jungwirth
Дата:
On Sat, May 26, 2018 at 1:56 PM, Vik Fearing <vik.fearing@protonmail.com> wrote:
> SQL:2011 introduced the concept of a "period". It takes two existing columns
> and basically does the same thing as our range types except there is no new
> storage.  I believe if Jeff Davis had given us range types a few years later
> than he did, it would have been using periods.

Hi Vik, I'm really glad to see someone working on temporal features!
I've only dabbled in Postgres hacking, but I've been following
temporal database research for several years, so I hope you won't mind
my comments. I already shared some thoughts on this other thread:

http://www.postgresql-archive.org/SQL-2011-Valid-Time-Support-td6020221.html

I would love to see Postgres support the standard but *also* let
people use range types. I'm not sure I agree that Jeff Davis would
have preferred the SQL:2011 period idea, which is an extra-relational
concept. Since it is attached to a table, it doesn't carry through
cleanly to a result set, so what happens if you want to apply temporal
features to a view, subquery, CTE, or set-returning function? A range
on the other hand is just a type, so as long as temporal operators
support that type, everything still composes nicely and just works.
The Date/Darwen/Lorenztos book has more to say about that, and I think
it's worth reading. They are unrealistically extreme in their purism,
but here I think they have some good points---points they also raised
against an earlier proposed temporal-SQL standard circa 1998. By the
way here are some thoughts Jeff shared with me about that book, which
he says inspired range types:

https://news.ycombinator.com/item?id=14738655

I understand that your patch is just to allow defining periods, but I
thought I'd share my own hopes earlier rather than later, in case you
are doing more work on this. Also, it might be nice if Postgres let
you also define periods from a single range column, so that people who
want to use intervals can still stick closer to the standard---I
dunno, just an idea.

Also, this may not be very helpful, but I started an extension to
support temporal foreign keys here:

https://github.com/pjungwir/time_for_keys

It uses intervals, not periods, but maybe you can steal some ideas.
:-) I have a half-finished branch porting it from plpgsql to C, so
that I could give them more catalog integration, and also I have hopes
of defining temporal primary keys, although that isn't implemented
yet. Anyway, I mention it because you said, "Application periods can
be used in PRIMARY/UNIQUE keys, foreign keys," but feel free to ignore
it. :-)

In general, I would love Postgres to have some lower-level primitives
like range types and the Dingös operators, and then build the
SQL:2011 support on top of those. I'm happy to contribute work to help
make that happen, although I'd probably need to work with someone with
more Postgres hacking experience to get it done.

Yours,
Paul


Re: Periods

От
Paul A Jungwirth
Дата:
On Tue, Jun 5, 2018 at 12:47 PM, Paul A Jungwirth
<pj@illuminatedcomputing.com> wrote:
> Also, this may not be very helpful, but I started an extension to
> support temporal foreign keys here:
>
> https://github.com/pjungwir/time_for_keys
>
> It uses intervals, not periods, but maybe you can steal some ideas.

Sorry for two emails but I wanted to add: the more stealable thing are
the tests, which are pretty thorough and took a lot of hours to write.
They are yours if you want them. :-)

Paul


Re: Periods

От
Vik Fearing
Дата:



On June 5, 2018 9:47 PM, Paul A Jungwirth pj@illuminatedcomputing.com wrote:

On Sat, May 26, 2018 at 1:56 PM, Vik Fearing vik.fearing@protonmail.com wrote:
SQL:2011 introduced the concept of a "period". It takes two existing columns
and basically does the same thing as our range types except there is no new
storage. I believe if Jeff Davis had given us range types a few years later
than he did, it would have been using periods.
Hi Vik, I'm really glad to see someone working on temporal features!
I've only dabbled in Postgres hacking, but I've been following
temporal database research for several years, so I hope you won't mind
my comments. I already shared some thoughts on this other thread:

Hi! No, of course I don't mind your comments; I welcome them. I had not seen that thread so I'll go take a look at it.

I would love to see Postgres support the standard but also let
people use range types. I'm not sure I agree that Jeff Davis would
have preferred the SQL:2011 period idea, which is an extra-relational
concept. Since it is attached to a table, it doesn't carry through
cleanly to a result set, so what happens if you want to apply temporal
features to a view, subquery, CTE, or set-returning function?

As far as I can see, the standard doesn't say what should happen if you select a period, or even if that's possible.  It does however define how to create a period not attached to a table (PERIOD <left paren> <period start value> <comma> <period end value> <right paren>) so it would be possible to use that for views, subqueries, and the rest of your examples.

A range on the other hand is just a type, so as long as temporal operators
support that type, everything still composes nicely and just works.
The Date/Darwen/Lorenztos book has more to say about that, and I think
it's worth reading. They are unrealistically extreme in their purism,
but here I think they have some good points---points they also raised
against an earlier proposed temporal-SQL standard circa 1998. By the
way here are some thoughts Jeff shared with me about that book, which
he says inspired range types:

Thanks, I will read this, too.

I understand that your patch is just to allow defining periods, but I
thought I'd share my own hopes earlier rather than later, in case you
are doing more work on this.

Yes, I plan on doing much more work on this.  My goal is to implement (by myself or with help from other) the entire SQL:2016 spec on periods and system versioned tables.  This current patch is just infrastructure.

Also, it might be nice if Postgres let
you also define periods from a single range column, so that people who
want to use intervals can still stick closer to the standard---I
dunno, just an idea.

That's a nice idea, but I'm not sure how I'd fit it into the pg_period catalog which expects two columns.

Also, this may not be very helpful, but I started an extension to
support temporal foreign keys here:
It uses intervals, not periods, but maybe you can steal some ideas.
:-) I have a half-finished branch porting it from plpgsql to C, so
that I could give them more catalog integration, and also I have hopes
of defining temporal primary keys, although that isn't implemented
yet. Anyway, I mention it because you said, "Application periods can
be used in PRIMARY/UNIQUE keys, foreign keys," but feel free to ignore
it. :-)

While I'm waiting for comments on how best to do inheritance and other aspects of my patch, I'm working on getting PRIMARY/UNIQUE keys with periods.  That's far from finished though as it is touching parts of the code that I have never looked at before.

In general, I would love Postgres to have some lower-level primitives
like range types and the Dingös operators, and then build the
SQL:2011 support on top of those. I'm happy to contribute work to help
make that happen, although I'd probably need to work with someone with
more Postgres hacking experience to get it done.

Any help you can give me (or that I could give you) is greatly appreciated.  I'm hoping we can get *something* in v12 with periods.

Re: Periods

От
Alvaro Herrera
Дата:
Hello Vik,

On 2018-Jun-05, Vik Fearing wrote:

> > I understand that your patch is just to allow defining periods, but I
> > thought I'd share my own hopes earlier rather than later, in case you
> > are doing more work on this.
> 
> Yes, I plan on doing much more work on this.  My goal is to implement
> (by myself or with help from other) the entire SQL:2016 spec on
> periods and system versioned tables.  This current patch is just
> infrastructure.

Have you had the chance to work on this?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Periods

От
Vik Fearing
Дата:



On Thursday, July 4, 2019 8:04 PM, Alvaro Herrera alvherre@2ndquadrant.com wrote:

Hello Vik,
On 2018-Jun-05, Vik Fearing wrote:
I understand that your patch is just to allow defining periods, but I
thought I'd share my own hopes earlier rather than later, in case you
are doing more work on this.
Yes, I plan on doing much more work on this. My goal is to implement
(by myself or with help from other) the entire SQL:2016 spec on
periods and system versioned tables. This current patch is just
infrastructure.
Have you had the chance to work on this?

Hi Alvaro,

I've been working on this as an extension instead.  It allows me to do some stuff in plpgsql which is much easier for me.


I would love to have all this in core (especially since MariaDB 10.4 just became the first open source database to get all of this functionality), but something is better than nothing.
--
Vik Fearing

Re: Periods

От
Alvaro Herrera
Дата:
Hello Vik, Paul,

On 2019-Jul-04, Vik Fearing wrote:

> I've been working on this as an extension instead.  It allows me to do some stuff in plpgsql which is much easier for
me.
> 
> https://github.com/xocolatl/periods/

Hmm, okay.

> I would love to have all this in core (especially since MariaDB 10.4
> just became the first open source database to get all of this
> functionality), but something is better than nothing.

Agreed on all accounts.

I think that the functionality in your patch is already integrated in
Paul's patch for temporal PK/FK elsewhere ... is that correct, or is
this patch orthogonal to that work?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Periods

От
Paul A Jungwirth
Дата:
On Thu, Jul 4, 2019 at 11:44 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> I think that the functionality in your patch is already integrated in
> Paul's patch for temporal PK/FK elsewhere ... is that correct, or is
> this patch orthogonal to that work?

Hi Vik, I wasn't aware that you had moved your work over to an
extension. It looks like you've made a lot of progress! I'd be eager
to work with you on getting this into core.

Alvaro: it's a mix of orthogonal and not orthogonal. :-) My work lets
you use range types directly in SQL:2011 constructs, whereas Vik's
work lets you use SQL:2011 PERIODs. I would like to support *both*, so
I've been intending to add Vik's original patch letting you define
PERIODs to my own work.

There is no conflict in supporting both ranges and PERIODs because you
can't name a PERIOD the same as an existing column.

Behind the scenes the easiest way to implement PERIODs is with range
types, so there would be very little duplication to permit both.

PERIODs suffer from being outside relational theory as a
quasi-attribute. You especially see this in composing queries & result
sets. For example with ranges you could say this:

WITH x AS (
  SELECT * FROM y
  FOR PORTION OF some_range FROM t1 TO t2
)
SELECT * FROM x
FOR PORTION OF some_range FROM t3 TO t4
;

But with PERIODs you can't because a PERIOD is not included in `SELECT
*`. Also it's unclear how "attached" it is to a table definition. Can
you say `SELECT *, some_period FROM x`? If so can you then use `FOR
PORTION OF` on that result set? Can you construct an on-the-fly PERIOD
expression? Can you pass a period to a function as a parameter? Can a
function return a period? Can you ORDER BY a period? GROUP BY one? Can
you cast to/from a period? Can you ask a period for its high/low
values? Do we treat a PERIOD as a whole new datatype? Can you define a
real column of type PERIOD? I haven't found text from the standard
that answers most of these questions. (The standard does say you can
construct a `PERIOD(t1, t2)` expression but apparently only inside a
"period predicate".)

Also you can't define PERIODs on types other than
date/timestamp/timestamptz, unlike ranges.

Also PERIODs require a sentinel value to mean "unbounded" (e.g.
31-JAN-9999) whereas ranges let you express that with a NULL.
(Postgres does have Infinity and -Infinity for timestamp types but
I've noticed that client programming languages can't always express
ranges with those values.)

Personally I intend to use ranges any time I build temporal tables,
but supporting PERIODs might have value for people more interested in
database portability or someone migrating from elsewhere to Postgres.

I had some conversations at PGCon that I felt validated the
permit-PERIODS-or-ranges approach, so I'm about ready to expand my
patch to handle PERIODs too. For that I would love to draw on Vik's
work so far. I think his original patch against core is more likely to
be helpful than the extension, but I'll certainly consult both, and
Vik if you have any advice let me know! :-)

A big difference between a temporal extension vs temporal features in
core is implementing DML. An extension pretty much requires you to use
INSTEAD OF triggers. Also Vik's README points out that implementing
temporal DELETE is hard that way. In core I believe you'd do temporal
DML in the executor node. (That's my working theory anyway; I'm still
new to that part of the code.)

The first thing on my TODO list is to write a blog post comparing how
other RDMBSes handle PERIODs and other temporal features. Besides the
questions above, how does a trigger work on a table? For example when
you DELETE in the middle of a range/period, and it becomes an INSERT
plus an UPDATE, I *believe* you still fire the DELETE trigger. And you
need to set the NEW/OLD tuples appropriately. You *don't* fire any
INSERT & UPDATE triggers. The standard isn't super explicit but that's
my take on it, and I want to write down what other vendors are doing.

Yours,
Paul