Обсуждение: Partitioning on the date part of a timestamp & PK issues

Поиск
Список
Период
Сортировка

Partitioning on the date part of a timestamp & PK issues

От
Wells Oliver
Дата:
I have a table like so:

CREATE TABLE t (
    guid uuid not null,
    seq smallint not null,
    tid smallint not null,
    ts timestamp without time zone not null,
    x real,
    y real,
    z real,
    primary key (guid, tid, seq)
) partition by range (extract(date from ts));

Which results in the error:

ERROR:  unsupported PRIMARY KEY constraint with partition key definition
DETAIL:  PRIMARY KEY constraints cannot be used when partition keys include expressions.

Is there a suitable way to get around this? I can make an additional date col in the table to use for the partition range, but given the amount of data we will have, I am trying to be cognizant of storage concerns.

The PK is the PK, so I'm not sure what options I have there.

Thanks.

--

Re: Partitioning on the date part of a timestamp & PK issues

От
Holger Jakobs
Дата:


Am 15. Juli 2021 22:10:01 MESZ schrieb Wells Oliver <wells.oliver@gmail.com>:
I have a table like so:

CREATE TABLE t (
    guid uuid not null,
    seq smallint not null,
    tid smallint not null,
    ts timestamp without time zone not null,
    x real,
    y real,
    z real,
    primary key (guid, tid, seq)
) partition by range (extract(date from ts));

Which results in the error:

ERROR:  unsupported PRIMARY KEY constraint with partition key definition
DETAIL:  PRIMARY KEY constraints cannot be used when partition keys include expressions.

Is there a suitable way to get around this? I can make an additional date col in the table to use for the partition range, but given the amount of data we will have, I am trying to be cognizant of storage concerns.

The PK is the PK, so I'm not sure what options I have there.

Thanks.

--

--
Maybe you could sister separate date and time and use date as partition key. The timestamp column could be generated in a view as datecol + timecol.

If new data is entered, it could be separated into the two columns via trigger.

Haven't tried it, bit seems feasible.

Regards,
Holger

--
Holger Jakobs, Bergisch Gladbach
+49 178 9759012
- sent from mobile, therefore short -

Re: Partitioning on the date part of a timestamp & PK issues

От
"David G. Johnston"
Дата:
On Thu, Jul 15, 2021 at 1:10 PM Wells Oliver <wells.oliver@gmail.com> wrote:

    primary key (guid, tid, seq)
) partition by range (extract(date from ts));

The PK is the PK, so I'm not sure what options I have there.

You are indirectly hitting the documented limitation that:

"Unique constraints (and hence primary keys) on partitioned tables must include all the partition key columns. This limitation exists because the individual indexes making up the constraint can only directly enforce uniqueness within their own partitions; therefore, the partition structure itself must guarantee that there are not duplicates in different partitions."


So if you want to enforce PKs here you need to both add the physical column to the table AND add the same to your primary key.

David J.

Re: Partitioning on the date part of a timestamp & PK issues

От
Ron
Дата:
On 7/15/21 3:10 PM, Wells Oliver wrote:
> I have a table like so:
>
> CREATE TABLE t (
>     guid uuid not null,
>     seq smallint not null,
>     tid smallint not null,
>     ts timestamp without time zone not null,
>     x real,
>     y real,
>     z real,
>     primary key (guid, tid, seq)
> ) partition by range (extract(date from ts));
>
> Which results in the error:
>
> ERROR:  unsupported PRIMARY KEY constraint with partition key definition
> DETAIL:  PRIMARY KEY constraints cannot be used when partition keys 
> include expressions.
>
> Is there a suitable way to get around this? I can make an additional date 
> col in the table to use for the partition range, but given the amount of 
> data we will have, I am trying to be cognizant of storage concerns.
>
> The PK is the PK, so I'm not sure what options I have there.

Why do you feel the need to "extract(date from ts)" instead of just 
partitioning by "ts"?

-- 
Angular momentum makes the world go 'round.



Re: Partitioning on the date part of a timestamp & PK issues

От
Wells Oliver
Дата:
Correct me if I'm wrong, but wouldn't partitioning by just the ts column create a ton of partitions per the granular nature of timestamps? Or do I wildly misunderstand this? Another Very Large Table we have partitioned by a date field cleanly creates monthly partitions.

On Thu, Jul 15, 2021 at 3:06 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 7/15/21 3:10 PM, Wells Oliver wrote:
> I have a table like so:
>
> CREATE TABLE t (
>     guid uuid not null,
>     seq smallint not null,
>     tid smallint not null,
>     ts timestamp without time zone not null,
>     x real,
>     y real,
>     z real,
>     primary key (guid, tid, seq)
> ) partition by range (extract(date from ts));
>
> Which results in the error:
>
> ERROR:  unsupported PRIMARY KEY constraint with partition key definition
> DETAIL:  PRIMARY KEY constraints cannot be used when partition keys
> include expressions.
>
> Is there a suitable way to get around this? I can make an additional date
> col in the table to use for the partition range, but given the amount of
> data we will have, I am trying to be cognizant of storage concerns.
>
> The PK is the PK, so I'm not sure what options I have there.

Why do you feel the need to "extract(date from ts)" instead of just
partitioning by "ts"?

--
Angular momentum makes the world go 'round.




--

Re: Partitioning on the date part of a timestamp & PK issues

От
Alvaro Herrera
Дата:
On 2021-Jul-15, Wells Oliver wrote:

> Correct me if I'm wrong, but wouldn't partitioning by just the ts column
> create a ton of partitions per the granular nature of timestamps? Or do I
> wildly misunderstand this? Another Very Large Table we have partitioned by
> a date field cleanly creates monthly partitions.

You decide the boundaries values of each partition.  Dates are granular
to days, yet you have created partitions cleanly on month boundaries.
Same with timestamps.  Nobody is forcing you to create one partition per
microsecond.

Anyway, I don't think what you want is possible.  Your partition needs
to be formed by columns that are in the primary key, and your primary
key columns are already defined.  You can partition by guid, or
partition by tid, or partition by seq, or any combination thereof.  You
cannot partition by the date column.  The error about expressions being
unusable is just a forerunner.

-- 
Álvaro Herrera           39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/



Re: Partitioning on the date part of a timestamp & PK issues

От
Ron
Дата:
Use RANGE partitioning.  https://www.postgresql.org/docs/12/ddl-partitioning.html

CREATE TABLE t (
    guid uuid not null,
    seq smallint not null,
    tid smallint not null,
    ts timestamp without time zone not null,
    x real,
    y real,
    z real,
    primary key (guid, tid, seq, ts)
)   partition by range (ts);

CREATE TABLE t_y2021m07 PARTITION OF measurement
    FOR VALUES FROM ('2021-07-01') TO ('2021-08-01');

CREATE TABLE measurement_y2021m08 PARTITION OF measurement
    FOR VALUES FROM ('2021-08-01') TO ('2021-09-01');

On 7/15/21 5:09 PM, Wells Oliver wrote:
Correct me if I'm wrong, but wouldn't partitioning by just the ts column create a ton of partitions per the granular nature of timestamps? Or do I wildly misunderstand this? Another Very Large Table we have partitioned by a date field cleanly creates monthly partitions.

On Thu, Jul 15, 2021 at 3:06 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 7/15/21 3:10 PM, Wells Oliver wrote:
> I have a table like so:
>
> CREATE TABLE t (
>     guid uuid not null,
>     seq smallint not null,
>     tid smallint not null,
>     ts timestamp without time zone not null,
>     x real,
>     y real,
>     z real,
>     primary key (guid, tid, seq)
> ) partition by range (extract(date from ts));
>
> Which results in the error:
>
> ERROR:  unsupported PRIMARY KEY constraint with partition key definition
> DETAIL:  PRIMARY KEY constraints cannot be used when partition keys
> include expressions.
>
> Is there a suitable way to get around this? I can make an additional date
> col in the table to use for the partition range, but given the amount of
> data we will have, I am trying to be cognizant of storage concerns.
>
> The PK is the PK, so I'm not sure what options I have there.

Why do you feel the need to "extract(date from ts)" instead of just
partitioning by "ts"?

--
Angular momentum makes the world go 'round.




--

--
Angular momentum makes the world go 'round.

Re: Partitioning on the date part of a timestamp & PK issues

От
Wells Oliver
Дата:
Thanks all, much clearer now.

On Thu, Jul 15, 2021 at 3:52 PM Ron <ronljohnsonjr@gmail.com> wrote:
Use RANGE partitioning.  https://www.postgresql.org/docs/12/ddl-partitioning.html

CREATE TABLE t (
    guid uuid not null,
    seq smallint not null,
    tid smallint not null,
    ts timestamp without time zone not null,
    x real,
    y real,
    z real,
    primary key (guid, tid, seq, ts)
)   partition by range (ts);

CREATE TABLE t_y2021m07 PARTITION OF measurement
    FOR VALUES FROM ('2021-07-01') TO ('2021-08-01');

CREATE TABLE measurement_y2021m08 PARTITION OF measurement
    FOR VALUES FROM ('2021-08-01') TO ('2021-09-01');

On 7/15/21 5:09 PM, Wells Oliver wrote:
Correct me if I'm wrong, but wouldn't partitioning by just the ts column create a ton of partitions per the granular nature of timestamps? Or do I wildly misunderstand this? Another Very Large Table we have partitioned by a date field cleanly creates monthly partitions.

On Thu, Jul 15, 2021 at 3:06 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 7/15/21 3:10 PM, Wells Oliver wrote:
> I have a table like so:
>
> CREATE TABLE t (
>     guid uuid not null,
>     seq smallint not null,
>     tid smallint not null,
>     ts timestamp without time zone not null,
>     x real,
>     y real,
>     z real,
>     primary key (guid, tid, seq)
> ) partition by range (extract(date from ts));
>
> Which results in the error:
>
> ERROR:  unsupported PRIMARY KEY constraint with partition key definition
> DETAIL:  PRIMARY KEY constraints cannot be used when partition keys
> include expressions.
>
> Is there a suitable way to get around this? I can make an additional date
> col in the table to use for the partition range, but given the amount of
> data we will have, I am trying to be cognizant of storage concerns.
>
> The PK is the PK, so I'm not sure what options I have there.

Why do you feel the need to "extract(date from ts)" instead of just
partitioning by "ts"?

--
Angular momentum makes the world go 'round.




--

--
Angular momentum makes the world go 'round.


--

Re: Partitioning on the date part of a timestamp & PK issues

От
Alvaro Herrera
Дата:
On 2021-Jul-15, Ron wrote:

> CREATE TABLE t (
>     guid uuid not null,
>     seq smallint not null,
>     tid smallint not null,
>     ts timestamp without time zone not null,
>     x real,
>     y real,
>     z real,
>     primary key (guid, tid, seq, ts)
> )   partition by range (ts);

This definition will let you have multiple rows for the same
(guid,tid,seq) combination, as long as they have different ts values.
That is at odds with the original formulation, in which there could be a
single row with any given combination of those three columns.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Having your biases confirmed independently is how scientific progress is
made, and hence made our great society what it is today" (Mary Gardiner)



Re: Partitioning on the date part of a timestamp & PK issues

От
Ron
Дата:
On 7/15/21 9:53 PM, Alvaro Herrera wrote:
> On 2021-Jul-15, Ron wrote:
>
>> CREATE TABLE t (
>>      guid uuid not null,
>>      seq smallint not null,
>>      tid smallint not null,
>>      ts timestamp without time zone not null,
>>      x real,
>>      y real,
>>      z real,
>>      primary key (guid, tid, seq, ts)
>> )   partition by range (ts);
> This definition will let you have multiple rows for the same
> (guid,tid,seq) combination, as long as they have different ts values.
> That is at odds with the original formulation, in which there could be a
> single row with any given combination of those three columns.

That's the cost of partitioning by something other than uid, tid or seq.

-- 
Angular momentum makes the world go 'round.