Обсуждение: Proposal for UPDATE: do not insert new tuple on heap if update does not change data

Поиск
Список
Период
Сортировка

Proposal for UPDATE: do not insert new tuple on heap if update does not change data

От
Gasper Zejn
Дата:
Hi,

I was wondering if PostgreSQL adds new tuple if data is not changed
when using UPDATE. It turns out it does add them and I think it might
be beneficial not to add a new tuple in this case, since it causes a
great deal of maintenance: updating indexes, vacuuming table and
index, also heap fragmentation.

How to check:

CREATE TABLE foo (pk serial primary key, val text);
-- Starting point: two rows.
INSERT INTO foo VALUES (1, 'first');
INSERT INTO foo VALUES (2, 'second');
CHECKPOINT;

-- Updating row with same value.
UPDATE foo SET val = 'second' where pk = 2;
CHECKPOINT;

-- "Upsert" is the same.
INSERT INTO foo VALUES (2, 'second') ON CONFLICT (pk) DO UPDATE SET
val = 'second';
CHECKPOINT;

If after any checkpoint you look at page data, you can see multiple
versions of same row with "second".

Unfortunately, I don't believe I can come up with a patch on my own,
but will happily offer any further help with testing and ideas.


Attached is a script with minimal test case.

Kind regards,
Gasper Zejn

Вложения

Re: Proposal for UPDATE: do not insert new tuple on heap if update does not change data

От
Konstantin Knizhnik
Дата:
Hi,<br /><br /> To eliminate creation of new tuple version in this case it is necessary to check that update actually
doesn'tchange the record. <br /> It is not a cheapest test and it seems to be not so good idea to perform it always.<br
/>But if you fill that in your case there are many "identical" updates, you can always explicitly rewrite query by
addingextra check:<br /><pre wrap="">UPDATE foo SET val = 'second' where pk = 2 and val <> 'second';</pre><br
/><br/><br /><div class="moz-cite-prefix">On 20.01.2016 12:55, Gasper Zejn wrote:<br /></div><blockquote
cite="mid:CAMxXOOEZ4p0J6nke76XSjs1RfPR0mcb0Ddh-vbmAknjSPq-LTw@mail.gmail.com"type="cite"><pre wrap="">Hi,
 

I was wondering if PostgreSQL adds new tuple if data is not changed
when using UPDATE. It turns out it does add them and I think it might
be beneficial not to add a new tuple in this case, since it causes a
great deal of maintenance: updating indexes, vacuuming table and
index, also heap fragmentation.

How to check:

CREATE TABLE foo (pk serial primary key, val text);
-- Starting point: two rows.
INSERT INTO foo VALUES (1, 'first');
INSERT INTO foo VALUES (2, 'second');
CHECKPOINT;

-- Updating row with same value.
UPDATE foo SET val = 'second' where pk = 2;
CHECKPOINT;

-- "Upsert" is the same.
INSERT INTO foo VALUES (2, 'second') ON CONFLICT (pk) DO UPDATE SET
val = 'second';
CHECKPOINT;

If after any checkpoint you look at page data, you can see multiple
versions of same row with "second".

Unfortunately, I don't believe I can come up with a patch on my own,
but will happily offer any further help with testing and ideas.


Attached is a script with minimal test case.

Kind regards,
Gasper Zejn
</pre><br /><fieldset class="mimeAttachmentHeader"></fieldset><br /><pre wrap="">
</pre></blockquote><br /><pre class="moz-signature" cols="72">-- 
Konstantin Knizhnik
Postgres Professional: <a class="moz-txt-link-freetext"
href="http://www.postgrespro.com">http://www.postgrespro.com</a>
The Russian Postgres Company </pre>

Re: Proposal for UPDATE: do not insert new tuple on heap if update does not change data

От
Tom Lane
Дата:
Gasper Zejn <zelo.zejn@gmail.com> writes:
> I was wondering if PostgreSQL adds new tuple if data is not changed
> when using UPDATE. It turns out it does add them and I think it might
> be beneficial not to add a new tuple in this case, since it causes a
> great deal of maintenance: updating indexes, vacuuming table and
> index, also heap fragmentation.

This has been discussed in the past, and the conclusion was that expending
cycles on every UPDATE to check for this case would be a net loss.  How
many real applications do no-op updates often enough that it's worth
optimizing for?
        regards, tom lane



Re: Proposal for UPDATE: do not insert new tuple on heap if update does not change data

От
Kevin Grittner
Дата:
On Wed, Jan 20, 2016 at 3:55 AM, Gasper Zejn <zelo.zejn@gmail.com> wrote:

> I was wondering if PostgreSQL adds new tuple if data is not changed
> when using UPDATE. It turns out it does add them and I think it might
> be beneficial not to add a new tuple in this case, since it causes a
> great deal of maintenance: updating indexes, vacuuming table and
> index, also heap fragmentation.

If you have one or more tables on which you routinely updated rows
to the values they already have, you might want to attach an update
trigger using the suppress_redundant_updates_trigger() function.

http://www.postgresql.org/docs/current/interactive/functions-trigger.html

A better solution, where possible, is to use the WHERE clause to
avoid the update attempt where the new values are not distinct from
the old ones.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company