Re: Writeable CTEs and side effects

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: Writeable CTEs and side effects
Дата
Msg-id 200910092032.n99KW9a19255@momjian.us
обсуждение исходный текст
Ответ на Writeable CTEs and side effects  (Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>)
Список pgsql-hackers
Added to TODO:
Allow INSERT/UPDATE/DELETE ... RETURNING in common table expressions
*  http://archives.postgresql.org/pgsql-hackers/2009-10/msg00472.php

---------------------------------------------------------------------------

Marko Tiikkaja wrote:
> I've made progress in implementing writeable CTEs (repo at
> git://git.postgresql.org/git/writeable_cte.git , branch actually_write)
> and I've hit a few corner-cases which have lead me to think that we
> should be handling DML inside CTEs a bit differently.  Before I go on
> implementing this, I'd like to hear your input.
> 
> 1)    WITH t AS
>     (UPDATE foo SET bar = bar+1 RETURNING *)
>     SELECT * FROM t LIMIT 1;
> 
> What's problematic here is that only 1 row is read from the CTE, meaning
> also that only one row is updated which, at least how I see it, is not
> what we want.  The CTE should only store one row and return that after
> it has completely processed the UPDATE statement.
> 
> 2)    WITH t1 AS
>     (UPDATE foo SET bar=bar+1 RETURNING *),
>     t2 AS
>     (UPDATE foo SET bar=bar+1 RETURNING *)
>     SELECT * FROM t1
>     UNION ALL
>     SELECT * FROM t2;
> 
> This is probably not the most common scenario, but is still very
> surprising if you for some reason happen to hit it.  Both of the updates
> actually have the same transaction ID and command ID, so the rows are
> first updated by t1, but when t2 is processed, it looks at the rows and
> thinks that it already updated them.
> 
> 3)    WITH t1 AS
>     (UPDATE foo SET bar=bar+1 RETURNING *),
>     t2 AS
>     (UPDATE baz SET bat=bat+1 RETURNING *)
>     VALUES (true);
> 
> This isn't probably the most common situation either, but I think it's
> worth looking at; the user wants to update two different tables, but
> ignore the RETURNING data completely.  On IRC, this has been requested
> multiple times.  Even if we wouldn't agree that this feature is useful,
> it pretty much follows the semantics of example #1.
> 
> 
> Trying to tackle all of these at once, I've come up with this kind of
> execution strategy:
> 
> Before starting the execution of the main plan tree, for every CTE which
> is a DML query, do the following:
> 
>    1) Get a new CID
>    2a) If there are no references to the CTE (example #3), run the DML
> query to the end but ignore the results of the RETURNING query,
>    or
>    2b) If there are references, run the DML query to the end but store
> either as many as rows as you need to to answer the outer query (example
> #1) or if we can't determine the number of rows we need (most cases,
> example #2) run the query and store all of its results.
> 
> Then, if required, get a new CID for the main execution tree and execute
> it using the data we now have inside the CTEs.  This way we can avoid
> storing useless rows in memory without unexpected behaviour and caveats.
> 
> 
> Regards,
> Marko Tiikkaja
> 
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: Concurrency testing
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: Triggers on columns