Обсуждение: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns
Is it an expected behavior? postgres=# CREATE TABLE t1 (a int, b int); CREATE TABLE postgres=# CREATE TABLE t2 (b int, c int); CREATE TABLE postgres=#CREATE TABLE t3 (d int) inherits (t1, t2); NOTICE: merging multiple inherited definitions of column "b" CREATETABLE The t3.d is inherited from t1 and t2. Its attinhcount is 2. postgres=# ALTER TABLE t1 RENAME b TO x; ALTER TABLE It alters name of the column 'b' in the t1 and its child tables ('t3'). postgres=# SELECT * FROM t1; a | x ---+--- (0 rows) postgres=# SELECT * FROM t2; ERROR: could not find inherited attribute "b" of relation "t3" Because t3.b is also inherited from the t2, but ALTER TABLE does not care about multiple inherited columns well. I think we should not allow to rename a column with attinhcount > 1. Any comments? Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
Re: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns
От
Alvaro Herrera
Дата:
KaiGai Kohei wrote: > postgres=# SELECT * FROM t2; > ERROR: could not find inherited attribute "b" of relation "t3" > > Because t3.b is also inherited from the t2, but ALTER TABLE does not > care about multiple inherited columns well. > > I think we should not allow to rename a column with attinhcount > 1. I think we should fix ALTER TABLE to cope with multiple inheritance. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
2009/11/4 Alvaro Herrera <alvherre@commandprompt.com>: > KaiGai Kohei wrote: > >> postgres=# SELECT * FROM t2; >> ERROR: could not find inherited attribute "b" of relation "t3" >> >> Because t3.b is also inherited from the t2, but ALTER TABLE does not >> care about multiple inherited columns well. >> >> I think we should not allow to rename a column with attinhcount > 1. > > I think we should fix ALTER TABLE to cope with multiple inheritance. > I'd be interested to see how this should work. Given KaiGai's example, how would that be resolved? That column would already be merged for the inheriting table, so would renaming it somehow unmerge it? Given an insertion into t3 would propagate to both column b's in table t1 and t2, and then renaming b in t1 wouldn't make sense. Or would renaming it be prevented due to dependants? Or should t3's inheritance of t1 and t2 implicitly bind t1's and t2's column b to one another so that one affects the other? (i.e. renaming column b in t1 would also rename column b in t2, or both would require renaming during the same transaction.) ...or something less confusing. :) Thom
Thom Brown <thombrown@gmail.com> writes: > 2009/11/4 Alvaro Herrera <alvherre@commandprompt.com>: >> KaiGai Kohei wrote: >>> I think we should not allow to rename a column with attinhcount > 1. >> I think we should fix ALTER TABLE to cope with multiple inheritance. > I'd be interested to see how this should work. Yeah. I don't think a "fix" is possible, because there is no non-astonishing way for it to behave. I think KaiGai is right that forbidding the rename is the best solution. regards, tom lane
Tom Lane wrote: > Thom Brown <thombrown@gmail.com> writes: >> 2009/11/4 Alvaro Herrera <alvherre@commandprompt.com>: >>> KaiGai Kohei wrote: >>>> I think we should not allow to rename a column with attinhcount > 1. > >>> I think we should fix ALTER TABLE to cope with multiple inheritance. > >> I'd be interested to see how this should work. > > Yeah. I don't think a "fix" is possible, because there is no > non-astonishing way for it to behave. I think KaiGai is right that > forbidding the rename is the best solution. The attached patch forbids rename when the attribute is inherited from multiple parents. postgres=# CREATE TABLE t1 (a int, b int); CREATE TABLE postgres=# CREATE TABLE t2 (b int, c int); CREATE TABLE postgres=# CREATE TABLE t3 (d int) INHERITS (t1, t2); NOTICE: merging multiple inherited definitions of column "b" CREATE TABLE postgres=# SELECT * FROM t3; a | b | c | d ---+---+---+--- (0 rows) postgres=# ALTER TABLE t1 RENAME b TO x; ERROR: cannot rename multiple inherited column "b" The regression test detected a matter in the misc test. It tries to rename column "a" of "a_star" table, but it failed due to the new restriction. -- -- test the "star" operators a bit more thoroughly -- this time, -- throw in lots of NULL fields... -- -- a is the type root -- b and c inherit from a (one-level single inheritance) -- d inherits from b and c (two-level multiple inheritance) -- e inherits from c (two-level single inheritance) -- f inherits from e (three-level single inheritance) -- CREATE TABLE a_star ( class char, a int4 ); CREATE TABLE b_star ( b text ) INHERITS (a_star); CREATE TABLE c_star ( c name ) INHERITS (a_star); CREATE TABLE d_star ( d float8 ) INHERITS (b_star, c_star); At the misc test, --- 242,278 ---- ALTER TABLE c_star* RENAME COLUMN c TO cc; ALTER TABLE b_star* RENAME COLUMN b TO bb; ALTER TABLE a_star* RENAME COLUMN a TO aa; + ERROR: cannot rename multiple inherited column "a" SELECT class, aa FROM a_star* x WHERE aa ISNULL; ! ERROR: column "aa" does not exist ! LINE 1: SELECT class, aa ! It seems to me it is a case the regression test to be fixed up. (We don't have any reasonable way to know whether a certain attribute has a same source, or not.) Any comments? -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com> Index: src/test/regress/sql/inherit.sql =================================================================== *** src/test/regress/sql/inherit.sql (revision 2388) --- src/test/regress/sql/inherit.sql (working copy) *************** CREATE TABLE inh_error1 () INHERITS (t1, *** 336,338 **** --- 336,352 ---- CREATE TABLE inh_error2 (LIKE t4 INCLUDING STORAGE) INHERITS (t1); DROP TABLE t1, t2, t3, t4, t12_storage, t12_comments, t1_inh, t13_inh, t13_like, t_all; + + -- Test for renaming + CREATE TABLE t1 (a int, b int); + CREATE TABLE t2 (b int, c int); + CREATE TABLE t3 (d int) INHERITS(t1, t2); + ALTER TABLE t1 RENAME a TO x; + ALTER TABLE t1 RENAME b TO y; -- to be failed + ALTER TABLE t3 RENAME d TO z; + SELECT * FROM t3; + DROP TABLE t3; + DROP TABLE t2; + DROP TABLE t1; + + Index: src/test/regress/expected/inherit.out =================================================================== *** src/test/regress/expected/inherit.out (revision 2388) --- src/test/regress/expected/inherit.out (working copy) *************** NOTICE: merging column "a" with inherit *** 1057,1059 **** --- 1057,1076 ---- ERROR: column "a" has a storage parameter conflict DETAIL: MAIN versus EXTENDED DROP TABLE t1, t2, t3, t4, t12_storage, t12_comments, t1_inh, t13_inh, t13_like, t_all; + -- Test for renaming + CREATE TABLE t1 (a int, b int); + CREATE TABLE t2 (b int, c int); + CREATE TABLE t3 (d int) INHERITS(t1, t2); + NOTICE: merging multiple inherited definitions of column "b" + ALTER TABLE t1 RENAME a TO x; + ALTER TABLE t1 RENAME b TO y; -- to be failed + ERROR: cannot rename multiple inherited column "b" + ALTER TABLE t3 RENAME d TO z; + SELECT * FROM t3; + x | b | c | z + ---+---+---+--- + (0 rows) + + DROP TABLE t3; + DROP TABLE t2; + DROP TABLE t1; Index: src/backend/commands/tablecmds.c =================================================================== *** src/backend/commands/tablecmds.c (revision 2388) --- src/backend/commands/tablecmds.c (working copy) *************** renameatt(Oid myrelid, *** 2024,2029 **** --- 2024,2040 ---- errmsg("cannot rename inherited column \"%s\"", oldattname))); + /* + * If the attribute is inherited from multiple parents, forbid + * the renaming, because we don't have any reasonable way to keep + * integrity in whole of the inheritance relationship. + */ + if (attform->attinhcount > 1) + ereport(ERROR, + (errcode(ERRCODE_INVALID_TABLE_DEFINITION), + (errmsg("cannot rename multiple inherited column \"%s\"", + oldattname)))); + /* should not already exist */ /* this test is deliberately not attisdropped-aware */ if (SearchSysCacheExists(ATTNAME,
It is a patch for the matter which I reported before. When a column is inherited from multiple relations, ALTER TABLE with RENAME TO option is problematic. This patch fixes the matter. In correctly, it prevent to rename columns inherited from multiple relations and merged. Also see the past discussion: http://archives.postgresql.org/pgsql-hackers/2009-11/msg00138.php postgres=# CREATE TABLE t1 (a int, b int); CREATE TABLE postgres=# CREATE TABLE t2 (b int, c int); CREATE TABLE postgres=# CREATE TABLE t3 (x text) inherits (t1, t2); NOTICE: merging multiple inherited definitions of column "b" CREATE TABLE postgres=# SELECT * FROM t3; a | b | c | x ---+---+---+--- (0 rows) It looks to me fine. postgres=# ALTER TABLE t1 RENAME b TO y; ALTER TABLE postgres=# SELECT * FROM t3; a | y | c | x ---+---+---+--- (0 rows) postgres=# SELECT * FROM t1; a | y ---+--- (0 rows) It looks to me fine. postgres=# SELECT * FROM t2; ERROR: could not find inherited attribute "b" of relation "t3" Oops, when we refer the t3 via t2, it expects the inherited relation also has the column "b", but it was already renamed. One trouble is regression test. The misc_create test create a_star table, then it is inherited by b_star and c_star, then these are inherited to d_star table. Then misc test rename the a_star.a, but this patch prevent it. It looks like works well, but it is a corner case, because d_star.a is eventually inherited from a_star via b_star and c_star, and these are all the inherited relations. In generally, we don't have reasonable way to rename all the related columns upper and lower of the inheritance relationship. Thanks, (2009/11/05 9:57), KaiGai Kohei wrote: > Tom Lane wrote: >> Thom Brown<thombrown@gmail.com> writes: >>> 2009/11/4 Alvaro Herrera<alvherre@commandprompt.com>: >>>> KaiGai Kohei wrote: >>>>> I think we should not allow to rename a column with attinhcount> 1. >> >>>> I think we should fix ALTER TABLE to cope with multiple inheritance. >> >>> I'd be interested to see how this should work. >> >> Yeah. I don't think a "fix" is possible, because there is no >> non-astonishing way for it to behave. I think KaiGai is right that >> forbidding the rename is the best solution. > > The attached patch forbids rename when the attribute is inherited > from multiple parents. > > postgres=# CREATE TABLE t1 (a int, b int); > CREATE TABLE > postgres=# CREATE TABLE t2 (b int, c int); > CREATE TABLE > postgres=# CREATE TABLE t3 (d int) INHERITS (t1, t2); > NOTICE: merging multiple inherited definitions of column "b" > CREATE TABLE > postgres=# SELECT * FROM t3; > a | b | c | d > ---+---+---+--- > (0 rows) > > postgres=# ALTER TABLE t1 RENAME b TO x; > ERROR: cannot rename multiple inherited column "b" > > > The regression test detected a matter in the misc test. > > It tries to rename column "a" of "a_star" table, but it failed due to > the new restriction. > > -- > -- test the "star" operators a bit more thoroughly -- this time, > -- throw in lots of NULL fields... > -- > -- a is the type root > -- b and c inherit from a (one-level single inheritance) > -- d inherits from b and c (two-level multiple inheritance) > -- e inherits from c (two-level single inheritance) > -- f inherits from e (three-level single inheritance) > -- > CREATE TABLE a_star ( > class char, > a int4 > ); > > CREATE TABLE b_star ( > b text > ) INHERITS (a_star); > > CREATE TABLE c_star ( > c name > ) INHERITS (a_star); > > CREATE TABLE d_star ( > d float8 > ) INHERITS (b_star, c_star); > > At the misc test, > > --- 242,278 ---- > ALTER TABLE c_star* RENAME COLUMN c TO cc; > ALTER TABLE b_star* RENAME COLUMN b TO bb; > ALTER TABLE a_star* RENAME COLUMN a TO aa; > + ERROR: cannot rename multiple inherited column "a" > SELECT class, aa > FROM a_star* x > WHERE aa ISNULL; > ! ERROR: column "aa" does not exist > ! LINE 1: SELECT class, aa > ! > > It seems to me it is a case the regression test to be fixed up. > (We don't have any reasonable way to know whether a certain attribute > has a same source, or not.) > > Any comments? > > > > -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
Вложения
2009/12/16 KaiGai Kohei <kaigai@ak.jp.nec.com>: > It is a patch for the matter which I reported before. > > When a column is inherited from multiple relations, ALTER TABLE with > RENAME TO option is problematic. > This patch fixes the matter. In correctly, it prevent to rename columns > inherited from multiple relations and merged. No longer applies. Can you rebase? Thanks, ...Robert
(2009/12/30 10:38), Robert Haas wrote: > 2009/12/16 KaiGai Kohei<kaigai@ak.jp.nec.com>: >> It is a patch for the matter which I reported before. >> >> When a column is inherited from multiple relations, ALTER TABLE with >> RENAME TO option is problematic. >> This patch fixes the matter. In correctly, it prevent to rename columns >> inherited from multiple relations and merged. > > No longer applies. Can you rebase? The attached patch is the rebased revision. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
Вложения
KaiGai Kohei <kaigai@kaigai.gr.jp> writes: > (2009/12/30 10:38), Robert Haas wrote: >> No longer applies. Can you rebase? > The attached patch is the rebased revision. I'm not really impressed with this patch, because it will reject perfectly legitimate multiple-inheritance cases (ie, cases where there's more than one inheritance path from the same parent). This works fine at the moment: regression=# create table p1(f1 int, f2 int); CREATE TABLE regression=# create table c1(f3 int) inherits (p1); CREATE TABLE regression=# create table c2(f4 int) inherits (p1); CREATE TABLE regression=# create table cc(f5 int) inherits (c1,c2); NOTICE: merging multiple inherited definitions of column "f1" NOTICE: merging multiple inherited definitions of column "f2" CREATE TABLE regression=# \d cc Table "public.cc"Column | Type | Modifiers --------+---------+-----------f1 | integer | f2 | integer | f3 | integer | f4 | integer | f5 | integer| Inherits: c1, c2 regression=# alter table p1 rename f2 to ff2; ALTER TABLE regression=# \d cc Table "public.cc"Column | Type | Modifiers --------+---------+-----------f1 | integer | ff2 | integer | f3 | integer | f4 | integer | f5 | integer| Inherits: c1, c2 I don't think that protecting against cases where things won't work is an adequate reason for breaking cases that do work. regards, tom lane
On Sat, Jan 2, 2010 at 2:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > KaiGai Kohei <kaigai@kaigai.gr.jp> writes: >> (2009/12/30 10:38), Robert Haas wrote: >>> No longer applies. Can you rebase? > >> The attached patch is the rebased revision. > > I'm not really impressed with this patch, because it will reject > perfectly legitimate multiple-inheritance cases (ie, cases where there's > more than one inheritance path from the same parent). This works fine > at the moment: [...] > I don't think that protecting against cases where things won't work > is an adequate reason for breaking cases that do work. Upthread you appeared to be endorsing what KaiGai has implemented here: http://archives.postgresql.org/pgsql-hackers/2009-11/msg00147.php Rereading this a few times, perhaps you meant that we should prohibit renaming an ancestor when one of its descendents has a second and distinct ancestor, but the email you actually sent reads as if you were endorsing a blanket prohibition when attinhcount > 1. Can you clarify? Thanks, ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > Upthread you appeared to be endorsing what KaiGai has implemented here: > http://archives.postgresql.org/pgsql-hackers/2009-11/msg00147.php No, I said that forbidding conflicting renames would be a good solution. I did not endorse any specific means of testing for such a conflict. The one in this patch is not good enough to avoid breaking cases that actually are perfectly OK. It would be possible to detect the problematic cases correctly by first descending the inheritance tree and counting the number of arrivals at different children, and then doing it again and complaining if attinhcount was different from the count obtained the first time. This could probably be done by modifying find_all_inheritors to count duplicate occurrences rather than just discarding them. Whether it's worth it is not clear. In practice the reasonable engineering alternatives may just be to do what KaiGai's patch does, or to do nothing. In that case I think a good argument can be made for the latter. Nobody has ever complained about this from the field AFAIR; but we might get complaints if we disable cases that used to work fine. regards, tom lane
On Jan 3, 2010, at 12:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > In practice the reasonable engineering alternatives may just be to do > what KaiGai's patch does, or to do nothing. In that case I think a > good > argument can be made for the latter. Nobody has ever complained about > this from the field AFAIR; but we might get complaints if we disable > cases that used to work fine. Maybe. The current behavior of allowing the rename but then breaking queries certainly isn't awesome. I think if someone is willing to implement a more careful check we should accept it. ...Robert
(2010/01/04 4:06), Robert Haas wrote: > On Jan 3, 2010, at 12:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> In practice the reasonable engineering alternatives may just be to do >> what KaiGai's patch does, or to do nothing. In that case I think a good >> argument can be made for the latter. Nobody has ever complained about >> this from the field AFAIR; but we might get complaints if we disable >> cases that used to work fine. > > Maybe. The current behavior of allowing the rename but then breaking > queries certainly isn't awesome. I think if someone is willing to > implement a more careful check we should accept it. The condition to prevent problematic renaming might be modified to handle diamond inheritances correctly. The current patch raises an error when pg_attribute.inhcount > 1. But, in actually, it should raise an error when the number of origins of the attribute to be renamed is larger than 1. It shall be match with the inhcount unless it does not have diamond inheritances. We can easily check the number of origins with walking on the pg_inherits catalog. So, it seems to me the condition should be rewritten like: BEFORE: if (attform->attinhcount > 1) ereport(ERROR, ...); AFTER: if (number_of_attribute_origin(myrelid, oldattname) > 1) ereport(ERROR, ...); Am I missing something? Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
2010/1/3 KaiGai Kohei <kaigai@ak.jp.nec.com>: > (2010/01/04 4:06), Robert Haas wrote: >> On Jan 3, 2010, at 12:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> In practice the reasonable engineering alternatives may just be to do >>> what KaiGai's patch does, or to do nothing. In that case I think a good >>> argument can be made for the latter. Nobody has ever complained about >>> this from the field AFAIR; but we might get complaints if we disable >>> cases that used to work fine. >> >> Maybe. The current behavior of allowing the rename but then breaking >> queries certainly isn't awesome. I think if someone is willing to >> implement a more careful check we should accept it. > > The condition to prevent problematic renaming might be modified to handle > diamond inheritances correctly. > > The current patch raises an error when pg_attribute.inhcount > 1. > But, in actually, it should raise an error when the number of origins > of the attribute to be renamed is larger than 1. > It shall be match with the inhcount unless it does not have diamond > inheritances. > > We can easily check the number of origins with walking on the pg_inherits > catalog. So, it seems to me the condition should be rewritten like: > > BEFORE: > if (attform->attinhcount > 1) > ereport(ERROR, ...); > > AFTER: > if (number_of_attribute_origin(myrelid, oldattname) > 1) > ereport(ERROR, ...); > > Am I missing something? That sounds about right to me, though that function doesn't exist yet. :-) ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > 2010/1/3 KaiGai Kohei <kaigai@ak.jp.nec.com>: >> �if (number_of_attribute_origin(myrelid, oldattname) > 1) >> � � �ereport(ERROR, ...); >> >> Am I missing something? > That sounds about right to me, It looks remarkably inefficient to me. Do you propose to search the entire database's inheritance tree to derive that number? And do it over again at each child table? The method I suggested would allow the necessary information to be extracted during the initial search for child tables, which we have to do anyway. regards, tom lane
(2010/01/04 13:18), Tom Lane wrote: > Robert Haas<robertmhaas@gmail.com> writes: >> 2010/1/3 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>> �if (number_of_attribute_origin(myrelid, oldattname)> 1) >>> � � �ereport(ERROR, ...); >>> >>> Am I missing something? > >> That sounds about right to me, > > It looks remarkably inefficient to me. Do you propose to search the > entire database's inheritance tree to derive that number? And do it > over again at each child table? Yes, > The method I suggested would allow the > necessary information to be extracted during the initial search for > child tables, which we have to do anyway. find_all_inheritors() returns a clean list which does not contain duplicated OID of the inherited relation, so it seems to me we need to change the function prototype but it affects other parts, or to add a new function which also returns number of duplications, not only OIDs. Or, we can call find_inheritance_children() in renameatt() as if find_all_inheritors() doing except for list_concat_unique_oid(). What is the most preferable? I don't have any preference in the way to fix the problem. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
>> The method I suggested would allow the >> necessary information to be extracted during the initial search for >> child tables, which we have to do anyway. > > find_all_inheritors() returns a clean list which does not contain > duplicated OID of the inherited relation, so it seems to me we need > to change the function prototype but it affects other parts, or to add > a new function which also returns number of duplications, not only OIDs. > > Or, we can call find_inheritance_children() in renameatt() as if > find_all_inheritors() doing except for list_concat_unique_oid(). The attached patch modified the condition to prevent renaming. It computes an expected inhcount for each child relations on the initial call of the renameatt() for the parent relation. The find_all_inheritors_with_inhcount() returns OID of the inherited relations and the expected inhcoundt. If a child relation has diamond inheritance tree, it has its expected inhcount larger than 1. This patch raises an error, if pg_attribute.inhcount is larger than the expected inhcount. It can be happen when the attribute to be renamed is merged from any other unrelated relations in the child relations. See the example: postgres=# CREATE TABLE t1 (a int); CREATE TABLE postgres=# CREATE TABLE t2 (b int) inherits (t1); CREATE TABLE postgres=# CREATE TABLE t3 (c int) inherits (t1); CREATE TABLE postgres=# CREATE TABLE t4 (d int) inherits (t2, t3); NOTICE: merging multiple inherited definitions of column "a" CREATE TABLE postgres=# ALTER TABLE t1 RENAME a TO x; ALTER TABLE postgres=# \d t4 Table "public.t4" Column | Type | Modifiers --------+---------+----------- x | integer | b | integer | c | integer | d | integer | Inherits: t2, t3 We can rename a of t1, t2, t3 and t4 correctly, although t4.a has inherited from multiple relations. postgres=# CREATE TABLE s1 (x int); CREATE TABLE postgres=# CREATE TABLE t5 (e int) inherits (t2, t3, s1); NOTICE: merging multiple inherited definitions of column "x" NOTICE: merging multiple inherited definitions of column "x" CREATE TABLE postgres=# ALTER TABLE t1 RENAME x TO y; ERROR: cannot rename multiple inherited column "x" But, the new t5 prevent to rename "x" to "y", because t5.x is also inherited from the s1 and merged. So, its inhcount is 3 larger than expected inhcount (=2). postgres=# SELECT attname, attinhcount FROM pg_attribute where attname='x' and attrelid='t5'::regclass; attname | attinhcount ---------+------------- x | 3 (1 row) Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
Вложения
On Sun, Jan 3, 2010 at 11:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> 2010/1/3 KaiGai Kohei <kaigai@ak.jp.nec.com>: >>> if (number_of_attribute_origin(myrelid, oldattname) > 1) >>> ereport(ERROR, ...); >>> >>> Am I missing something? > >> That sounds about right to me, > > It looks remarkably inefficient to me. Do you propose to search the > entire database's inheritance tree to derive that number? And do it > over again at each child table? The method I suggested would allow the > necessary information to be extracted during the initial search for > child tables, which we have to do anyway. I haven't read the code in enough detail to have an educated opinion about whether that would induce enough overhead to be worth worrying about it, so I will refrain from comment on this until I have done my homework. ...Robert
The similar matter can be reproduced with ALTER TABLE ... TYPE statement, not only RENAME TO option. postgres=# CREATE TABLE t1 (a int); CREATE TABLE postgres=# CREATE TABLE s1 (a int); CREATE TABLE postgres=# CREATE TABLE ts (b int) inherits (t1, s1); NOTICE: merging multiple inherited definitions of column "a" CREATETABLE postgres=# ALTER TABLE t1 ALTER a TYPE text; ALTER TABLE postgres=# insert into t1 values ('aaa'); INSERT 0 1 postgres=# insert into ts values ('bbb', 2); INSERT 0 1 postgres=#SELECT * FROM t1; a ----- aaa bbb (2 rows) postgres=# SELECT * FROM s1; ERROR: attribute "a" of relation "ts" does not match parent's type In the renameatt(), we can count an expected inhcount of the column to be renamed on find_all_inheritors() at the top-level recursion. But it does not work well for the new one, because it is handled within the common ATPrepCmd() scheme. I reconsidered that we need a function to check whether the given column is inherited from multiple root parents, or not, for each levels. Then, it can be called from both of renameatt() and ATPrepAlterColumnType(). (2010/01/04 18:55), KaiGai Kohei wrote: >>> The method I suggested would allow the >>> necessary information to be extracted during the initial search for >>> child tables, which we have to do anyway. >> >> find_all_inheritors() returns a clean list which does not contain >> duplicated OID of the inherited relation, so it seems to me we need >> to change the function prototype but it affects other parts, or to add >> a new function which also returns number of duplications, not only OIDs. >> >> Or, we can call find_inheritance_children() in renameatt() as if >> find_all_inheritors() doing except for list_concat_unique_oid(). > > The attached patch modified the condition to prevent renaming. > > It computes an expected inhcount for each child relations on the initial > call of the renameatt() for the parent relation. > The find_all_inheritors_with_inhcount() returns OID of the inherited > relations and the expected inhcoundt. If a child relation has diamond > inheritance tree, it has its expected inhcount larger than 1. > > This patch raises an error, if pg_attribute.inhcount is larger than > the expected inhcount. It can be happen when the attribute to be > renamed is merged from any other unrelated relations in the child > relations. > > See the example: > > postgres=# CREATE TABLE t1 (a int); > CREATE TABLE > postgres=# CREATE TABLE t2 (b int) inherits (t1); > CREATE TABLE > postgres=# CREATE TABLE t3 (c int) inherits (t1); > CREATE TABLE > postgres=# CREATE TABLE t4 (d int) inherits (t2, t3); > NOTICE: merging multiple inherited definitions of column "a" > CREATE TABLE > > postgres=# ALTER TABLE t1 RENAME a TO x; > ALTER TABLE > postgres=# \d t4 > Table "public.t4" > Column | Type | Modifiers > --------+---------+----------- > x | integer | > b | integer | > c | integer | > d | integer | > Inherits: t2, > t3 > > We can rename a of t1, t2, t3 and t4 correctly, although t4.a has inherited from > multiple relations. > > postgres=# CREATE TABLE s1 (x int); > CREATE TABLE > postgres=# CREATE TABLE t5 (e int) inherits (t2, t3, s1); > NOTICE: merging multiple inherited definitions of column "x" > NOTICE: merging multiple inherited definitions of column "x" > CREATE TABLE > postgres=# ALTER TABLE t1 RENAME x TO y; > ERROR: cannot rename multiple inherited column "x" > > But, the new t5 prevent to rename "x" to "y", because t5.x is also inherited from > the s1 and merged. So, its inhcount is 3 larger than expected inhcount (=2). > > postgres=# SELECT attname, attinhcount FROM pg_attribute where attname='x' and attrelid='t5'::regclass; > attname | attinhcount > ---------+------------- > x | 3 > (1 row) > > Thanks, > > > > -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
The attached patch fixes bugs when we try to rename (and change type) on a column inherited from the different origin and merged. This patch adds: List *find_column_origin(Oid relOid, const char *colName) It returns the list of relation OIDs which originally defines the given column. In most cases, it returns a list with an element. But, if the column is inherited from multiple parent relations and merged during the inheritance tree, the returned list contains multiple OIDs. In this case, we have to forbid changing type and renaming to keep correctness of the table definition. Example) postgres=# CREATE TABLE t1 (a int, b int); CREATE TABLE postgres=# CREATE TABLE s1 (b int, c int); CREATE TABLE postgres=# CREATE TABLE ts (d int) INHERITS (t1, s1); NOTICE: merging multiple inherited definitions of column "b" CREATE TABLE postgres=# ALTER TABLE t1 ALTER a TYPE text; ALTER TABLE postgres=# ALTER TABLE t1 ALTER b TYPE text; ERROR: cannot alter multiple inherited column "b" <---- (*) (*) ts.b is also inherited from s1, not only t1, so forbid changing type postgres=# ALTER TABLE s1 RENAME c TO cc; ALTER TABLE postgres=# ALTER TABLE s1 RENAME b TO bb; ERROR: cannot rename multiple inherited column "b" <---- (*) (*) ts.b is also inherited from s1, not only t1, so forbid renaming postgres=# \d+ ts Table "public.ts" Column | Type | Modifiers | Storage | Description --------+---------+-----------+----------+------------- a | text | | extended | b | integer | | plain | cc | integer | | plain | d | integer | | plain | Inherits: t1, s1 Has OIDs: no In the case when a column is inherited from multiple relations but it eventually has same origin (diamond inheritance tree case), we don't need to forbid these operations. In this case, find_column_origin() does not return duplicated OIDs, all the caller has to do is to check whether length of the returned list is larger than 1, or no. Thanks, (2010/01/14 12:43), KaiGai Kohei wrote: > The similar matter can be reproduced with ALTER TABLE ... TYPE statement, > not only RENAME TO option. > > postgres=# CREATE TABLE t1 (a int); > CREATE TABLE > postgres=# CREATE TABLE s1 (a int); > CREATE TABLE > > postgres=# CREATE TABLE ts (b int) inherits (t1, s1); > NOTICE: merging multiple inherited definitions of column "a" > CREATE TABLE > postgres=# ALTER TABLE t1 ALTER a TYPE text; > ALTER TABLE > > postgres=# insert into t1 values ('aaa'); > INSERT 0 1 > postgres=# insert into ts values ('bbb', 2); > INSERT 0 1 > postgres=# SELECT * FROM t1; > a > ----- > aaa > bbb > (2 rows) > > postgres=# SELECT * FROM s1; > ERROR: attribute "a" of relation "ts" does not match parent's type > > In the renameatt(), we can count an expected inhcount of the column to be > renamed on find_all_inheritors() at the top-level recursion. > But it does not work well for the new one, because it is handled within > the common ATPrepCmd() scheme. > > I reconsidered that we need a function to check whether the given column > is inherited from multiple root parents, or not, for each levels. > Then, it can be called from both of renameatt() and ATPrepAlterColumnType(). > > > (2010/01/04 18:55), KaiGai Kohei wrote: >>>> The method I suggested would allow the >>>> necessary information to be extracted during the initial search for >>>> child tables, which we have to do anyway. >>> >>> find_all_inheritors() returns a clean list which does not contain >>> duplicated OID of the inherited relation, so it seems to me we need >>> to change the function prototype but it affects other parts, or to add >>> a new function which also returns number of duplications, not only OIDs. >>> >>> Or, we can call find_inheritance_children() in renameatt() as if >>> find_all_inheritors() doing except for list_concat_unique_oid(). >> >> The attached patch modified the condition to prevent renaming. >> >> It computes an expected inhcount for each child relations on the initial >> call of the renameatt() for the parent relation. >> The find_all_inheritors_with_inhcount() returns OID of the inherited >> relations and the expected inhcoundt. If a child relation has diamond >> inheritance tree, it has its expected inhcount larger than 1. >> >> This patch raises an error, if pg_attribute.inhcount is larger than >> the expected inhcount. It can be happen when the attribute to be >> renamed is merged from any other unrelated relations in the child >> relations. >> >> See the example: >> >> postgres=# CREATE TABLE t1 (a int); >> CREATE TABLE >> postgres=# CREATE TABLE t2 (b int) inherits (t1); >> CREATE TABLE >> postgres=# CREATE TABLE t3 (c int) inherits (t1); >> CREATE TABLE >> postgres=# CREATE TABLE t4 (d int) inherits (t2, t3); >> NOTICE: merging multiple inherited definitions of column "a" >> CREATE TABLE >> >> postgres=# ALTER TABLE t1 RENAME a TO x; >> ALTER TABLE >> postgres=# \d t4 >> Table "public.t4" >> Column | Type | Modifiers >> --------+---------+----------- >> x | integer | >> b | integer | >> c | integer | >> d | integer | >> Inherits: t2, >> t3 >> >> We can rename a of t1, t2, t3 and t4 correctly, although t4.a has inherited from >> multiple relations. >> >> postgres=# CREATE TABLE s1 (x int); >> CREATE TABLE >> postgres=# CREATE TABLE t5 (e int) inherits (t2, t3, s1); >> NOTICE: merging multiple inherited definitions of column "x" >> NOTICE: merging multiple inherited definitions of column "x" >> CREATE TABLE >> postgres=# ALTER TABLE t1 RENAME x TO y; >> ERROR: cannot rename multiple inherited column "x" >> >> But, the new t5 prevent to rename "x" to "y", because t5.x is also inherited from >> the s1 and merged. So, its inhcount is 3 larger than expected inhcount (=2). >> >> postgres=# SELECT attname, attinhcount FROM pg_attribute where attname='x' and attrelid='t5'::regclass; >> attname | attinhcount >> ---------+------------- >> x | 3 >> (1 row) >> >> Thanks, >> >> >> >> > > -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
Вложения
--On 14. Januar 2010 16:04:17 +0900 KaiGai Kohei <kaigai@ak.jp.nec.com> wrote: > This patch adds: > > List *find_column_origin(Oid relOid, const char *colName) > > It returns the list of relation OIDs which originally defines the given > column. In most cases, it returns a list with an element. But, if the > column is inherited from multiple parent relations and merged during the > inheritance tree, the returned list contains multiple OIDs. > In this case, we have to forbid changing type and renaming to keep > correctness of the table definition. Here's a slightly edited version of this patch from reviewing, fixing the following: * Fix a compiler warning by passing a pointer to skey to systable_beginscan() (it's an array already) * Edit some comments The patch works as expected (at least, i don't see any remaining issues). I'm going to mark this ready for committer. -- Thanks Bernd
Вложения
On Sat, Jan 23, 2010 at 1:45 PM, Bernd Helmle <mailings@oopsware.de> wrote: > --On 14. Januar 2010 16:04:17 +0900 KaiGai Kohei <kaigai@ak.jp.nec.com> > wrote: >> This patch adds: >> >> List *find_column_origin(Oid relOid, const char *colName) >> >> It returns the list of relation OIDs which originally defines the given >> column. In most cases, it returns a list with an element. But, if the >> column is inherited from multiple parent relations and merged during the >> inheritance tree, the returned list contains multiple OIDs. >> In this case, we have to forbid changing type and renaming to keep >> correctness of the table definition. > > Here's a slightly edited version of this patch from reviewing, fixing the > following: > > * Fix a compiler warning by passing a pointer to skey to > systable_beginscan() (it's an array already) > > * Edit some comments > > The patch works as expected (at least, i don't see any remaining issues). > I'm going to mark this ready for committer. I don't think this is ready for committer, becauseTom previously objected to the approach taken by this patch here, and no one has answered his objections: http://archives.postgresql.org/pgsql-hackers/2010-01/msg00144.php I think someone needs to figure out what the worst-case scenario for this is performance-wise and submit a reproducible test case with benchmark results. In the meantime, I'm going to set this to Waiting on Author. ...Robert
(2010/01/24 12:29), Robert Haas wrote: > On Sat, Jan 23, 2010 at 1:45 PM, Bernd Helmle<mailings@oopsware.de> wrote: >> --On 14. Januar 2010 16:04:17 +0900 KaiGai Kohei<kaigai@ak.jp.nec.com> >> wrote: >>> This patch adds: >>> >>> List *find_column_origin(Oid relOid, const char *colName) >>> >>> It returns the list of relation OIDs which originally defines the given >>> column. In most cases, it returns a list with an element. But, if the >>> column is inherited from multiple parent relations and merged during the >>> inheritance tree, the returned list contains multiple OIDs. >>> In this case, we have to forbid changing type and renaming to keep >>> correctness of the table definition. >> >> Here's a slightly edited version of this patch from reviewing, fixing the >> following: >> >> * Fix a compiler warning by passing a pointer to skey to >> systable_beginscan() (it's an array already) >> >> * Edit some comments >> >> The patch works as expected (at least, i don't see any remaining issues). >> I'm going to mark this ready for committer. > > I don't think this is ready for committer, becauseTom previously > objected to the approach taken by this patch here, and no one has > answered his objections: > > http://archives.postgresql.org/pgsql-hackers/2010-01/msg00144.php > > I think someone needs to figure out what the worst-case scenario for > this is performance-wise and submit a reproducible test case with > benchmark results. In the meantime, I'm going to set this to Waiting > on Author. Basically, I don't think it is not a performance focused command, because we may be able to assume ALTER TABLE RENAME TO is not much frequent operations. However, I'm interested in between two approaches. I'll check both of them. Isn't is necessary to compare the vanilla code base, because the matter is a bug to be fixed? http://archives.postgresql.org/message-id/4B41BB04.2070609@ak.jp.nec.com http://archives.postgresql.org/message-id/A7739F610FB0BD89E310D85E@[172.26.14.62] Please wait for weekday, because I don't have physical (not virtual) machine in my home. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
--On 23. Januar 2010 22:29:23 -0500 Robert Haas <robertmhaas@gmail.com> wrote: > I don't think this is ready for committer, becauseTom previously > objected to the approach taken by this patch here, and no one has > answered his objections: > > http://archives.postgresql.org/pgsql-hackers/2010-01/msg00144.php > Ugh, i thought KaiGai's revised patch here <http://archives.postgresql.org/message-id/4B41BB04.2070609@ak.jp.nec.com> was in response to Tom's complaint, since it modifies the method to identify column origins by recursivly scanning pg_inherits with its current inhrelid. > I think someone needs to figure out what the worst-case scenario for > this is performance-wise and submit a reproducible test case with > benchmark results. In the meantime, I'm going to set this to Waiting > on Author. Makes sense. -- Thanks Bernd
On Sat, Jan 23, 2010 at 10:48 PM, KaiGai Kohei <kaigai@kaigai.gr.jp> wrote: > (2010/01/24 12:29), Robert Haas wrote: >> I don't think this is ready for committer, becauseTom previously >> objected to the approach taken by this patch here, and no one has >> answered his objections: >> >> http://archives.postgresql.org/pgsql-hackers/2010-01/msg00144.php >> >> I think someone needs to figure out what the worst-case scenario for >> this is performance-wise and submit a reproducible test case with >> benchmark results. In the meantime, I'm going to set this to Waiting >> on Author. > > Basically, I don't think it is not a performance focused command, > because we may be able to assume ALTER TABLE RENAME TO is not much > frequent operations. I agree - the requirements here are much looser than for, say, SELECT or UPDATE. But it still has to not suck. I think the problem case here might be something like this... create ten tables A1 through A10. Now create 10 more tables B1 through B10 each of which inherits from all of A1 through A10. Now create 10 more tables C1 through C10 that inherit from B1 through B10. Now create 1000 tables D1 through D1000 that inherit from C1 through C10. Now drop a column from A1. ...Robert
On Sun, Jan 24, 2010 at 8:36 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Sat, Jan 23, 2010 at 10:48 PM, KaiGai Kohei <kaigai@kaigai.gr.jp> wrote: >> (2010/01/24 12:29), Robert Haas wrote: >>> I don't think this is ready for committer, becauseTom previously >>> objected to the approach taken by this patch here, and no one has >>> answered his objections: >>> >>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg00144.php >>> >>> I think someone needs to figure out what the worst-case scenario for >>> this is performance-wise and submit a reproducible test case with >>> benchmark results. In the meantime, I'm going to set this to Waiting >>> on Author. >> >> Basically, I don't think it is not a performance focused command, >> because we may be able to assume ALTER TABLE RENAME TO is not much >> frequent operations. > > I agree - the requirements here are much looser than for, say, SELECT > or UPDATE. But it still has to not suck. > > I think the problem case here might be something like this... create > ten tables A1 through A10. Now create 10 more tables B1 through B10 > each of which inherits from all of A1 through A10. Now create 10 more > tables C1 through C10 that inherit from B1 through B10. Now create > 1000 tables D1 through D1000 that inherit from C1 through C10. Now > drop a column from A1. Er... rename a column from A1, not drop. ...Robert
--On 24. Januar 2010 08:37:13 -0500 Robert Haas <robertmhaas@gmail.com> wrote: >> >> I agree - the requirements here are much looser than for, say, SELECT >> or UPDATE. But it still has to not suck. >> Yeah, i think the meaning of "suck" can be much weakier than for a DML command. However, if it would degrade the performance of a formerly well running command in a way, that it would be unusable, that would be annoying. >> I think the problem case here might be something like this... create >> ten tables A1 through A10. Now create 10 more tables B1 through B10 >> each of which inherits from all of A1 through A10. Now create 10 more >> tables C1 through C10 that inherit from B1 through B10. Now create >> 1000 tables D1 through D1000 that inherit from C1 through C10. Now >> drop a column from A1. > > Er... rename a column from A1, not drop. > Did that with a crude pl/pgsql script, and got the following numbers: Current -HEAD: Phenom-II 2.6 GHz: Time: 282,471 ms MacBook: Time: 499,866 ms With KaiGais recent patch (which covers the TYPE case, too): Phenom-II 2.6 GHz: Time: 476,800 ms MacBook: Time: 753,161 ms -- Thanks Bernd
Bernd Helmle <mailings@oopsware.de> writes: > --On 24. Januar 2010 08:37:13 -0500 Robert Haas <robertmhaas@gmail.com> > wrote: >> I think the problem case here might be something like this... > Did that with a crude pl/pgsql script, and got the following numbers: I think my concern about the original proposal was that the time to perform an ALTER RENAME would increase with the number of tables in the database, even if they were entirely unrelated to the one you're trying to rename. It's not clear to me whether the present coding of the patch avoids that. But in any case, the alternative implementation I suggested would have added essentially zero runtime, so even a 50% slowdown seems like sacrificing a lot. regards, tom lane
--On 24. Januar 2010 13:23:14 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think my concern about the original proposal was that the time to > perform an ALTER RENAME would increase with the number of tables in the > database, even if they were entirely unrelated to the one you're trying > to rename. Uhm, find_column_origin() scans pg_inherits recursively by looking up inhparent from the given inhrelid. I don't see where this should be related to the number of tables not part of the inheritance tree (or inheritance at all). -- Thanks Bernd
--On 24. Januar 2010 19:45:33 +0100 Bernd Helmle <mailings@oopsware.de> wrote: > I don't see where this should be related to the number of tables not > part of the inheritance tree (or inheritance at all). To answer that myself: it seems get_attname() introduces the overhead here (forgot about that). Creating additional 16384 tables without any connection to the inheritance increases the times on my Phenom-II Box to round about 2 seconds: Current -HEAD bernd=# ALTER TABLE a1 RENAME COLUMN acol1 TO xyz; ALTER TABLE Time: 409,045 ms With KaiGai's recent patch: bernd=# ALTER TABLE a1 RENAME COLUMN acol1 TO xyz; ALTER TABLE Time: 2402,306 ms -- Thanks Bernd
On Sun, Jan 24, 2010 at 2:01 PM, Bernd Helmle <mailings@oopsware.de> wrote: > --On 24. Januar 2010 19:45:33 +0100 Bernd Helmle <mailings@oopsware.de> > wrote: >> I don't see where this should be related to the number of tables not >> part of the inheritance tree (or inheritance at all). > To answer that myself: it seems get_attname() introduces the overhead here > (forgot about that). Creating additional 16384 tables without any connection > to the inheritance increases the times on my Phenom-II Box to round about 2 > seconds: > > Current -HEAD > > bernd=# ALTER TABLE a1 RENAME COLUMN acol1 TO xyz; > ALTER TABLE > Time: 409,045 ms > > With KaiGai's recent patch: > > bernd=# ALTER TABLE a1 RENAME COLUMN acol1 TO xyz; > ALTER TABLE > Time: 2402,306 ms Hmm, so adding those tables slowed down HEAD by <50%, but the time with KaiGai's patch more than trippled. So I think we definitely need KaiGai to try it the other way and see how that shakes out... ...Robert
(2010/01/25 4:01), Bernd Helmle wrote: > > > --On 24. Januar 2010 19:45:33 +0100 Bernd Helmle <mailings@oopsware.de> > wrote: > >> I don't see where this should be related to the number of tables not >> part of the inheritance tree (or inheritance at all). > > To answer that myself: it seems get_attname() introduces the overhead > here (forgot about that). Creating additional 16384 tables without any > connection to the inheritance increases the times on my Phenom-II Box to > round about 2 seconds: > > > Current -HEAD > > bernd=# ALTER TABLE a1 RENAME COLUMN acol1 TO xyz; > ALTER TABLE > Time: 409,045 ms > > > With KaiGai's recent patch: > > bernd=# ALTER TABLE a1 RENAME COLUMN acol1 TO xyz; > ALTER TABLE > Time: 2402,306 ms Hmm.... Bernd, could you try same test with previous patch? http://archives.postgresql.org/message-id/4B41BB04.2070609@ak.jp.nec.com It computes an expected inhcount during find_all_inheritors(), and compares it with the target pg_attribute entry? I'll also try to measure performance in three cases by myself. Please wait for a while... Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
(2010/01/25 8:45), KaiGai Kohei wrote: > (2010/01/25 4:01), Bernd Helmle wrote: >> >> >> --On 24. Januar 2010 19:45:33 +0100 Bernd Helmle<mailings@oopsware.de> >> wrote: >> >>> I don't see where this should be related to the number of tables not >>> part of the inheritance tree (or inheritance at all). >> >> To answer that myself: it seems get_attname() introduces the overhead >> here (forgot about that). Creating additional 16384 tables without any >> connection to the inheritance increases the times on my Phenom-II Box to >> round about 2 seconds: >> >> >> Current -HEAD >> >> bernd=# ALTER TABLE a1 RENAME COLUMN acol1 TO xyz; >> ALTER TABLE >> Time: 409,045 ms >> >> >> With KaiGai's recent patch: >> >> bernd=# ALTER TABLE a1 RENAME COLUMN acol1 TO xyz; >> ALTER TABLE >> Time: 2402,306 ms > > Hmm.... > > Bernd, could you try same test with previous patch? > http://archives.postgresql.org/message-id/4B41BB04.2070609@ak.jp.nec.com > > It computes an expected inhcount during find_all_inheritors(), and > compares it with the target pg_attribute entry? > > I'll also try to measure performance in three cases by myself. > Please wait for a while... I set up 11,111 of tables inherited from a common table. (echo "CREATE TABLE t (a int);"for i in `seq 0 9`; do echo "CREATE TABLE s$i (b int) INHERITS(t);" for j in `seq 09`; do echo "CREATE TABLE v$i$j (c int) INHERITS(s$i);" for k in `seq 0 9`; do echo "CREATE TABLEw$i$j$k (d int) INHERITS(v$i$j);" for l in `seq 0 9`; do echo "CREATE TABLE x$i$j$k$l (e int)INHERITS(w$i$j$k);" done done donedone) | psql test And, I measured time to execute the following query using /usr/bin/time. ALTER TABLE t RENAME a TO aa; ALTER TABLE t RENAME aa TO aaa; ALTER TABLE t RENAME aaa TO aaaa; ALTER TABLE t RENAME aaaaTO aaaaa; ALTER TABLE t RENAME aaaaa TO a; The platform is Pentium4 (3.20GHz) and generic SATA drive. * CVS HEAD - does not care anything Avg: 25.840s (25.663s 24.214s 26.958s 26.525s) * My rev.3 patch - find_all_inheritors_with_inhcount() Avg: 26.488s (25.197s 27.847s 25.487s 27.421s) * My rev.4 patch - find_column_origin(), also fixes AT_AlterColumnType case Avg: 28.693s (27.936s 30.295s 29.385s 27.159s) It seems to me the result is different from Bernd's report. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
2010/1/24 KaiGai Kohei <kaigai@ak.jp.nec.com>: > It seems to me the result is different from Bernd's report. Well, you tested something different, so you got a different answer. Your case doesn't have any multiple inheritance. ...Robert
(2010/01/25 14:08), Robert Haas wrote: > 2010/1/24 KaiGai Kohei<kaigai@ak.jp.nec.com>: >> It seems to me the result is different from Bernd's report. > > Well, you tested something different, so you got a different answer. > Your case doesn't have any multiple inheritance. If it tries ALTER TABLE xxx RENAME TO on any multiple inheritance column, this patch will raise an error when it founds the first column unable to rename. (Of course, it takes inconsistency in table definitions, so we need to prevent it.) It does not make sense in performance comparison. The issue is whether we need to check pg_inherits for each recursion level in renameatt(), or not. So, I checked the case when we try to rename the root of inheritance tree. Or, are you saying to test diamond-inheritance cases? Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
2010/1/25 KaiGai Kohei <kaigai@ak.jp.nec.com>: > Or, are you saying to test diamond-inheritance cases? Please go back and read the test case that I already proposed. ...Robert
--On 25. Januar 2010 11:39:21 +0900 KaiGai Kohei <kaigai@ak.jp.nec.com> wrote: > (echo "CREATE TABLE t (a int);" > for i in `seq 0 9`; do > echo "CREATE TABLE s$i (b int) INHERITS(t);" > for j in `seq 0 9`; do > echo "CREATE TABLE v$i$j (c int) INHERITS(s$i);" > for k in `seq 0 9`; do > echo "CREATE TABLE w$i$j$k (d int) INHERITS(v$i$j);" > for l in `seq 0 9`; do > echo "CREATE TABLE x$i$j$k$l (e int) INHERITS(w$i$j$k);" > done > done > done > done) | psql test Well, each table inherits one table in your test. In my test, I inherit from multiple tables for each table. My script generates the following inheritance tree (and wins a price of copy & paste ugliness, see attachment): A1, A2, A3, ..., Am B1 INHERITS(A1...A10), B2 INHERITS(A1...A10, B3 INHERITS(A1...A10), ...Bn C1 INHERITS(B1...B10), C2 INHERITS(B1...B10), ... Co D1 INHERITS(C1...C10), ..., Dp m = 10 n = 10 o = 10 p = 1000 Repeating this on my MacBook gives: ALTER TABLE a1 RENAME COLUMN acol1 TO xyz; -HEAD: Time: 382,427 ms Time: 375,974 ms Time: 385,478 ms Time: 371,067 ms Time: 410,834 ms Time: 386,382 ms Recent V4 patch: Time: 6065,673 ms Time: 3823,206 ms Time: 4037,933 ms Time: 3873,029 ms Time: 3899,607 ms Time: 3963,308 ms Note that you have to increase max_locks_per_transaction to run the script, i used pg_ctl -o '--checkpoint-segments=32 --max-locks-per-transaction=128' start -- Thanks Bernd
Вложения
(2010/01/26 1:11), Bernd Helmle wrote: > > > --On 25. Januar 2010 11:39:21 +0900 KaiGai Kohei <kaigai@ak.jp.nec.com> > wrote: > >> (echo "CREATE TABLE t (a int);" >> for i in `seq 0 9`; do >> echo "CREATE TABLE s$i (b int) INHERITS(t);" >> for j in `seq 0 9`; do >> echo "CREATE TABLE v$i$j (c int) INHERITS(s$i);" >> for k in `seq 0 9`; do >> echo "CREATE TABLE w$i$j$k (d int) INHERITS(v$i$j);" >> for l in `seq 0 9`; do >> echo "CREATE TABLE x$i$j$k$l (e int) INHERITS(w$i$j$k);" >> done >> done >> done >> done) | psql test > > Well, each table inherits one table in your test. In my test, I inherit > from multiple tables for each table. My script generates the following > inheritance tree (and wins a price of copy & paste ugliness, see > attachment): > > A1, A2, A3, ..., Am > B1 INHERITS(A1...A10), B2 INHERITS(A1...A10, B3 INHERITS(A1...A10), ...Bn > C1 INHERITS(B1...B10), C2 INHERITS(B1...B10), ... Co > D1 INHERITS(C1...C10), ..., Dp > > m = 10 > n = 10 > o = 10 > p = 1000 > > Repeating this on my MacBook gives: > > ALTER TABLE a1 RENAME COLUMN acol1 TO xyz; > > -HEAD: > > Time: 382,427 ms > Time: 375,974 ms > Time: 385,478 ms > Time: 371,067 ms > Time: 410,834 ms > Time: 386,382 ms > > Recent V4 patch: > > Time: 6065,673 ms > Time: 3823,206 ms > Time: 4037,933 ms > Time: 3873,029 ms > Time: 3899,607 ms > Time: 3963,308 ms Hmm... I also could observe similar result in 4 times iteration of ALTER TABLE with your test_rename.sql. I agree the recent V4 patch is poor in performance perspective. * CVS HEAD 0.828s 0.828s 0.833s 0.829s 0.838s * Rcent V4 patch:10.283s10.135s10.107s10.382s10.162s * Previous V3 patch: 2.607s 2.429s 2.431s 2.436s 2.428s The V3 patch is designed to compute an expected inhcount for each relations to be altered at first, then it shall be compared to pg_attribute.inhcount to be renamed. Basically, its execution cost is same order except for a case when a relation has diamond inheritance tree. The find_all_inheritors() does not check child relations which is already scanned. However, in this case, we have to check how many times is the child relation inherited from a common origin. I guess it is reason of the different between -HEAD and V3. For example, if we have the following inheritance tree, A2 A5 / \ \ A1 A4 \ / \ A3 -- A6 The find_all_inheritors() checks existence of directly inherited relations at A1, ... , A6 without any duplications, because this function does not intend to compute how many times was it inherited. The find_all_inheritors_with_inhcount() in V3 patch checks existence of directly inherited relations, even if the target relation is already checked, because it also has to return the times to be inherited from a common origin. In this example, it considers the above structure is same as the following tree. In this diagram, we can find A4 and A5 twice, and A6 thrice. A5 / A2 - A4 - A6 \ A1 \ A3 - A4 - A6 \ \ A6 A5 Thus, the test_rename.sql was the worst case test for V3 also. However, I don't think we should keep the bug in the next release. The CVS HEAD's performance is the result of omission for necessary checks. I think we should back to the V3 patch approach, and also reconsider the logic in ATPrepAlterColumnType(). Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
The attached patch is revised one based on the V3 approach. The only difference from V3 is that it also applies checks on the AT_AlterColumnType option, not only renameatt(). The performance was almost same as the V3 case. > * CVS HEAD > 0.828s > 0.828s > 0.833s > 0.829s > 0.838s - ALTER RENAME TO with V5 patch 2.419s 2.418s 2.418s 2.426s I also checked ALTER ... TYPE cases. It is relatively heavy operation than renameatt(), so its affects was relatively smaller. - ALTER ... TYPE with CVS HEAD 28.888s 29.948s 30.738s 30.600s - ALTER ... TYPE with V5 patch 28.067s 28.212s 28.038s 29.497s (2010/01/26 10:10), KaiGai Kohei wrote: > (2010/01/26 1:11), Bernd Helmle wrote: >> >> >> --On 25. Januar 2010 11:39:21 +0900 KaiGai Kohei<kaigai@ak.jp.nec.com> >> wrote: >> >>> (echo "CREATE TABLE t (a int);" >>> for i in `seq 0 9`; do >>> echo "CREATE TABLE s$i (b int) INHERITS(t);" >>> for j in `seq 0 9`; do >>> echo "CREATE TABLE v$i$j (c int) INHERITS(s$i);" >>> for k in `seq 0 9`; do >>> echo "CREATE TABLE w$i$j$k (d int) INHERITS(v$i$j);" >>> for l in `seq 0 9`; do >>> echo "CREATE TABLE x$i$j$k$l (e int) INHERITS(w$i$j$k);" >>> done >>> done >>> done >>> done) | psql test >> >> Well, each table inherits one table in your test. In my test, I inherit >> from multiple tables for each table. My script generates the following >> inheritance tree (and wins a price of copy& paste ugliness, see >> attachment): >> >> A1, A2, A3, ..., Am >> B1 INHERITS(A1...A10), B2 INHERITS(A1...A10, B3 INHERITS(A1...A10), ...Bn >> C1 INHERITS(B1...B10), C2 INHERITS(B1...B10), ... Co >> D1 INHERITS(C1...C10), ..., Dp >> >> m = 10 >> n = 10 >> o = 10 >> p = 1000 >> >> Repeating this on my MacBook gives: >> >> ALTER TABLE a1 RENAME COLUMN acol1 TO xyz; >> >> -HEAD: >> >> Time: 382,427 ms >> Time: 375,974 ms >> Time: 385,478 ms >> Time: 371,067 ms >> Time: 410,834 ms >> Time: 386,382 ms >> >> Recent V4 patch: >> >> Time: 6065,673 ms >> Time: 3823,206 ms >> Time: 4037,933 ms >> Time: 3873,029 ms >> Time: 3899,607 ms >> Time: 3963,308 ms > > Hmm... I also could observe similar result in 4 times iteration of > ALTER TABLE with your test_rename.sql. > I agree the recent V4 patch is poor in performance perspective. > > * CVS HEAD > 0.828s > 0.828s > 0.833s > 0.829s > 0.838s > > * Rcent V4 patch: > 10.283s > 10.135s > 10.107s > 10.382s > 10.162s > > * Previous V3 patch: > 2.607s > 2.429s > 2.431s > 2.436s > 2.428s > > The V3 patch is designed to compute an expected inhcount for each relations > to be altered at first, then it shall be compared to pg_attribute.inhcount > to be renamed. > > Basically, its execution cost is same order except for a case when a relation > has diamond inheritance tree. > > The find_all_inheritors() does not check child relations which is already > scanned. However, in this case, we have to check how many times is the child > relation inherited from a common origin. > I guess it is reason of the different between -HEAD and V3. > > For example, if we have the following inheritance tree, > > A2 A5 > / \ \ > A1 A4 > \ / \ > A3 -- A6 > > The find_all_inheritors() checks existence of directly inherited relations > at A1, ... , A6 without any duplications, because this function does not > intend to compute how many times was it inherited. > > The find_all_inheritors_with_inhcount() in V3 patch checks existence of > directly inherited relations, even if the target relation is already checked, > because it also has to return the times to be inherited from a common origin. > In this example, it considers the above structure is same as the following > tree. In this diagram, we can find A4 and A5 twice, and A6 thrice. > > A5 > / > A2 - A4 - A6 > \ > A1 > \ > A3 - A4 - A6 > \ \ > A6 A5 > > Thus, the test_rename.sql was the worst case test for V3 also. > However, I don't think we should keep the bug in the next release. > The CVS HEAD's performance is the result of omission for necessary checks. > > I think we should back to the V3 patch approach, and also reconsider > the logic in ATPrepAlterColumnType(). > > Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
Вложения
2010/1/27 KaiGai Kohei <kaigai@ak.jp.nec.com>: > The attached patch is revised one based on the V3 approach. > The only difference from V3 is that it also applies checks on the > AT_AlterColumnType option, not only renameatt(). I think I was clear about what the next step was for this patch in my previous email, but let me try again. http://archives.postgresql.org/pgsql-hackers/2010-01/msg02407.php See also Tom's comments here: http://archives.postgresql.org/pgsql-hackers/2010-01/msg00110.php I don't believe that either Tom or I are prepared to commit a patch based on this approach, at least not unless someone makes an attempt to do it the other way and finds an even more serious problem. If you're not interested in rewriting the patch along the lines Tom suggested, then we should just mark this as Returned with Feedback and move on. ...Robert
(2010/01/27 23:29), Robert Haas wrote: > 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>: >> The attached patch is revised one based on the V3 approach. >> The only difference from V3 is that it also applies checks on the >> AT_AlterColumnType option, not only renameatt(). > > I think I was clear about what the next step was for this patch in my > previous email, but let me try again. > > http://archives.postgresql.org/pgsql-hackers/2010-01/msg02407.php > > See also Tom's comments here: > > http://archives.postgresql.org/pgsql-hackers/2010-01/msg00110.php > > I don't believe that either Tom or I are prepared to commit a patch > based on this approach, at least not unless someone makes an attempt > to do it the other way and finds an even more serious problem. If > you're not interested in rewriting the patch along the lines Tom > suggested, then we should just mark this as Returned with Feedback and > move on. The V3/V5 patch was the rewritten one based on the Tom's comment, as is. It counts the expected inhcount at the first find_all_inheritors() time at once, and it compares the pg_attribute.attinhcount. (In actually, find_all_inheritors() does not have a capability to count the number of merged from a common origin, so I newly defined the find_all_inheritors_with_inhcount().) Am I missing something? Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
On Wed, Jan 27, 2010 at 10:17 AM, KaiGai Kohei <kaigai@kaigai.gr.jp> wrote: > (2010/01/27 23:29), Robert Haas wrote: >> >> 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>> >>> The attached patch is revised one based on the V3 approach. >>> The only difference from V3 is that it also applies checks on the >>> AT_AlterColumnType option, not only renameatt(). >> >> I think I was clear about what the next step was for this patch in my >> previous email, but let me try again. >> >> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02407.php >> >> See also Tom's comments here: >> >> http://archives.postgresql.org/pgsql-hackers/2010-01/msg00110.php >> >> I don't believe that either Tom or I are prepared to commit a patch >> based on this approach, at least not unless someone makes an attempt >> to do it the other way and finds an even more serious problem. If >> you're not interested in rewriting the patch along the lines Tom >> suggested, then we should just mark this as Returned with Feedback and >> move on. > > The V3/V5 patch was the rewritten one based on the Tom's comment, as is. > It counts the expected inhcount at the first find_all_inheritors() time > at once, and it compares the pg_attribute.attinhcount. > (In actually, find_all_inheritors() does not have a capability to count > the number of merged from a common origin, so I newly defined the > find_all_inheritors_with_inhcount().) > > Am I missing something? Err... I'm not sure. I thought I understood what the different versions of this patch were doing, but apparently I'm all confused. I'll take another look at this. Bernd (or anyone), feel free to take a look in parallel. More eyes would be helpful... ...Robert
--On 27. Januar 2010 15:42:45 -0500 Robert Haas <robertmhaas@gmail.com> wrote: > > Bernd (or anyone), feel free to take a look in parallel. More eyes > would be helpful... I've planned to look at this tomorrow when i'm back in office. -- Thanks Bernd
On Wed, Jan 27, 2010 at 3:42 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Jan 27, 2010 at 10:17 AM, KaiGai Kohei <kaigai@kaigai.gr.jp> wrote: >> (2010/01/27 23:29), Robert Haas wrote: >>> >>> 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>>> >>>> The attached patch is revised one based on the V3 approach. >>>> The only difference from V3 is that it also applies checks on the >>>> AT_AlterColumnType option, not only renameatt(). >>> >>> I think I was clear about what the next step was for this patch in my >>> previous email, but let me try again. >>> >>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02407.php >>> >>> See also Tom's comments here: >>> >>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg00110.php >>> >>> I don't believe that either Tom or I are prepared to commit a patch >>> based on this approach, at least not unless someone makes an attempt >>> to do it the other way and finds an even more serious problem. If >>> you're not interested in rewriting the patch along the lines Tom >>> suggested, then we should just mark this as Returned with Feedback and >>> move on. >> >> The V3/V5 patch was the rewritten one based on the Tom's comment, as is. >> It counts the expected inhcount at the first find_all_inheritors() time >> at once, and it compares the pg_attribute.attinhcount. >> (In actually, find_all_inheritors() does not have a capability to count >> the number of merged from a common origin, so I newly defined the >> find_all_inheritors_with_inhcount().) >> >> Am I missing something? > > Err... I'm not sure. I thought I understood what the different > versions of this patch were doing, but apparently I'm all confused. > I'll take another look at this. OK, I took a look at this. It's busted: test=# create table top (a integer); CREATE TABLE test=# create table upper1 () inherits (top); CREATE TABLE test=# create table upper2 () inherits (top); CREATE TABLE test=# create table lower1 () inherits (upper1, upper2); NOTICE: merging multiple inherited definitions of column "a" CREATE TABLE test=# create table lower2 () inherits (upper1, upper2); NOTICE: merging multiple inherited definitions of column "a" CREATE TABLE test=# create table spoiler (a integer); CREATE TABLE test=# create table bottom () inherits (lower1, lower2, spoiler); NOTICE: merging multiple inherited definitions of column "a" NOTICE: merging multiple inherited definitions of column "a" CREATE TABLE test=# alter table top rename a to b; ALTER TABLE test=# select * from spoiler; ERROR: could not find inherited attribute "a" of relation "bottom" Also, there is a compiler warning due to an unused variable that should be fixed. I'll mark this Waiting on Author. ...Robert
(2010/01/28 5:42), Robert Haas wrote: > On Wed, Jan 27, 2010 at 10:17 AM, KaiGai Kohei<kaigai@kaigai.gr.jp> wrote: >> (2010/01/27 23:29), Robert Haas wrote: >>> >>> 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>>> >>>> The attached patch is revised one based on the V3 approach. >>>> The only difference from V3 is that it also applies checks on the >>>> AT_AlterColumnType option, not only renameatt(). >>> >>> I think I was clear about what the next step was for this patch in my >>> previous email, but let me try again. >>> >>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02407.php >>> >>> See also Tom's comments here: >>> >>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg00110.php >>> >>> I don't believe that either Tom or I are prepared to commit a patch >>> based on this approach, at least not unless someone makes an attempt >>> to do it the other way and finds an even more serious problem. If >>> you're not interested in rewriting the patch along the lines Tom >>> suggested, then we should just mark this as Returned with Feedback and >>> move on. >> >> The V3/V5 patch was the rewritten one based on the Tom's comment, as is. >> It counts the expected inhcount at the first find_all_inheritors() time >> at once, and it compares the pg_attribute.attinhcount. >> (In actually, find_all_inheritors() does not have a capability to count >> the number of merged from a common origin, so I newly defined the >> find_all_inheritors_with_inhcount().) >> >> Am I missing something? > > Err... I'm not sure. I thought I understood what the different > versions of this patch were doing, but apparently I'm all confused. > I'll take another look at this. Just to be safe, I'd like to introduce the differences between revisions. * The V2 patch http://archives.postgresql.org/message-id/4B3F6353.3080105@kaigai.gr.jp It just checked whether the pg_attribute.inhcount is larger than 1, or not. But it was problematic when diamond-inheritance case. * The V3 patch http://archives.postgresql.org/message-id/4B41BB04.2070609@ak.jp.nec.com It computed an expected-inhcount for each relations when renameatt() picks up all the child relations on the top of recursion level. Its cost to walk on the inheritance tree is similar to existing code except for a case when we have many diamond-inheritance, because find_all_inheritors() does not need to walk on the child relations that was already checked, but find_all_inheritors_with_inhcount() has to walk on these children to compute multiplicity. See the following example, T2 / \ T1 T4 - T5 \ / T3 In this case, find_all_inheritors() encounter T4 with T1-T2 path and T1-T3 path. But it does not need to scan T5 on the second time, because it already chains OID of the T4 and T5 with the first walks, and list_concat_unique_oid() does not add duplicated OIDs anymore. But find_all_inheritors_with_inhcount() needs to walks on the T4-T5 path to compute the number of times being inherited, even if second walks. Your testcase emphasized this difference so much. But, in my opinion, it is quite natural because the existing code does not apply necessary checks. * The V4 patch http://archives.postgresql.org/message-id/4B4EC1F1.30108@ak.jp.nec.com I found out ALTER COLUMN TYPE also has same issue. But ATPrepAlterColumnType() did recursive scan using ATSimpleRecursion(), so it needs to modify the logic in this function. At that time, I preferred to compute an expected inhcount for each recursion level, rather than modifying the logic in ATPrepAlterColumnType(), although its cost was larger than find_all_inheritors_with_inhcount(). * The V5 patch http://archives.postgresql.org/message-id/4B5FFE4B.1060505@ak.jp.nec.com We can find out a performance matter in the V4 patch, so I reverted the base logic into V3 approach. In addition to the reverting, I also modified the ATPrepAlterColumnType() to check whether the column to be altered is inherited from multiple origin, or not. > Bernd (or anyone), feel free to take a look in parallel. More eyes > would be helpful... > > ...Robert > -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
(2010/01/28 6:58), Robert Haas wrote: > On Wed, Jan 27, 2010 at 3:42 PM, Robert Haas<robertmhaas@gmail.com> wrote: >> On Wed, Jan 27, 2010 at 10:17 AM, KaiGai Kohei<kaigai@kaigai.gr.jp> wrote: >>> (2010/01/27 23:29), Robert Haas wrote: >>>> >>>> 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>>>> >>>>> The attached patch is revised one based on the V3 approach. >>>>> The only difference from V3 is that it also applies checks on the >>>>> AT_AlterColumnType option, not only renameatt(). >>>> >>>> I think I was clear about what the next step was for this patch in my >>>> previous email, but let me try again. >>>> >>>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02407.php >>>> >>>> See also Tom's comments here: >>>> >>>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg00110.php >>>> >>>> I don't believe that either Tom or I are prepared to commit a patch >>>> based on this approach, at least not unless someone makes an attempt >>>> to do it the other way and finds an even more serious problem. If >>>> you're not interested in rewriting the patch along the lines Tom >>>> suggested, then we should just mark this as Returned with Feedback and >>>> move on. >>> >>> The V3/V5 patch was the rewritten one based on the Tom's comment, as is. >>> It counts the expected inhcount at the first find_all_inheritors() time >>> at once, and it compares the pg_attribute.attinhcount. >>> (In actually, find_all_inheritors() does not have a capability to count >>> the number of merged from a common origin, so I newly defined the >>> find_all_inheritors_with_inhcount().) >>> >>> Am I missing something? >> >> Err... I'm not sure. I thought I understood what the different >> versions of this patch were doing, but apparently I'm all confused. >> I'll take another look at this. > > OK, I took a look at this. It's busted: > > test=# create table top (a integer); > CREATE TABLE > test=# create table upper1 () inherits (top); > CREATE TABLE > test=# create table upper2 () inherits (top); > CREATE TABLE > test=# create table lower1 () inherits (upper1, upper2); > NOTICE: merging multiple inherited definitions of column "a" > CREATE TABLE > test=# create table lower2 () inherits (upper1, upper2); > NOTICE: merging multiple inherited definitions of column "a" > CREATE TABLE > test=# create table spoiler (a integer); > CREATE TABLE > test=# create table bottom () inherits (lower1, lower2, spoiler); > NOTICE: merging multiple inherited definitions of column "a" > NOTICE: merging multiple inherited definitions of column "a" > CREATE TABLE > test=# alter table top rename a to b; > ALTER TABLE > test=# select * from spoiler; > ERROR: could not find inherited attribute "a" of relation "bottom" Hmm, indeed, this logic (V3/V5) is busted. The idea of V4 patch can also handle this case correctly, although it is lesser in performance. I wonder whether it is really unacceptable cost in performance, or not. Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations, and I don't think this bugfix will damage to the reputation of PostgreSQL. Where should we go on the next? Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
2010/1/27 KaiGai Kohei <kaigai@ak.jp.nec.com>: > Hmm, indeed, this logic (V3/V5) is busted. > > The idea of V4 patch can also handle this case correctly, although it > is lesser in performance. > I wonder whether it is really unacceptable cost in performance, or not. > Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations, > and I don't think this bugfix will damage to the reputation of PostgreSQL. > > Where should we go on the next? Isn't the problem here just that the following comment is 100% wrong? /* * Unlike find_all_inheritors(), we need to walk on child relations * that have diamond inheritance tree, because this function has to * return correct expected inhecount to the caller. */ It seems to me that the right solution here is to just add one more argument to find_all_inheritors(), something like List **expected_inh_count. Am I missing something? ...Robert
(2010/01/29 0:46), Robert Haas wrote: > 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>: >> Hmm, indeed, this logic (V3/V5) is busted. >> >> The idea of V4 patch can also handle this case correctly, although it >> is lesser in performance. >> I wonder whether it is really unacceptable cost in performance, or not. >> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations, >> and I don't think this bugfix will damage to the reputation of PostgreSQL. >> >> Where should we go on the next? > > Isn't the problem here just that the following comment is 100% wrong? > > /* > * Unlike find_all_inheritors(), we need to walk on > child relations > * that have diamond inheritance tree, because this > function has to > * return correct expected inhecount to the caller. > */ > > It seems to me that the right solution here is to just add one more > argument to find_all_inheritors(), something like List > **expected_inh_count. > > Am I missing something? The find_all_inheritors() does not walk on child relations more than two times, even if a child has multiple parents inherited from common origin, because list_concat_unique_oid() ignores the given OID if it is already on the list. It means all the child relations under the relation already walked on does not checked anywhere. (Of course, this assumption is correct for the purpose of find_all_inheritors() with minimum cost.) What we want to do here is to compute the number of times a certain child relation is inherited from a common origin; it shall be the expected-inhcount. So, we need an arrangement to the logic. For example, see the following diagram. T2 / \ T1 T4---T5 \ / T3 If we call find_all_inheritors() with T1. The find_inheritance_children() returns T2 and T3 for T1. Then, it calls find_inheritance_children() for T2, and it returns T4. Then, it calls find_inheritance_children() for T3, and it returns T4, but it is already in the "rels_list", so list_concat_unique_oid() ignores it. Then, it calls find_inheritance_children() for T4, and it returns T5. In this example, we want the expected inhcount for T2 and T3 should be 1, for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so they will have 1 incorrectly. Even if we count up the ignored OID (T4), find_all_inheritors() does not walk on T5, because it is already walked on obviously when T4 is ignored. However, my V3/V5 logic is also busted when a certain relation is inherited from a relation which has multiple parents. Right now, we have only the V4 logic which works correctly.... Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
2010/1/28 KaiGai Kohei <kaigai@ak.jp.nec.com>: > (2010/01/29 0:46), Robert Haas wrote: >> 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>> Hmm, indeed, this logic (V3/V5) is busted. >>> >>> The idea of V4 patch can also handle this case correctly, although it >>> is lesser in performance. >>> I wonder whether it is really unacceptable cost in performance, or not. >>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations, >>> and I don't think this bugfix will damage to the reputation of PostgreSQL. >>> >>> Where should we go on the next? >> >> Isn't the problem here just that the following comment is 100% wrong? >> >> /* >> * Unlike find_all_inheritors(), we need to walk on >> child relations >> * that have diamond inheritance tree, because this >> function has to >> * return correct expected inhecount to the caller. >> */ >> >> It seems to me that the right solution here is to just add one more >> argument to find_all_inheritors(), something like List >> **expected_inh_count. >> >> Am I missing something? > > The find_all_inheritors() does not walk on child relations more than > two times, even if a child has multiple parents inherited from common > origin, because list_concat_unique_oid() ignores the given OID if it > is already on the list. It means all the child relations under the > relation already walked on does not checked anywhere. (Of course, > this assumption is correct for the purpose of find_all_inheritors() > with minimum cost.) > > What we want to do here is to compute the number of times a certain > child relation is inherited from a common origin; it shall be the > expected-inhcount. So, we need an arrangement to the logic. > > For example, see the following diagram. > > T2 > / \ > T1 T4---T5 > \ / > T3 > > If we call find_all_inheritors() with T1. The find_inheritance_children() > returns T2 and T3 for T1. > Then, it calls find_inheritance_children() for T2, and it returns T4. > Then, it calls find_inheritance_children() for T3, and it returns T4, but > it is already in the "rels_list", so list_concat_unique_oid() ignores it. > Then, it calls find_inheritance_children() for T4, and it returns T5. > > In this example, we want the expected inhcount for T2 and T3 should be 1, > for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so > they will have 1 incorrectly. > Even if we count up the ignored OID (T4), find_all_inheritors() does not > walk on T5, because it is already walked on obviously when T4 is ignored. I think the count for T5 should be 1, and I think that the count for T4 can easily be made to be 2 by coding the algorithm correctly. ...Robert
(2010/01/29 9:29), Robert Haas wrote: > 2010/1/28 KaiGai Kohei<kaigai@ak.jp.nec.com>: >> (2010/01/29 0:46), Robert Haas wrote: >>> 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>>> Hmm, indeed, this logic (V3/V5) is busted. >>>> >>>> The idea of V4 patch can also handle this case correctly, although it >>>> is lesser in performance. >>>> I wonder whether it is really unacceptable cost in performance, or not. >>>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations, >>>> and I don't think this bugfix will damage to the reputation of PostgreSQL. >>>> >>>> Where should we go on the next? >>> >>> Isn't the problem here just that the following comment is 100% wrong? >>> >>> /* >>> * Unlike find_all_inheritors(), we need to walk on >>> child relations >>> * that have diamond inheritance tree, because this >>> function has to >>> * return correct expected inhecount to the caller. >>> */ >>> >>> It seems to me that the right solution here is to just add one more >>> argument to find_all_inheritors(), something like List >>> **expected_inh_count. >>> >>> Am I missing something? >> >> The find_all_inheritors() does not walk on child relations more than >> two times, even if a child has multiple parents inherited from common >> origin, because list_concat_unique_oid() ignores the given OID if it >> is already on the list. It means all the child relations under the >> relation already walked on does not checked anywhere. (Of course, >> this assumption is correct for the purpose of find_all_inheritors() >> with minimum cost.) >> >> What we want to do here is to compute the number of times a certain >> child relation is inherited from a common origin; it shall be the >> expected-inhcount. So, we need an arrangement to the logic. >> >> For example, see the following diagram. >> >> T2 >> / \ >> T1 T4---T5 >> \ / >> T3 >> >> If we call find_all_inheritors() with T1. The find_inheritance_children() >> returns T2 and T3 for T1. >> Then, it calls find_inheritance_children() for T2, and it returns T4. >> Then, it calls find_inheritance_children() for T3, and it returns T4, but >> it is already in the "rels_list", so list_concat_unique_oid() ignores it. >> Then, it calls find_inheritance_children() for T4, and it returns T5. >> >> In this example, we want the expected inhcount for T2 and T3 should be 1, >> for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so >> they will have 1 incorrectly. >> Even if we count up the ignored OID (T4), find_all_inheritors() does not >> walk on T5, because it is already walked on obviously when T4 is ignored. > > I think the count for T5 should be 1, and I think that the count for > T4 can easily be made to be 2 by coding the algorithm correctly. Ahh, it is right. I was confused. Is it possible to introduce the logic mathematical-strictly? Now I'm considering whether the find_all_inheritors() logic is suitable for any cases, or not. What we want to compute here is: WITH RECURSIVE r AS ( SELECT 't1'::regclass AS inhrelid UNION ALL SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid= c.inhparent ) -- r is all the child relations inherited from 't1' SELECT inhrelid::regclass, count(*) FROMpg_inherits WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid; Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
(2010/01/29 9:58), KaiGai Kohei wrote: > (2010/01/29 9:29), Robert Haas wrote: >> 2010/1/28 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>> (2010/01/29 0:46), Robert Haas wrote: >>>> 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>>>> Hmm, indeed, this logic (V3/V5) is busted. >>>>> >>>>> The idea of V4 patch can also handle this case correctly, although it >>>>> is lesser in performance. >>>>> I wonder whether it is really unacceptable cost in performance, or not. >>>>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations, >>>>> and I don't think this bugfix will damage to the reputation of PostgreSQL. >>>>> >>>>> Where should we go on the next? >>>> >>>> Isn't the problem here just that the following comment is 100% wrong? >>>> >>>> /* >>>> * Unlike find_all_inheritors(), we need to walk on >>>> child relations >>>> * that have diamond inheritance tree, because this >>>> function has to >>>> * return correct expected inhecount to the caller. >>>> */ >>>> >>>> It seems to me that the right solution here is to just add one more >>>> argument to find_all_inheritors(), something like List >>>> **expected_inh_count. >>>> >>>> Am I missing something? >>> >>> The find_all_inheritors() does not walk on child relations more than >>> two times, even if a child has multiple parents inherited from common >>> origin, because list_concat_unique_oid() ignores the given OID if it >>> is already on the list. It means all the child relations under the >>> relation already walked on does not checked anywhere. (Of course, >>> this assumption is correct for the purpose of find_all_inheritors() >>> with minimum cost.) >>> >>> What we want to do here is to compute the number of times a certain >>> child relation is inherited from a common origin; it shall be the >>> expected-inhcount. So, we need an arrangement to the logic. >>> >>> For example, see the following diagram. >>> >>> T2 >>> / \ >>> T1 T4---T5 >>> \ / >>> T3 >>> >>> If we call find_all_inheritors() with T1. The find_inheritance_children() >>> returns T2 and T3 for T1. >>> Then, it calls find_inheritance_children() for T2, and it returns T4. >>> Then, it calls find_inheritance_children() for T3, and it returns T4, but >>> it is already in the "rels_list", so list_concat_unique_oid() ignores it. >>> Then, it calls find_inheritance_children() for T4, and it returns T5. >>> >>> In this example, we want the expected inhcount for T2 and T3 should be 1, >>> for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so >>> they will have 1 incorrectly. >>> Even if we count up the ignored OID (T4), find_all_inheritors() does not >>> walk on T5, because it is already walked on obviously when T4 is ignored. >> >> I think the count for T5 should be 1, and I think that the count for >> T4 can easily be made to be 2 by coding the algorithm correctly. > > Ahh, it is right. I was confused. > > Is it possible to introduce the logic mathematical-strictly? > Now I'm considering whether the find_all_inheritors() logic is suitable > for any cases, or not. I modified the logic to compute an expected inhcount of the child relations. What we want to compute here is to compute the number of times a certain relation being inherited directly from any relations delivered from a unique origin. If the column to be renamed is eventually inherited from a common origin, its attinhcount is not larger than the expected inhcount. > WITH RECURSIVE r AS ( > SELECT 't1'::regclass AS inhrelid > UNION ALL > SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid = c.inhparent > ) -- r is all the child relations inherited from 't1' > SELECT inhrelid::regclass, count(*) FROM pg_inherits > WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid; The modified logic increments the expected inhcount of the relation already walked on. If we found a child relation twice or more, it means the child relation is at the junction of the inheritance tree. In this case, we don't walk on the child relations any more, but it is not necessary, because the first path already checked it. The find_all_inheritors() is called from several points more than renameatt(), so I added find_all_inheritors_with_inhcount() which takes argument of the expected inhcount list. And, find_all_inheritors() was also modified to call find_all_inheritors_with_inhcount() with NULL argument to avoid code duplication. It is the result of Berrnd's test case. - CVS HEAD 0.895s 0.903s 0.884s 0.896s 0.892s - with V6 patch 0.972s 0.941s 0.961s 0.949s 0.946s Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
Вложения
2010/1/28 KaiGai Kohei <kaigai@ak.jp.nec.com>: > (2010/01/29 9:58), KaiGai Kohei wrote: >> (2010/01/29 9:29), Robert Haas wrote: >>> 2010/1/28 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>>> (2010/01/29 0:46), Robert Haas wrote: >>>>> 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>>>>> Hmm, indeed, this logic (V3/V5) is busted. >>>>>> >>>>>> The idea of V4 patch can also handle this case correctly, although it >>>>>> is lesser in performance. >>>>>> I wonder whether it is really unacceptable cost in performance, or not. >>>>>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations, >>>>>> and I don't think this bugfix will damage to the reputation of PostgreSQL. >>>>>> >>>>>> Where should we go on the next? >>>>> >>>>> Isn't the problem here just that the following comment is 100% wrong? >>>>> >>>>> /* >>>>> * Unlike find_all_inheritors(), we need to walk on >>>>> child relations >>>>> * that have diamond inheritance tree, because this >>>>> function has to >>>>> * return correct expected inhecount to the caller. >>>>> */ >>>>> >>>>> It seems to me that the right solution here is to just add one more >>>>> argument to find_all_inheritors(), something like List >>>>> **expected_inh_count. >>>>> >>>>> Am I missing something? >>>> >>>> The find_all_inheritors() does not walk on child relations more than >>>> two times, even if a child has multiple parents inherited from common >>>> origin, because list_concat_unique_oid() ignores the given OID if it >>>> is already on the list. It means all the child relations under the >>>> relation already walked on does not checked anywhere. (Of course, >>>> this assumption is correct for the purpose of find_all_inheritors() >>>> with minimum cost.) >>>> >>>> What we want to do here is to compute the number of times a certain >>>> child relation is inherited from a common origin; it shall be the >>>> expected-inhcount. So, we need an arrangement to the logic. >>>> >>>> For example, see the following diagram. >>>> >>>> T2 >>>> / \ >>>> T1 T4---T5 >>>> \ / >>>> T3 >>>> >>>> If we call find_all_inheritors() with T1. The find_inheritance_children() >>>> returns T2 and T3 for T1. >>>> Then, it calls find_inheritance_children() for T2, and it returns T4. >>>> Then, it calls find_inheritance_children() for T3, and it returns T4, but >>>> it is already in the "rels_list", so list_concat_unique_oid() ignores it. >>>> Then, it calls find_inheritance_children() for T4, and it returns T5. >>>> >>>> In this example, we want the expected inhcount for T2 and T3 should be 1, >>>> for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so >>>> they will have 1 incorrectly. >>>> Even if we count up the ignored OID (T4), find_all_inheritors() does not >>>> walk on T5, because it is already walked on obviously when T4 is ignored. >>> >>> I think the count for T5 should be 1, and I think that the count for >>> T4 can easily be made to be 2 by coding the algorithm correctly. >> >> Ahh, it is right. I was confused. >> >> Is it possible to introduce the logic mathematical-strictly? >> Now I'm considering whether the find_all_inheritors() logic is suitable >> for any cases, or not. > > I modified the logic to compute an expected inhcount of the child relations. > > What we want to compute here is to compute the number of times a certain > relation being inherited directly from any relations delivered from a unique > origin. If the column to be renamed is eventually inherited from a common > origin, its attinhcount is not larger than the expected inhcount. > >> WITH RECURSIVE r AS ( >> SELECT 't1'::regclass AS inhrelid >> UNION ALL >> SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid = c.inhparent >> ) -- r is all the child relations inherited from 't1' >> SELECT inhrelid::regclass, count(*) FROM pg_inherits >> WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid; > > The modified logic increments the expected inhcount of the relation already > walked on. If we found a child relation twice or more, it means the child > relation is at the junction of the inheritance tree. > In this case, we don't walk on the child relations any more, but it is not > necessary, because the first path already checked it. > > The find_all_inheritors() is called from several points more than renameatt(), > so I added find_all_inheritors_with_inhcount() which takes argument of the > expected inhcount list. And, find_all_inheritors() was also modified to call > find_all_inheritors_with_inhcount() with NULL argument to avoid code duplication. > > It is the result of Berrnd's test case. > > - CVS HEAD > 0.895s > 0.903s > 0.884s > 0.896s > 0.892s > > - with V6 patch > 0.972s > 0.941s > 0.961s > 0.949s > 0.946s Well, that seems a lot better. Unfortunately, I can't commit this code: it's mind-bogglingly ugly. I don't believe that duplicating find_all_inheritors is the right solution (a point I've mentioned previously), and it's certainly not right to use typeName->location as a place to store an unrelated attribute inheritance count. There is also at least one superfluous variable renaming; and the recursion handling looks pretty grotty to me, too. I wonder if we should just leave this alone for 9.0 and revisit the issue after doing some of the previously-proposed refactoring for 9.1.The amount of time we're spending trying to fix thisis somewhat out of proportion to the importance of the problem. ...Robert
(2010/01/30 3:36), Robert Haas wrote: > 2010/1/28 KaiGai Kohei<kaigai@ak.jp.nec.com>: >> (2010/01/29 9:58), KaiGai Kohei wrote: >>> (2010/01/29 9:29), Robert Haas wrote: >>>> 2010/1/28 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>>>> (2010/01/29 0:46), Robert Haas wrote: >>>>>> 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>>>>>> Hmm, indeed, this logic (V3/V5) is busted. >>>>>>> >>>>>>> The idea of V4 patch can also handle this case correctly, although it >>>>>>> is lesser in performance. >>>>>>> I wonder whether it is really unacceptable cost in performance, or not. >>>>>>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations, >>>>>>> and I don't think this bugfix will damage to the reputation of PostgreSQL. >>>>>>> >>>>>>> Where should we go on the next? >>>>>> >>>>>> Isn't the problem here just that the following comment is 100% wrong? >>>>>> >>>>>> /* >>>>>> * Unlike find_all_inheritors(), we need to walk on >>>>>> child relations >>>>>> * that have diamond inheritance tree, because this >>>>>> function has to >>>>>> * return correct expected inhecount to the caller. >>>>>> */ >>>>>> >>>>>> It seems to me that the right solution here is to just add one more >>>>>> argument to find_all_inheritors(), something like List >>>>>> **expected_inh_count. >>>>>> >>>>>> Am I missing something? >>>>> >>>>> The find_all_inheritors() does not walk on child relations more than >>>>> two times, even if a child has multiple parents inherited from common >>>>> origin, because list_concat_unique_oid() ignores the given OID if it >>>>> is already on the list. It means all the child relations under the >>>>> relation already walked on does not checked anywhere. (Of course, >>>>> this assumption is correct for the purpose of find_all_inheritors() >>>>> with minimum cost.) >>>>> >>>>> What we want to do here is to compute the number of times a certain >>>>> child relation is inherited from a common origin; it shall be the >>>>> expected-inhcount. So, we need an arrangement to the logic. >>>>> >>>>> For example, see the following diagram. >>>>> >>>>> T2 >>>>> / \ >>>>> T1 T4---T5 >>>>> \ / >>>>> T3 >>>>> >>>>> If we call find_all_inheritors() with T1. The find_inheritance_children() >>>>> returns T2 and T3 for T1. >>>>> Then, it calls find_inheritance_children() for T2, and it returns T4. >>>>> Then, it calls find_inheritance_children() for T3, and it returns T4, but >>>>> it is already in the "rels_list", so list_concat_unique_oid() ignores it. >>>>> Then, it calls find_inheritance_children() for T4, and it returns T5. >>>>> >>>>> In this example, we want the expected inhcount for T2 and T3 should be 1, >>>>> for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so >>>>> they will have 1 incorrectly. >>>>> Even if we count up the ignored OID (T4), find_all_inheritors() does not >>>>> walk on T5, because it is already walked on obviously when T4 is ignored. >>>> >>>> I think the count for T5 should be 1, and I think that the count for >>>> T4 can easily be made to be 2 by coding the algorithm correctly. >>> >>> Ahh, it is right. I was confused. >>> >>> Is it possible to introduce the logic mathematical-strictly? >>> Now I'm considering whether the find_all_inheritors() logic is suitable >>> for any cases, or not. >> >> I modified the logic to compute an expected inhcount of the child relations. >> >> What we want to compute here is to compute the number of times a certain >> relation being inherited directly from any relations delivered from a unique >> origin. If the column to be renamed is eventually inherited from a common >> origin, its attinhcount is not larger than the expected inhcount. >> >>> WITH RECURSIVE r AS ( >>> SELECT 't1'::regclass AS inhrelid >>> UNION ALL >>> SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid = c.inhparent >>> ) -- r is all the child relations inherited from 't1' >>> SELECT inhrelid::regclass, count(*) FROM pg_inherits >>> WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid; >> >> The modified logic increments the expected inhcount of the relation already >> walked on. If we found a child relation twice or more, it means the child >> relation is at the junction of the inheritance tree. >> In this case, we don't walk on the child relations any more, but it is not >> necessary, because the first path already checked it. >> >> The find_all_inheritors() is called from several points more than renameatt(), >> so I added find_all_inheritors_with_inhcount() which takes argument of the >> expected inhcount list. And, find_all_inheritors() was also modified to call >> find_all_inheritors_with_inhcount() with NULL argument to avoid code duplication. >> >> It is the result of Berrnd's test case. >> >> - CVS HEAD >> 0.895s >> 0.903s >> 0.884s >> 0.896s >> 0.892s >> >> - with V6 patch >> 0.972s >> 0.941s >> 0.961s >> 0.949s >> 0.946s > > Well, that seems a lot better. Unfortunately, I can't commit this > code: it's mind-bogglingly ugly. I don't believe that duplicating > find_all_inheritors is the right solution (a point I've mentioned > previously), and it's certainly not right to use typeName->location as > a place to store an unrelated attribute inheritance count. There is > also at least one superfluous variable renaming; and the recursion > handling looks pretty grotty to me, too. > > I wonder if we should just leave this alone for 9.0 and revisit the > issue after doing some of the previously-proposed refactoring for 9.1. > The amount of time we're spending trying to fix this is somewhat out > of proportion to the importance of the problem. At least, I think we can fix the bug on renameatt() case in clean way, although we need an additional recursion handling in ATPrepAlterColumnType() to fix ALTER COLUMN TYPE cases. Should it focus on only the original renameatt() matter in this stage? Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
(2010/02/01 8:41), KaiGai Kohei wrote: > (2010/01/30 3:36), Robert Haas wrote: >> 2010/1/28 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>> (2010/01/29 9:58), KaiGai Kohei wrote: >>>> (2010/01/29 9:29), Robert Haas wrote: >>>>> 2010/1/28 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>>>>> (2010/01/29 0:46), Robert Haas wrote: >>>>>>> 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>>>>>>> Hmm, indeed, this logic (V3/V5) is busted. >>>>>>>> >>>>>>>> The idea of V4 patch can also handle this case correctly, although it >>>>>>>> is lesser in performance. >>>>>>>> I wonder whether it is really unacceptable cost in performance, or not. >>>>>>>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations, >>>>>>>> and I don't think this bugfix will damage to the reputation of PostgreSQL. >>>>>>>> >>>>>>>> Where should we go on the next? >>>>>>> >>>>>>> Isn't the problem here just that the following comment is 100% wrong? >>>>>>> >>>>>>> /* >>>>>>> * Unlike find_all_inheritors(), we need to walk on >>>>>>> child relations >>>>>>> * that have diamond inheritance tree, because this >>>>>>> function has to >>>>>>> * return correct expected inhecount to the caller. >>>>>>> */ >>>>>>> >>>>>>> It seems to me that the right solution here is to just add one more >>>>>>> argument to find_all_inheritors(), something like List >>>>>>> **expected_inh_count. >>>>>>> >>>>>>> Am I missing something? >>>>>> >>>>>> The find_all_inheritors() does not walk on child relations more than >>>>>> two times, even if a child has multiple parents inherited from common >>>>>> origin, because list_concat_unique_oid() ignores the given OID if it >>>>>> is already on the list. It means all the child relations under the >>>>>> relation already walked on does not checked anywhere. (Of course, >>>>>> this assumption is correct for the purpose of find_all_inheritors() >>>>>> with minimum cost.) >>>>>> >>>>>> What we want to do here is to compute the number of times a certain >>>>>> child relation is inherited from a common origin; it shall be the >>>>>> expected-inhcount. So, we need an arrangement to the logic. >>>>>> >>>>>> For example, see the following diagram. >>>>>> >>>>>> T2 >>>>>> / \ >>>>>> T1 T4---T5 >>>>>> \ / >>>>>> T3 >>>>>> >>>>>> If we call find_all_inheritors() with T1. The find_inheritance_children() >>>>>> returns T2 and T3 for T1. >>>>>> Then, it calls find_inheritance_children() for T2, and it returns T4. >>>>>> Then, it calls find_inheritance_children() for T3, and it returns T4, but >>>>>> it is already in the "rels_list", so list_concat_unique_oid() ignores it. >>>>>> Then, it calls find_inheritance_children() for T4, and it returns T5. >>>>>> >>>>>> In this example, we want the expected inhcount for T2 and T3 should be 1, >>>>>> for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so >>>>>> they will have 1 incorrectly. >>>>>> Even if we count up the ignored OID (T4), find_all_inheritors() does not >>>>>> walk on T5, because it is already walked on obviously when T4 is ignored. >>>>> >>>>> I think the count for T5 should be 1, and I think that the count for >>>>> T4 can easily be made to be 2 by coding the algorithm correctly. >>>> >>>> Ahh, it is right. I was confused. >>>> >>>> Is it possible to introduce the logic mathematical-strictly? >>>> Now I'm considering whether the find_all_inheritors() logic is suitable >>>> for any cases, or not. >>> >>> I modified the logic to compute an expected inhcount of the child relations. >>> >>> What we want to compute here is to compute the number of times a certain >>> relation being inherited directly from any relations delivered from a unique >>> origin. If the column to be renamed is eventually inherited from a common >>> origin, its attinhcount is not larger than the expected inhcount. >>> >>>> WITH RECURSIVE r AS ( >>>> SELECT 't1'::regclass AS inhrelid >>>> UNION ALL >>>> SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid = c.inhparent >>>> ) -- r is all the child relations inherited from 't1' >>>> SELECT inhrelid::regclass, count(*) FROM pg_inherits >>>> WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid; >>> >>> The modified logic increments the expected inhcount of the relation already >>> walked on. If we found a child relation twice or more, it means the child >>> relation is at the junction of the inheritance tree. >>> In this case, we don't walk on the child relations any more, but it is not >>> necessary, because the first path already checked it. >>> >>> The find_all_inheritors() is called from several points more than renameatt(), >>> so I added find_all_inheritors_with_inhcount() which takes argument of the >>> expected inhcount list. And, find_all_inheritors() was also modified to call >>> find_all_inheritors_with_inhcount() with NULL argument to avoid code duplication. >>> >>> It is the result of Berrnd's test case. >>> >>> - CVS HEAD >>> 0.895s >>> 0.903s >>> 0.884s >>> 0.896s >>> 0.892s >>> >>> - with V6 patch >>> 0.972s >>> 0.941s >>> 0.961s >>> 0.949s >>> 0.946s >> >> Well, that seems a lot better. Unfortunately, I can't commit this >> code: it's mind-bogglingly ugly. I don't believe that duplicating >> find_all_inheritors is the right solution (a point I've mentioned >> previously), and it's certainly not right to use typeName->location as >> a place to store an unrelated attribute inheritance count. There is >> also at least one superfluous variable renaming; and the recursion >> handling looks pretty grotty to me, too. >> >> I wonder if we should just leave this alone for 9.0 and revisit the >> issue after doing some of the previously-proposed refactoring for 9.1. >> The amount of time we're spending trying to fix this is somewhat out >> of proportion to the importance of the problem. > > At least, I think we can fix the bug on renameatt() case in clean way, > although we need an additional recursion handling in ATPrepAlterColumnType() > to fix ALTER COLUMN TYPE cases. > > Should it focus on only the original renameatt() matter in this stage? The attached patch modified find_all_inheritors() to return the list of expected inhcount, if List * pointer is given. And, it focuses on only the bugs in renameatt() case. Also, the ALTER COLUMN TYPE case should be also fixed in the 9.1 release (or 9.0.1?). Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
Вложения
2010/1/31 KaiGai Kohei <kaigai@ak.jp.nec.com>: > The attached patch modified find_all_inheritors() to return the list of > expected inhcount, if List * pointer is given. And, it focuses on only > the bugs in renameatt() case. I have cleaned up and simplified this patch. Attached is the version I intend to commit. Changes: - make find_all_inheritors return the list of OIDs, as previously, instead of using an out parameter - undo some useless variable renamings - undo some useless whitespace changes - rework comments - fix regression tests to avoid using the same alias twice in the same query - add an ORDER BY clause to the regression tests so that they pass on my machine - improve the names of some of the new variables - instead of adding an additional argument to renameatt(), just replace the existing 'bool recursing' with 'int expected_parents'. This allows merging the two versions of the "cannot rename inherited column" message together, which seems like a reasonably good idea to me, though if someone has a better idea that's fine. I didn't think the one additional word added to the message provided enough clarity to make it worth creating another translatable string. > Also, the ALTER COLUMN TYPE case should be also fixed in the 9.1 release > (or 9.0.1?). If the fix is something we could commit for 9.0.1, then we ought to do it now before 9.0 is released. If you want to submit a follow-on patch to address ALTER COLUMN TYPE once this is committed, then please do so. ...Robert
Вложения
Robert Haas <robertmhaas@gmail.com> writes: > I have cleaned up and simplified this patch. Attached is the version > I intend to commit. Changes: Minor suggestions: I think the names like "rel_parents" would read better as "rel_numparents" etc. As-is, the reader could be forgiven for expecting that this will be a list of parent relation OIDs or some such. The new loop added within find_all_inheritors could really do with an addition to the comments, along the line of "If a child is already seen, increment the corresponding numparents count". I don't trust the proposed "order by attrelid" business in the regression test --- once in a blue moon, that will fail because the OID counter wrapped around mid-test, and we'll get an unreproducible bug report. I'd suggest order by attrelid::regclass::text. Looks sane otherwise. regards, tom lane
On Mon, Feb 1, 2010 at 1:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I have cleaned up and simplified this patch. Attached is the version >> I intend to commit. Changes: > > Minor suggestions: > > I think the names like "rel_parents" would read better as > "rel_numparents" etc. As-is, the reader could be forgiven for expecting > that this will be a list of parent relation OIDs or some such. I thought about that but ended up being lazy and leaving it the way I had it. I'll go be un-lazy. > The new loop added within find_all_inheritors could really do with an > addition to the comments, along the line of "If a child is already > seen, increment the corresponding numparents count". OK. > I don't trust the proposed "order by attrelid" business in the > regression test --- once in a blue moon, that will fail because the > OID counter wrapped around mid-test, and we'll get an unreproducible > bug report. I'd suggest order by attrelid::regclass::text. Wow, didn't think of that. Will change. > Looks sane otherwise. Thanks for the review. ...Robert
On Mon, Feb 1, 2010 at 1:40 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> Looks sane otherwise. > > Thanks for the review. Oh, one other thing. Should we backpatch this, or just apply to HEAD? ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > Oh, one other thing. Should we backpatch this, or just apply to HEAD? Just HEAD imo. Without any complaints from the field, I don't think it's worth taking any risks for. It's not like multiple inheritance is heavily used ... regards, tom lane
On Mon, Feb 1, 2010 at 2:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Oh, one other thing. Should we backpatch this, or just apply to HEAD? > > Just HEAD imo. Without any complaints from the field, I don't think > it's worth taking any risks for. It's not like multiple inheritance > is heavily used ... OK, done. ...Robert
(2010/02/02 3:01), Robert Haas wrote: > 2010/1/31 KaiGai Kohei<kaigai@ak.jp.nec.com>: >> The attached patch modified find_all_inheritors() to return the list of >> expected inhcount, if List * pointer is given. And, it focuses on only >> the bugs in renameatt() case. > > I have cleaned up and simplified this patch. Attached is the version > I intend to commit. Changes: > > - make find_all_inheritors return the list of OIDs, as previously, > instead of using an out parameter > - undo some useless variable renamings > - undo some useless whitespace changes > - rework comments > - fix regression tests to avoid using the same alias twice in the same query > - add an ORDER BY clause to the regression tests so that they pass on my machine > - improve the names of some of the new variables > - instead of adding an additional argument to renameatt(), just > replace the existing 'bool recursing' with 'int expected_parents'. > This allows merging the two versions of the "cannot rename inherited > column" message together, which seems like a reasonably good idea to > me, though if someone has a better idea that's fine. I didn't think > the one additional word added to the message provided enough clarity > to make it worth creating another translatable string. Thanks for your checks. >> Also, the ALTER COLUMN TYPE case should be also fixed in the 9.1 release >> (or 9.0.1?). > > If the fix is something we could commit for 9.0.1, then we ought to do > it now before 9.0 is released. If you want to submit a follow-on > patch to address ALTER COLUMN TYPE once this is committed, then please > do so. The attached patch also fixes ALTER COLUMN TYPE case. It replaced the 'recursing' argument in ATPrepCmd() by 'expected_parents', and it is delivered to ATPrepAlterColumnType(). The logic to forbid altering the column type is same as renameatt(). ATSimpleRecursion() is also modified to use forboth() to call ATPrepCmd() recursively. One concern is at ATOneLevelRecursion() which is called by ATPrepAddColumn() only, and it also calls ATPrepCmd() for the direct children. Right now, I set 1 on the 'expected_parents'. However, IMO, here is no reason why we cannot rewrite the ATPrepAddColumn() using ATSimpleRecursion(). Eventually, ATExecAddColumn() shall be invoked several times for same column, if the inheritance tree has diamond-inheritance structure. And, it increments pg_attribute.attinhcount except for the first invocation. If we store the 'expected_parents' on the ColumnDef->inhcount, we don't need to call the ATExecAddColumn() more than once in a single ALTER TABLE command. Any comments? And, when should we do it? 9.0? 9.1? Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
Вложения
(2010/02/02 9:48), KaiGai Kohei wrote: >>> Also, the ALTER COLUMN TYPE case should be also fixed in the 9.1 release >>> (or 9.0.1?). >> >> If the fix is something we could commit for 9.0.1, then we ought to do >> it now before 9.0 is released. If you want to submit a follow-on >> patch to address ALTER COLUMN TYPE once this is committed, then please >> do so. > > The attached patch also fixes ALTER COLUMN TYPE case. > > It replaced the 'recursing' argument in ATPrepCmd() by 'expected_parents', > and it is delivered to ATPrepAlterColumnType(). The logic to forbid altering > the column type is same as renameatt(). > ATSimpleRecursion() is also modified to use forboth() to call ATPrepCmd() > recursively. > > One concern is at ATOneLevelRecursion() which is called by ATPrepAddColumn() > only, and it also calls ATPrepCmd() for the direct children. > Right now, I set 1 on the 'expected_parents'. However, IMO, here is no reason > why we cannot rewrite the ATPrepAddColumn() using ATSimpleRecursion(). > > Eventually, ATExecAddColumn() shall be invoked several times for same column, > if the inheritance tree has diamond-inheritance structure. And, it increments > pg_attribute.attinhcount except for the first invocation. > If we store the 'expected_parents' on the ColumnDef->inhcount, we don't need > to call the ATExecAddColumn() more than once in a single ALTER TABLE command. > > Any comments? And, when should we do it? 9.0? 9.1? The attached one also clean up ATPrepAddColumn() and ATExecAddColumn() code, not only ATPrepAlterColumnType(), according to what I mentioned above. There are two regression test fails, because it does not call ATExecAddColumn() twice or more in diamond-inheritance cases, so it does not notice merging definitions of columns. If we should go on right now, I'll add and fix regression tests, and submit a formal patch again. If not, I'll work it later. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
Вложения
KaiGai Kohei <kaigai@ak.jp.nec.com> writes: > The attached one also clean up ATPrepAddColumn() and ATExecAddColumn() code, > not only ATPrepAlterColumnType(), according to what I mentioned above. What exactly do you claim is wrong with the ADD COLUMN case? regards, tom lane
(2010/02/02 11:09), Tom Lane wrote: > KaiGai Kohei<kaigai@ak.jp.nec.com> writes: >> The attached one also clean up ATPrepAddColumn() and ATExecAddColumn() code, >> not only ATPrepAlterColumnType(), according to what I mentioned above. > > What exactly do you claim is wrong with the ADD COLUMN case? ADD COLUMN case works correctly, but it takes unnecessary loops, because the find_all_inheritors() didn't provide the value to be set on the new pg_attribute.attinhcount. I'm saying it can be rewritten in more graceful manner using the new expected_parents argument. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
2010/2/1 KaiGai Kohei <kaigai@ak.jp.nec.com>: > (2010/02/02 11:09), Tom Lane wrote: >> KaiGai Kohei<kaigai@ak.jp.nec.com> writes: >>> The attached one also clean up ATPrepAddColumn() and ATExecAddColumn() code, >>> not only ATPrepAlterColumnType(), according to what I mentioned above. >> >> What exactly do you claim is wrong with the ADD COLUMN case? > > ADD COLUMN case works correctly, but it takes unnecessary loops, > because the find_all_inheritors() didn't provide the value to be > set on the new pg_attribute.attinhcount. > > I'm saying it can be rewritten in more graceful manner using the > new expected_parents argument. The subject line of this thread is getting less and less appropriate to the content thereof. I am not in favor of doing anything for 9.0 that is not a bug fix. ...Robert
KaiGai Kohei <kaigai@ak.jp.nec.com> writes: > (2010/02/02 11:09), Tom Lane wrote: >> KaiGai Kohei<kaigai@ak.jp.nec.com> writes: >>> The attached one also clean up ATPrepAddColumn() and ATExecAddColumn() code, >>> not only ATPrepAlterColumnType(), according to what I mentioned above. >> >> What exactly do you claim is wrong with the ADD COLUMN case? > ADD COLUMN case works correctly, but it takes unnecessary loops, > because the find_all_inheritors() didn't provide the value to be > set on the new pg_attribute.attinhcount. > I'm saying it can be rewritten in more graceful manner using the > new expected_parents argument. I tend to think that if it ain't broke don't fix it; the odds of actually breaking it are too high. I don't really find the new coding more graceful, anyway ... regards, tom lane
(2010/02/02 11:31), Robert Haas wrote: > 2010/2/1 KaiGai Kohei<kaigai@ak.jp.nec.com>: >> (2010/02/02 11:09), Tom Lane wrote: >>> KaiGai Kohei<kaigai@ak.jp.nec.com> writes: >>>> The attached one also clean up ATPrepAddColumn() and ATExecAddColumn() code, >>>> not only ATPrepAlterColumnType(), according to what I mentioned above. >>> >>> What exactly do you claim is wrong with the ADD COLUMN case? >> >> ADD COLUMN case works correctly, but it takes unnecessary loops, >> because the find_all_inheritors() didn't provide the value to be >> set on the new pg_attribute.attinhcount. >> >> I'm saying it can be rewritten in more graceful manner using the >> new expected_parents argument. > > The subject line of this thread is getting less and less appropriate > to the content thereof. > > I am not in favor of doing anything for 9.0 that is not a bug fix. Are you talking about ATPrepAddColumn() only? Or both of ATPrepAddColumn() and ATPrepAlterColumnType()? My motivation to clean up ATPrepAddColumn() is less than the bugfix. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
2010/2/1 KaiGai Kohei <kaigai@ak.jp.nec.com>: > (2010/02/02 11:31), Robert Haas wrote: >> 2010/2/1 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>> (2010/02/02 11:09), Tom Lane wrote: >>>> KaiGai Kohei<kaigai@ak.jp.nec.com> writes: >>>>> The attached one also clean up ATPrepAddColumn() and ATExecAddColumn() code, >>>>> not only ATPrepAlterColumnType(), according to what I mentioned above. >>>> >>>> What exactly do you claim is wrong with the ADD COLUMN case? >>> >>> ADD COLUMN case works correctly, but it takes unnecessary loops, >>> because the find_all_inheritors() didn't provide the value to be >>> set on the new pg_attribute.attinhcount. >>> >>> I'm saying it can be rewritten in more graceful manner using the >>> new expected_parents argument. >> >> The subject line of this thread is getting less and less appropriate >> to the content thereof. >> >> I am not in favor of doing anything for 9.0 that is not a bug fix. > > Are you talking about ATPrepAddColumn() only? Or both of ATPrepAddColumn() > and ATPrepAlterColumnType()? > > My motivation to clean up ATPrepAddColumn() is less than the bugfix. I'm making a general statement - if something is BROKEN (like the rename case we just dealt with), we should look at fixing it. If it's just something that could be cleaned up or done more nicely, we should leave it alone for now. ...Robert
(2010/02/02 11:44), Robert Haas wrote: > 2010/2/1 KaiGai Kohei<kaigai@ak.jp.nec.com>: >> (2010/02/02 11:31), Robert Haas wrote: >>> 2010/2/1 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>>> (2010/02/02 11:09), Tom Lane wrote: >>>>> KaiGai Kohei<kaigai@ak.jp.nec.com> writes: >>>>>> The attached one also clean up ATPrepAddColumn() and ATExecAddColumn() code, >>>>>> not only ATPrepAlterColumnType(), according to what I mentioned above. >>>>> >>>>> What exactly do you claim is wrong with the ADD COLUMN case? >>>> >>>> ADD COLUMN case works correctly, but it takes unnecessary loops, >>>> because the find_all_inheritors() didn't provide the value to be >>>> set on the new pg_attribute.attinhcount. >>>> >>>> I'm saying it can be rewritten in more graceful manner using the >>>> new expected_parents argument. >>> >>> The subject line of this thread is getting less and less appropriate >>> to the content thereof. >>> >>> I am not in favor of doing anything for 9.0 that is not a bug fix. >> >> Are you talking about ATPrepAddColumn() only? Or both of ATPrepAddColumn() >> and ATPrepAlterColumnType()? >> >> My motivation to clean up ATPrepAddColumn() is less than the bugfix. > > I'm making a general statement - if something is BROKEN (like the > rename case we just dealt with), we should look at fixing it. If it's > just something that could be cleaned up or done more nicely, we should > leave it alone for now. OK, Please forget the second patch. The former patch (pgsql-fix-inherit-attype.1.patch) just fixes the matter in ALTER COLUMN TYPE case. Do you think it is a reasonable change for the 9.0 release? Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>
2010/2/1 KaiGai Kohei <kaigai@ak.jp.nec.com>: >> I'm making a general statement - if something is BROKEN (like the >> rename case we just dealt with), we should look at fixing it. If it's >> just something that could be cleaned up or done more nicely, we should >> leave it alone for now. > > OK, Please forget the second patch. > > The former patch (pgsql-fix-inherit-attype.1.patch) just fixes the matter > in ALTER COLUMN TYPE case. Do you think it is a reasonable change for > the 9.0 release? After reviewing this, I see that it doesn't really make sense to fix ALTER COLUMN TYPE without rewriting ADD COLUMN to use ATSimpleRecursion(). If we're going to go to the trouble of refactoring the ATPrepCmd() interface, we certainly want to get as much benefit out of that as we can. That having been said... I'm leery about undertaking a substantial refactoring of this code at this point in the release cycle. We have less than two weeks remaining before the end of the final CommitFest, so it doesn't seem like a good time to be starting new projects, especially because we still have 16 patches that need we need to grind through in less than 2 weeks, and I want to give some of those my attention, too. So I would like to push this out to 9.1 and revisit it when I can give it the amount of time that I believe is needed to do it right. ...Robert
(2010/02/02 23:50), Robert Haas wrote: > 2010/2/1 KaiGai Kohei<kaigai@ak.jp.nec.com>: >>> I'm making a general statement - if something is BROKEN (like the >>> rename case we just dealt with), we should look at fixing it. If it's >>> just something that could be cleaned up or done more nicely, we should >>> leave it alone for now. >> >> OK, Please forget the second patch. >> >> The former patch (pgsql-fix-inherit-attype.1.patch) just fixes the matter >> in ALTER COLUMN TYPE case. Do you think it is a reasonable change for >> the 9.0 release? > > After reviewing this, I see that it doesn't really make sense to fix > ALTER COLUMN TYPE without rewriting ADD COLUMN to use > ATSimpleRecursion(). If we're going to go to the trouble of > refactoring the ATPrepCmd() interface, we certainly want to get as > much benefit out of that as we can. > > That having been said... I'm leery about undertaking a substantial > refactoring of this code at this point in the release cycle. We have > less than two weeks remaining before the end of the final CommitFest, > so it doesn't seem like a good time to be starting new projects, > especially because we still have 16 patches that need we need to grind > through in less than 2 weeks, and I want to give some of those my > attention, too. So I would like to push this out to 9.1 and revisit > it when I can give it the amount of time that I believe is needed to > do it right. OK, I'll work on remained part of this fix in the v9.1. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai@ak.jp.nec.com>