Обсуждение: Problems with ENUM type manipulation in 9.1

Поиск
Список
Период
Сортировка

Problems with ENUM type manipulation in 9.1

От
Дата:
Hello,

I've encountered some problems with the updated ENUM in PosgreSQL 9.1:

1. We can use ALTER TYPE to add enum values, but there is no matching comma=
nd to remove values, which makes this an incomplete solution.

2. "ALTER TYPE ... ADD cannot be executed from a function or multi-command =
string" (or from a transaction block), which makes it quite useless for our=
 purposes.  We update our databases using SQL patches.  Patches are applied=
 in a single transaction, so that any failure during execution causes the e=
ntire patch to be rolled back. This command cannot be made part of such a p=
atch. Even if that wasn't an issue, we would still have a problem, because =
the command cannot be used in a DO block. Why would we want to do that? In =
order to check first what values are already in the ENUM, lest we attempt t=
o add an existing value.

3. In earlier PostgreSQL versions we used custom procedures (based on proce=
dures developed by Dmitry Koterov http://en.dklab.ru/lib/dklab_postgresql_e=
num/) to add and delete ENUM values. These procedures manipulate pg_enum ta=
ble directly. I've updated them to take into account the new column in pg_e=
num that was added in 9.1. However, although adding enums this way seems to=
 work (new values appear in the pg_enum table), attempting to use these new=
 enums results in errors, such as this:  "enum value 41983 not found in cac=
he for enum [...]". Is it possible to reset this cache after altering the p=
g_enum table?

Thanks,
Dmitry


Dmitry Epstein | Developer
=20
Allied Testing
T + 7 495 544 48 69 Ext 417

www.alliedtesting.com
We Deliver Quality.

Re: Problems with ENUM type manipulation in 9.1

От
"Kevin Grittner"
Дата:
<depstein@alliedtesting.com> wrote:

> I've encountered some problems with the updated ENUM in PosgreSQL
> 9.1:

No matter how I tilt my head, I can't see any of those issues as
bugs.  You have two feature requests and a question about how to
work around problems you're having with direct modifications to the
system tables.  These probably all belong on the pgsql-general list.
The exception would be that if you are thinking about implementing
the requested features and contributing them to community
PostgreSQL, you could discuss that on the -hackers list.

-Kevin

Re: Problems with ENUM type manipulation in 9.1

От
Merlin Moncure
Дата:
On Tue, Sep 27, 2011 at 5:06 AM,  <depstein@alliedtesting.com> wrote:
> Hello,
>
> I've encountered some problems with the updated ENUM in PosgreSQL 9.1:
>
> 1. We can use ALTER TYPE to add enum values, but there is no matching com=
mand to remove values, which makes this an incomplete solution.

you can manually delete from pg_enum.  this is dangerous; if you
delete an enum value that is in use anywhere, behavior is undefined.

> 2. "ALTER TYPE ... ADD cannot be executed from a function or multi-comman=
d string" (or from a transaction block), which makes it quite useless for o=
ur purposes. =A0We update our databases using SQL patches. =A0Patches are a=
pplied in a single transaction, so that any failure during execution causes=
 the entire patch to be rolled back. This command cannot be made part of su=
ch a patch. Even if that wasn't an issue, we would still have a problem, be=
cause the command cannot be used in a DO block. Why would we want to do tha=
t? In order to check first what values are already in the ENUM, lest we att=
empt to add an existing value.

sql patches work fine.  sql script !=3D multi command string.  The
difference is that you are trying to send several commands in a single
round trip (PQexec) vs sending one query at a time which is the way
you are supposed to do it (and this works perfectly fine with
transactions).  ALTER/ADD not working in-function is a minor annoying
inconvience I'll admit.

> 3. In earlier PostgreSQL versions we used custom procedures (based on pro=
cedures developed by Dmitry Koterov http://en.dklab.ru/lib/dklab_postgresql=
_enum/) to add and delete ENUM values. These procedures manipulate pg_enum =
table directly. I've updated them to take into account the new column in pg=
_enum that was added in 9.1. However, although adding enums this way seems =
to work (new values appear in the pg_enum table), attempting to use these n=
ew enums results in errors, such as this: =A0"enum value 41983 not found in=
 cache for enum [...]". Is it possible to reset this cache after altering t=
he pg_enum table?

restarting the session should do it -- as I said, manipulating pg_enum
is dangerous.  agree with Kevin -- these are not bugs.

merlin

Re: Problems with ENUM type manipulation in 9.1

От
Дата:
> -----Original Message-----
> From: Merlin Moncure [mailto:mmoncure@gmail.com]
> Sent: Tuesday, September 27, 2011 10:31 PM
> > 1. We can use ALTER TYPE to add enum values, but there is no matching
> command to remove values, which makes this an incomplete solution.
>=20
> you can manually delete from pg_enum.  this is dangerous; if you delete an
> enum value that is in use anywhere, behavior is undefined.

True: Postgres doesn't do any checks when deleting enum values, which contr=
asts with the general practice of disallowing the removal of objects that a=
re still referenced elsewhere in the database.  That seems like a bug to me=
.  Anyway, the procedure that we used (based on http://en.dklab.ru/lib/dkla=
b_postgresql_enum/) does the necessary checks before removing enum values.

>=20
> > 2. "ALTER TYPE ... ADD cannot be executed from a function or multi-
> command string" (or from a transaction block), which makes it quite usele=
ss
> for our purposes. =A0We update our databases using SQL patches. =A0Patche=
s are
> applied in a single transaction, so that any failure during execution cau=
ses the
> entire patch to be rolled back. This command cannot be made part of such a
> patch. Even if that wasn't an issue, we would still have a problem, becau=
se
> the command cannot be used in a DO block. Why would we want to do that?
> In order to check first what values are already in the ENUM, lest we atte=
mpt
> to add an existing value.
>=20
> sql patches work fine.  sql script !=3D multi command string.  The differ=
ence is
> that you are trying to send several commands in a single round trip (PQex=
ec)
> vs sending one query at a time which is the way you are supposed to do it
> (and this works perfectly fine with transactions).  ALTER/ADD not working=
 in-
> function is a minor annoying inconvience I'll admit.

ALTER TYPE ... ADD VALUE does not work inside transaction blocks, period, w=
hether they are executed as a multi-command string or one query at a time. =
Try it:

begin;
create type test_enum as enum ('ONE', 'TWO');
alter type test_enum add value 'THREE';
drop type test_enum;=20
commit;

Whether you send the above one query at a time or as a script in psql, it w=
on't work.

What you call a "minor inconvenience" makes enum management effectively bro=
ken, at least in an industrial environment.

>=20
> > 3. In earlier PostgreSQL versions we used custom procedures (based on
> procedures developed by Dmitry Koterov
> http://en.dklab.ru/lib/dklab_postgresql_enum/) to add and delete ENUM
> values. These procedures manipulate pg_enum table directly. I've updated
> them to take into account the new column in pg_enum that was added in
> 9.1. However, although adding enums this way seems to work (new values
> appear in the pg_enum table), attempting to use these new enums results in
> errors, such as this: =A0"enum value 41983 not found in cache for enum [.=
..]". Is
> it possible to reset this cache after altering the pg_enum table?
>=20
> restarting the session should do it -- as I said, manipulating pg_enum is
> dangerous.  agree with Kevin -- these are not bugs.

It's weird. Sometimes it works when executing commands in separate transact=
ions. And sometimes the same commands don't work even after restarting Post=
gresql. Completely unpredictable.

The reason I regard these issues as bugs is because the new version broke s=
ome functionality that worked in the previous version. But if this goes und=
er feature requests, I'll move the discussion over to general.


Dmitry Epstein | Developer
=20
Allied Testing

www.alliedtesting.com
We Deliver Quality.

Re: Problems with ENUM type manipulation in 9.1

От
Merlin Moncure
Дата:
On Wed, Sep 28, 2011 at 5:21 AM,  <depstein@alliedtesting.com> wrote:
>> -----Original Message-----
>> From: Merlin Moncure [mailto:mmoncure@gmail.com]
>> Sent: Tuesday, September 27, 2011 10:31 PM
>> > 1. We can use ALTER TYPE to add enum values, but there is no matching
>> command to remove values, which makes this an incomplete solution.
>>
>> you can manually delete from pg_enum. =A0this is dangerous; if you delet=
e an
>> enum value that is in use anywhere, behavior is undefined.
>
> True: Postgres doesn't do any checks when deleting enum values, which con=
trasts with the general practice of disallowing the removal of objects that=
 are still referenced elsewhere in the database. =A0That seems like a bug t=
o me. =A0Anyway, the procedure that we used (based on http://en.dklab.ru/li=
b/dklab_postgresql_enum/) does the necessary checks before removing enum va=
lues.
>
>>
>> > 2. "ALTER TYPE ... ADD cannot be executed from a function or multi-
>> command string" (or from a transaction block), which makes it quite usel=
ess
>> for our purposes. =A0We update our databases using SQL patches. =A0Patch=
es are
>> applied in a single transaction, so that any failure during execution ca=
uses the
>> entire patch to be rolled back. This command cannot be made part of such=
 a
>> patch. Even if that wasn't an issue, we would still have a problem, beca=
use
>> the command cannot be used in a DO block. Why would we want to do that?
>> In order to check first what values are already in the ENUM, lest we att=
empt
>> to add an existing value.
>>
>> sql patches work fine. =A0sql script !=3D multi command string. =A0The d=
ifference is
>> that you are trying to send several commands in a single round trip (PQe=
xec)
>> vs sending one query at a time which is the way you are supposed to do it
>> (and this works perfectly fine with transactions). =A0ALTER/ADD not work=
ing in-
>> function is a minor annoying inconvience I'll admit.
>
> ALTER TYPE ... ADD VALUE does not work inside transaction blocks, period,=
 whether they are executed as a multi-command string or one query at a time=
. Try it:
>
> begin;
> create type test_enum as enum ('ONE', 'TWO');
> alter type test_enum add value 'THREE';
> drop type test_enum;
> commit;
>
> Whether you send the above one query at a time or as a script in psql, it=
 won't work.
>
> What you call a "minor inconvenience" makes enum management effectively b=
roken, at least in an industrial environment.

hm, I have to unfortunately agree  -- what a PITB.  this is however not a b=
ug.

merlin

Re: Problems with ENUM type manipulation in 9.1

От
Alvaro Herrera
Дата:
Excerpts from depstein's message of mié sep 28 07:21:17 -0300 2011:
> > -----Original Message-----
> > From: Merlin Moncure [mailto:mmoncure@gmail.com]
> > Sent: Tuesday, September 27, 2011 10:31 PM
> > > 1. We can use ALTER TYPE to add enum values, but there is no matching
> > command to remove values, which makes this an incomplete solution.
> >
> > you can manually delete from pg_enum.  this is dangerous; if you delete an
> > enum value that is in use anywhere, behavior is undefined.
>
> True: Postgres doesn't do any checks when deleting enum values, which contrasts with the general practice of
disallowingthe removal of objects that are still referenced elsewhere in the database.  That seems like a bug to me. 

We don't support deleting of enum values, precisely because there's no
easy way to determine if they are in use somewhere.  So there's no
reason to think that we should do any checks when "deleting enum
values".  Keep in mind that manually fiddling with the system catalogs
is not supported; if you break stuff by doing it, you get to keep both
pieces.

> Anyway, the procedure that we used (based on
> http://en.dklab.ru/lib/dklab_postgresql_enum/) does the necessary
> checks before removing enum values.

Good.  But keep in mind this is not a supported procedure.

> ALTER TYPE ... ADD VALUE does not work inside transaction blocks, period, whether they are executed as a
multi-commandstring or one query at a time. Try it: 
>
> begin;
> create type test_enum as enum ('ONE', 'TWO');
> alter type test_enum add value 'THREE';
> drop type test_enum;
> commit;
>
> Whether you send the above one query at a time or as a script in psql, it won't work.
>
> What you call a "minor inconvenience" makes enum management effectively broken, at least in an industrial
environment.

The reason it is not allowed is because it breaks stuff (I cannot
remember what).  Inconvenient, yes.  "Broken", perhaps.  But it's
working as designed.  If you're interested, you could examine the old
threads that led to this behavior and see if it can be improved.  But
just removing the check won't do.

> > restarting the session should do it -- as I said, manipulating pg_enum is
> > dangerous.  agree with Kevin -- these are not bugs.
>
> It's weird. Sometimes it works when executing commands in separate transactions. And sometimes the same commands
don'twork even after restarting Postgresql. Completely unpredictable. 
>
> The reason I regard these issues as bugs is because the new version broke some functionality that worked in the
previousversion. But if this goes under feature requests, I'll move the discussion over to general. 

Well, it's perfectly predictable if you constrain yourself to supported
operations, which updating catalogs by hand is not.   And given that it
wasn't supported when this function was written, for 8.3, we have no
responsibility for ensuring that it still works in later versions.

Note that this email contains no opinion of mine.  I am only stating
PostgreSQL Global Development Group policy.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Problems with ENUM type manipulation in 9.1

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from depstein's message of mié sep 28 07:21:17 -0300 2011:
>> ALTER TYPE ... ADD VALUE does not work inside transaction blocks, period, whether they are executed as a
multi-commandstring or one query at a time. Try it: 

> The reason it is not allowed is because it breaks stuff (I cannot
> remember what).  Inconvenient, yes.  "Broken", perhaps.  But it's
> working as designed.  If you're interested, you could examine the old
> threads that led to this behavior and see if it can be improved.  But
> just removing the check won't do.

The comment beside the code says what it breaks:

        case T_AlterEnumStmt:    /* ALTER TYPE (enum) */

            /*
             * We disallow this in transaction blocks, because we can't cope
             * with enum OID values getting into indexes and then having their
             * defining pg_enum entries go away.
             */
            PreventTransactionChain(isTopLevel, "ALTER TYPE ... ADD");
            AlterEnum((AlterEnumStmt *) parsetree);
            break;

As Merlin says, this is not a bug.  It's a design compromise that we
made after quite some careful consideration, and we're unlikely to
reconsider it unless someone thinks of an actually better solution.
You might care to review the "WIP: extensible enums" thread in
pgsql-hackers during October 2010 to see the issues and alternatives
that were considered.

BTW, I imagine that the reason that manually adding rows to pg_enum no
longer works with any reliability at all is that the manual procedure
isn't cognizant of the new rules about even vs odd OIDs in pg_enum.
Not that it really worked before --- once the OID counter wrapped
around, you'd be pretty well screwed.  As Alvaro says, manual
alterations of the system catalogs never have been supported, meaning
that we will never offer a guarantee that something that (more or less)
worked in a previous release will still work in newer ones.

            regards, tom lane

Re: Problems with ENUM type manipulation in 9.1

От
Josh Kupershmidt
Дата:
On Wed, Sep 28, 2011 at 10:51 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from depstein's message of mi=E9 sep 28 07:21:17 -0300 2011:
>> Anyway, the procedure that we used (based on
>> http://en.dklab.ru/lib/dklab_postgresql_enum/) does the necessary
>> checks before removing enum values.

Not exactly; that code is rife with race conditions. For instance, how
does the "Check data references" loop ensure that previously-checked
tables don't get a new row containing the forbidden enum_elem before
the function is finished?

Josh

Re: Problems with ENUM type manipulation in 9.1

От
Merlin Moncure
Дата:
On Wed, Sep 28, 2011 at 10:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Excerpts from depstein's message of mi=E9 sep 28 07:21:17 -0300 2011:
>>> ALTER TYPE ... ADD VALUE does not work inside transaction blocks, perio=
d, whether they are executed as a multi-command string or one query at a ti=
me. Try it:
>
>> The reason it is not allowed is because it breaks stuff (I cannot
>> remember what). =A0Inconvenient, yes. =A0"Broken", perhaps. =A0But it's
>> working as designed. =A0If you're interested, you could examine the old
>> threads that led to this behavior and see if it can be improved. =A0But
>> just removing the check won't do.
>
> The comment beside the code says what it breaks:
>
> =A0 =A0 =A0 =A0case T_AlterEnumStmt: =A0 =A0/* ALTER TYPE (enum) */
>
> =A0 =A0 =A0 =A0 =A0 =A0/*
> =A0 =A0 =A0 =A0 =A0 =A0 * We disallow this in transaction blocks, because=
 we can't cope
> =A0 =A0 =A0 =A0 =A0 =A0 * with enum OID values getting into indexes and t=
hen having their
> =A0 =A0 =A0 =A0 =A0 =A0 * defining pg_enum entries go away.
> =A0 =A0 =A0 =A0 =A0 =A0 */
> =A0 =A0 =A0 =A0 =A0 =A0PreventTransactionChain(isTopLevel, "ALTER TYPE ..=
. ADD");
> =A0 =A0 =A0 =A0 =A0 =A0AlterEnum((AlterEnumStmt *) parsetree);
> =A0 =A0 =A0 =A0 =A0 =A0break;
>
> As Merlin says, this is not a bug. =A0It's a design compromise that we
> made after quite some careful consideration, and we're unlikely to
> reconsider it unless someone thinks of an actually better solution.
> You might care to review the "WIP: extensible enums" thread in
> pgsql-hackers during October 2010 to see the issues and alternatives
> that were considered.
>
> BTW, I imagine that the reason that manually adding rows to pg_enum no
> longer works with any reliability at all is that the manual procedure
> isn't cognizant of the new rules about even vs odd OIDs in pg_enum.
> Not that it really worked before --- once the OID counter wrapped
> around, you'd be pretty well screwed. =A0As Alvaro says, manual
> alterations of the system catalogs never have been supported, meaning
> that we will never offer a guarantee that something that (more or less)
> worked in a previous release will still work in newer ones.

Yeah -- also it's good to point out even/odd issue with pg_enum.  just
about everyone hacked pg_enum previously, and it's good to spread the
word this no longer works :-(.  That said, the new enum enhancements
(oid wraparound issue aside) ISTM I can't help but see as a somewhat
of a regression, since previously you could (hackily) work on them
in-transaction, and now you basically can't. No use in crying now, but
in the future I think any DDL that doesn't support in-transaction use
should be regarded with a great deal of skepticism.

merlin

Re: Problems with ENUM type manipulation in 9.1

От
Tom Lane
Дата:
Josh Kupershmidt <schmiddy@gmail.com> writes:
>> Excerpts from depstein's message of mié sep 28 07:21:17 -0300 2011:
>>> Anyway, the procedure that we used (based on
>>> http://en.dklab.ru/lib/dklab_postgresql_enum/) does the necessary
>>> checks before removing enum values.

> Not exactly; that code is rife with race conditions. For instance, how
> does the "Check data references" loop ensure that previously-checked
> tables don't get a new row containing the forbidden enum_elem before
> the function is finished?

It's worse than that: even if you have in fact deleted all occurrences
of a specific enum OID from the tables, that OID might still be lurking
in a btree index on an enum column.  If you delete the pg_enum entry,
and the OID is odd (meaning that the pg_enum entry must be consulted to
find out how to sort it), you just broke that index.

You might think you could get out of that by VACUUM'ing to ensure that
dead index entries get cleaned out, but that is not good enough.  The
problem OID could have gotten copied into a btree page boundary value or
non-leaf-page entry.  If that happens, the OID will most likely never
disappear from the index, short of a REINDEX; and this is also the worst
case for index corruption, since we must be able to compare other OID
values to the non-leaf-page entry to figure out which leaf page to
descend to in searches.

In short, the reason why this type of code hasn't been adopted into core
is that it doesn't work.

            regards, tom lane

Re: Problems with ENUM type manipulation in 9.1

От
Дата:
Thanks everyone for the explanations.  I posted a feature request for impro=
ved enum manipulation in psql-general.

Dmitry Epstein | Developer
=20
Allied Testing

www.alliedtesting.com
We Deliver Quality.