Обсуждение: Fixing domain input
We've seen a couple of bug reports now about how domain constraints aren't checked during input of a parameter that's been deduced to be of a domain type, eg http://archives.postgresql.org/pgsql-interfaces/2005-07/msg00009.php http://archives.postgresql.org/pgsql-bugs/2005-07/msg00084.php There's also the long-standing bugaboo that plpgsql doesn't enforce domain constraints. In the first of these threads, I suggested hacking the parameter type resolution rules so that parameters wouldn't be assigned inferred types that are domains, but only their base types. However, that only fixes things when the parameter type is inferred --- if it's specified as a domain by the client, we'd still see the problem. And it does nothing for plpgsql. It occurs to me that a cleaner solution would be to stop giving domain types the same typinput routines as their base types. Instead, give them all a specialized routine domain_in (comparable to array_in) that first invokes the base type's input function and then applies any relevant constraint checks. Likewise for typreceive (but we'd not need to touch the output functions). This has a number of attractions: * Solves both cases of the domain-parameter problem. * Since plpgsql does all type coercions by calling output and input functions, I believe this would automatically fix the bugs in plpgsql. * Allows us to eliminate special cases for domains in parse_coerce.c, copy.c, possibly other places. The main disadvantage of it is that for domains that have CHECK constraints, it's necessary to set up an ExprContext in which the check expressions can be evaluated; and in turn that requires an ExecutorState, plus ExecInitExpr, etc. So there's a pretty fair amount of setup overhead involved, and doing that repeatedly in a series of calls is not attractive from a performance standpoint. (This may be why we didn't do it that way originally, though I don't recall any more whether it was even considered.) We could eliminate this overhead in the case of COPY by adding an API kluge that lets domain_in() detect whether it's being called inside COPY IN, and let it piggyback on COPY's EState, so that the setup overhead is still only paid once per COPY command. In other scenarios such as plpgsql I'm not sure we can afford to try to amortize the setup across multiple calls --- plpgsql is pretty cavalier about the context it calls things in, and I think we'd see huge memory leaks if we didn't free the EState before returning from domain_in(). Still, a slow feature is better than silently failing to apply the constraint, which is where we are now. Thoughts? regards, tom lane
I wrote: > It occurs to me that a cleaner solution would be to stop giving domain > types the same typinput routines as their base types. Instead, give > them all a specialized routine domain_in (comparable to array_in) that > first invokes the base type's input function and then applies any > relevant constraint checks. I did most of the work of coding this up, only to watch the idea crash and burn: datatype input routines aren't called at all for NULL values, so there's no way to enforce a NOT NULL domain constraint from the input routine. Currently trying to think of decent alternatives ... regards, tom lane
Last summer, I wrote: [ http://archives.postgresql.org/pgsql-hackers/2005-07/msg00320.php ] >> It occurs to me that a cleaner solution would be to stop giving domain >> types the same typinput routines as their base types. Instead, give >> them all a specialized routine domain_in (comparable to array_in) that >> first invokes the base type's input function and then applies any >> relevant constraint checks. > I did most of the work of coding this up, only to watch the idea > crash and burn: datatype input routines aren't called at all for > NULL values, so there's no way to enforce a NOT NULL domain constraint > from the input routine. The obvious solution to this, of course, is to allow datatype input routines to be called for NULLs. We could check the functions' strictness flag before doing so, so that the old behavior still prevails for any input function marked strict, which is most of 'em. When I first thought of this, a couple days ago, my immediate worry was that it'd open a security hole: any C-language input function that wasn't marked strict would crash the backend if passed a null input string. However, any such function is *already* a security hole, because it's been possible to call it manually for some time:select int4in(null::cstring); So there's no additional risk --- in fact, arguably having such a function crash during normal input of NULL would be a Good Thing, because it would make it far more likely that the mistake would get noticed and fixed before someone used it as a form of DOS attack. So that seems to leave only the objection that we'd still have to change all the places that call input functions. But it's not like we've not done that before (several times, even :-(). And making a change to support non-strict input functions is still way nicer than introducing explicit knowledge of domains in all these places. Comments? regards, tom lane
On Apr 4, 2006, at 10:39 , Tom Lane wrote: > So there's no additional risk --- in fact, arguably having such a > function crash during normal input of NULL would be a Good Thing, > because it would make it far more likely that the mistake would get > noticed and fixed before someone used it as a form of DOS attack. Granted, finding an error earlier than later is a Good Thing, but an Even Better Thing would be to prevent crashing the backend, and afaics (as far as that is) the change you propose doesn't hurt anything. Do you see any way to do that? I'm glad to see work being done on domains. I'm definitely learning from the discussion. Michael Glaesemann grzm myrealbox com
> I'm glad to see work being done on domains. I'm definitely learning from > the discussion. I wonder if we should implement 'GRANT USAGE ON DOMAINS' for spec compliance sometime... Chris
Michael Glaesemann <grzm@myrealbox.com> writes: > Granted, finding an error earlier than later is a Good Thing, but an > Even Better Thing would be to prevent crashing the backend, and > afaics (as far as that is) the change you propose doesn't hurt > anything. Do you see any way to do that? For starters we'd have to forbid any user-written C functions. Somehow that doesn't seem like a net win. regards, tom lane
On Apr 5, 2006, at 11:46 , Tom Lane wrote: > For starters we'd have to forbid any user-written C functions. > Somehow that doesn't seem like a net win. Ouch. Michael Glaesemann grzm myrealbox com