Re: record identical operator
From | Robert Haas |
---|---|
Subject | Re: record identical operator |
Date | |
Msg-id | CA+TgmoascbWNupoS9izktO8fQSpKe3G0n=afAV1Fg1Qf1p0RDQ@mail.gmail.com |
In response to | Re: record identical operator (Stephen Frost <sfrost@snowman.net>) |
Responses | Re: record identical operator (Stephen Frost <sfrost@snowman.net>) |
List | pgsql-hackers |
On Tue, Sep 24, 2013 at 3:22 PM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> Now I admit that's an arguable point. We could certainly define an
>> intermediate notion of equality that is more equal than whatever =
>> does, but not as equal as exact binary equality.
>
> I suggested it up-thread and don't recall seeing a response, so here it
> is again- passing the data through the binary-out function for the type
> and comparing *that* would allow us to change the internal binary
> representation of data types and would be something which we could at
> least explain to users as to why X isn't the same as Y according to this
> binary operator.

Sorry, I missed that suggestion.

I'm wary of inventing a completely new way of doing this. I don't think that there's any guarantee that the send/recv functions won't expose exactly the same implementation details as a direct check for binary equality. For example, array_send() seems to directly reveal the presence or absence of a NULL bitmap. Even if there were no such anomalies today, it feels fragile to rely on a fairly-unrelated concept to have exactly the semantics we want here, and it will surely be much slower.

Binary equality has existing precedent and is used in numerous places in the code for good reason. Users might be confused about the use of those semantics in those places also, but AFAICT nobody is.

I think the best thing I can say about this proposal is that it would clearly be better than what we're doing now, which is just plain wrong. I don't think it's the best proposal, however.

>> It is obviously true, and unavoidable, that if the query can produce
>> more than one result set depending on the query plan or other factors,
>> then the materialized view contents can match only one of those
>> possible result sets. But you are arguing that it should be allowed
>> to produce some OTHER result set, that a direct execution of the query
>> could *never* have produced, and I can't see how that can be right.
>
> I agree that the matview shouldn't produce things which *can't* exist
> through an execution of the query. I don't intend to argue against that
> but I realize that's a fallout of not accepting the patch to implement
> the binary comparison operators. My complaint is more generally that if
> this approach needs such then there's going to be problems down the
> road. No, I can't predict exactly what they are and perhaps I'm all wet
> here, but this kind of binary-equality operations are something I've
> tried to avoid since discovering what happens when you try to compare
> a couple of floating point numbers.

So, I get that. But what I think is that the problem that's coming up here is almost the flip side of that. If you are working with types that are a little bit imprecise, such as floats or citext or box, you want to use comparison strategies that have a little bit of tolerance for error. In the case of box, this is actually built into the comparison operator. In the case of floats, it's not; you as the application programmer have to deal with the fact that comparisons are imprecise - like by avoiding equality comparisons.

On the other hand, if you are *replicating* those data types, then you don't want that tolerance. If you're checking whether two boxes are equal, you may indeed want the small amount of fuzziness that our comparison operators allow. But if you're copying a box or a float from one table to another, or from one database to another, you want the values copied exactly, including all of those low-order bits that tend to foul up your comparisons. That's why float8out() normally doesn't display any extra_float_digits - because you as the user shouldn't be relying on them - but pg_dump does back them up because not doing so would allow errors to propagate.

Similarly here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
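
As an illustration of the distinction drawn in the message above (comparing values versus replicating them exactly), here is a minimal Python sketch. It is not PostgreSQL code. The helper names `binary_identical` and `render` are hypothetical, and the digit counts are assumptions: 15 significant digits stands in for a default float display, and 17 for a dump-style output that keeps the low-order bits.

```python
import struct


def binary_identical(a: float, b: float) -> bool:
    """Compare two floats by their raw IEEE 754 double bytes,
    not by the == operator (hypothetical helper for illustration)."""
    return struct.pack(">d", a) == struct.pack(">d", b)


def render(x: float, digits: int) -> str:
    """Render x with a fixed number of significant digits.
    Assumption: 15 mimics a default display, 17 a lossless dump."""
    return format(x, f".{digits}g")


# Operator equality and binary equality can disagree:
# 0.0 and -0.0 compare equal, but their bit patterns differ.
zero_equal = (0.0 == -0.0)
zero_identical = binary_identical(0.0, -0.0)

# A value whose low-order bits matter.
x = 0.1 + 0.2

# A 15-digit rendering hides the low-order bits, so a copy made
# through it is a different value from the original.
display_copy = float(render(x, 15))

# A 17-digit rendering round-trips any double exactly.
dump_copy = float(render(x, 17))

print(zero_equal, zero_identical)        # equal by ==, not bit-identical
print(render(x, 15), render(x, 17))
print(display_copy == x, dump_copy == x)
```

The copy made through the shorter rendering no longer matches the original, which is the kind of error propagation the message says pg_dump avoids by emitting the extra digits.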