Re: What exactly is our CRC algorithm?
От | Gavin Flower |
---|---|
Тема | Re: What exactly is our CRC algorithm? |
Дата | |
Msg-id | 5435B97F.3010700@archidevsys.co.nz обсуждение исходный текст |
Ответ на | Re: What exactly is our CRC algorithm? (Andres Freund <andres@2ndquadrant.com>) |
Ответы |
Re: What exactly is our CRC algorithm?
|
Список | pgsql-hackers |
On 09/10/14 10:13, Andres Freund wrote: > On 2014-10-08 22:13:46 +0300, Heikki Linnakangas wrote: >> Hmm. So the simple, non-table driven, calculation gives the same result as >> using the lookup table using the reflected lookup code. That's expected; the >> lookup method is supposed to be the same, just faster. However, using the >> "normal" lookup code, but with a "reflected" lookup table, produces the same >> result as Postgres' algorithm. Indeed, that's what we do in PostgreSQL. But >> AFAICS, that's an incorrect combination. You're supposed to the >> non-reflected lookup table with the non-reflected lookup code; you can't mix >> and match. >> >> As far as I can tell, PostgreSQL's so-called CRC algorithm doesn't >> correspond to any bit-by-bit CRC variant and polynomial. My math skills are >> not strong enough to determine what the consequences of that are. It might >> still be a decent checksum. Or not. I couldn't tell if the good error >> detection properties of the normal CRC-32 polynomial apply to our algorithm >> or not. > Additional interesting datapoints are that hstore and ltree contain the > same tables - but properly use the reflected computation. > >> Thoughts? > It clearly seems like a bad idea to continue with this - I don't think > anybody here knows which guarantees this gives us. > > The question is how can we move away from this. There's unfortunately > two places that embed PGC32 that are likely to prove problematic when > fixing the algorithm: pg_trgm and tsgist both seem to include crc's in > their logic in a persistent way. I think we should provide > INIT/COMP/FIN_PG32 using the current algorithm for these. > > If we're switching to a saner computation, we should imo also switch to > a better polynom - CRC-32C has better error detection capabilities than > CRC32 and is available in hardware. As we're paying the price pf > breaking compat anyway... > > Arguably we could also say that given that there's been little evident > problems with the borked computation we could also switch to a much > faster hash instead of continuing to use crc... > > Greetings, > > Andres Freund > Could a 64 bit variant of some kind be useful as an option - in addition to a correct 32 bit? As most people have 64 bit processors and storage is less constrained now-a-days, as well as we tend to store much larger chunks of data. Cheers, Gavin
В списке pgsql-hackers по дате отправления: