Re: How to generate unique hash-type id?
От | Craig Ringer |
---|---|
Тема | Re: How to generate unique hash-type id? |
Дата | |
Msg-id | 4B62E17B.50804@postnewspapers.com.au обсуждение исходный текст |
Ответ на | How to generate unique hash-type id? (Joe Kramer <cckramer@gmail.com>) |
Список | pgsql-general |
On 29/01/2010 4:20 PM, Joe Kramer wrote: > Hello, > > I need to generate unique id which is not guessable unlike > serial(integer) type. I need an id in format like md5 hash of random > number. > On top of that I need this id to be unique across multiple tables. > > Anyone had to solve this problem before? Can you post any recipes or > best practices please? > > My questions: > > 1. Avoiding collisions. > If I make an UNIQUE constraint and do generation of id triggered on > INSERT. What if collision happens? DO I nee d to check if unique hash > already exists and if not- regenerate. > This looks too primitive. Is there a readily available function or > methodology to do that? > > 2. Generating global unique id across multiple tables. > How to do that? My only idea is to have separate table to keep all > hashes and compare for collision against that table. > Is there a better way? Maybe by creating some special serial type that > is not integer but varchar? > > 3. what function to use to generate 64-bit random hash without much > overhead to CPU? When I ran into something somewhat akin to this I asked the list about a non-repeating pseudo-random mapping function. Daniel Verite (on this list) enlightened me about Feistel networks/cyphers, and even posted a PL/PgSQL implementation! It's documented here: http://wiki.postgresql.org/wiki/Pseudo_encrypt and has been extremely handy. It should fit you needs - just define a sequence that you pull new input values from, and use that same sequence across all tables that need unique values. It's not trivial to assert, across multiple tables, that a value is unique. You could do it with a trigger that checks each table that uses such values (slow-ish but effective) or maintains a side table of values in use. In either case it'd have to be added to every table that used the pseudo_encrypt values. By the way, if you intend to expose these to users you might also want some kind of data entry error checking so that a typo can't accidentally transform id `n' to id `m' with transposition of a single digit or the like. The luhn algorithm provides a good way to do that with a simple check digit - it's not cryptographically strong in that if you know the values are checked with the luhn algorithm it's trivial to re-generate the check digit, but it helps a log against casual scanning of the number space and against accidental user data entry error. Check out: http://wiki.postgresql.org/wiki/Luhn_algorithm -- Craig Ringer
В списке pgsql-general по дате отправления: