Re: UUID performance as primary key

Поиск
Список
Период
Сортировка
От Craig Ringer
Тема Re: UUID performance as primary key
Дата
Msg-id 4CB91515.6020201@postnewspapers.com.au
обсуждение исходный текст
Ответ на UUID performance as primary key  (Navkirat Singh <navkirats@gmail.com>)
Ответы Re: UUID performance as primary key  (Andy <angelflow@yahoo.com>)
Re: UUID performance as primary key  (Merlin Moncure <mmoncure@gmail.com>)
Список pgsql-performance
On 16/10/2010 9:58 AM, Navkirat Singh wrote:
> Hi Guys,
>
> I am interested in finding out the pros/cons of using UUID as a primary key field. My requirement states that UUID
wouldbe perfect in my case as I will be having many small databases which will link up to a global database using the
UUID.Hence, the need for a unique key across all databases. It would be extremely helpful if someone could help me
figurethis out, as it is critical for my project. 

Pro: No need for (serverid,serverseq) pair primary keys or hacks with
modulus based key generation. Doesn't set any pre-determined limit on
how many servers/databases may be in a cluster.

Con: Slower than modulo key generation approach, uses more storage.
Foreign key relationships may be slower too.

Overall, UUIDs seem to be a favoured approach. The other way people seem
to do this is by assigning a unique instance id to each server/database
out of a maximum "n" instances decided at setup time. Every key
generation sequence increments by "n" whenever it generates a key, with
an offset of the server/database id. That way, if n=100, server 1 will
generate primary keys 001, 101, 201, 301, ..., server 2 will generate
primary keys 002, 102, 202, 302, ... and so on.

That works great until you need more than 100 instances, at which point
you're really, REALLY boned. In really busy systems it also limits the
total amount of primary key space - but with BIGINT primary keys, that's
unlikely to be something you need to worry about.

The composite primary key (serverid,sequenceid) approach avoids the need
for a pre-defined maximum number of servers, but can be slow to index
and can require more storage, especially because of tuple headers.

I have no firsthand experience with any of these approaches so I can't
offer you a considered opinion. I know that the MS-SQL crowd at least
strongly prefer UUIDs, but they have very strong in-database UUID
support. MySQL folks seem to mostly favour the modulo primary key
generation approach. I don't see much discussion of the issue here - I
get the impression Pg doesn't see heavy use in sharded environments.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Stored procedure declared as VOLATILE => no good optimization is done
Следующее
От: Greg Smith
Дата:
Сообщение: Re: Slow count(*) again...