Re: many tables in db

Поиск

Список

Период

Сортировка

От	Tom Lane
Тема	Re: many tables in db
Дата	19 июля 2001 г. 14:36:11
Msg-id	24822.995567160@sss.pgh.pa.us обсуждение исходный текст
Ответ на	many tables in db (Kenneth Been <kennethb@telocity.com>)
Список	pgsql-general

Дерево обсуждения

Kenneth Been <kennethb@telocity.com> writes:
> I am considering a db schema with many tables (in the
> thousands).

Just out of curiosity, why would you need so many tables?  Couldn't you
represent the same info in fewer tables, maybe with more key columns?
If every table has a different set of columns, that wouldn't work, but
I'm wondering what the organizing principle really is.

> 2. One of the reasons I am considering breaking the data
> into multiple tables is that I want to cluster a large table
> (on an rtree index)

This isn't really relevant to your main point, but: since an rtree
doesn't have an associated sort order, it's not clear to me that this
operation makes any sense.  Have you determined that you'll actually get
any performance improvement as a result of the clustering?  I suspect
you may find that you're just rearranging the table into a different
random order.

> 3. Another reason for splitting up the data is that I am
> worried about the size of the rtree.  I have some suspicion
> that the rtree size grows more than linearly with the size
> of the table.  Is that the case?

Dunno.  The rtree code isn't very well maintained, because none of
the current crop of developers uses it or knows much about it.
It could well have some bugs that cause it to waste space.  (I realized
only a few days ago that it probably behaves completely horribly when
asked to store NULLs, for example.  Do you have any NULLs in the column
you're indexing?)

> 4. Another possible reason is query speed.  Any guesses on
> which would be faster, two or three queries on tables of
> about 40K records, or one query on a table of about 10M
> records?  (All queries are geometric range searches on an
> rtree index.)

Given the size difference, I'd suppose that the 10M table would actually
be the equivalent of a materialized join of the smaller tables?  I'd
guess that recomputing the join on the fly is better because it takes
so much less I/O, but that's just a guess.  You'd be well advised to
test both ways and see.

            regards, tom lane

В списке pgsql-general по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: many tables in db