Re: Thousands of tables versus on table?

Поиск

Список

Период

Сортировка

От	Thomas Andrews
Тема	Re: Thousands of tables versus on table?
Дата	4 июня 2007 г. 19:03:00
Msg-id	46645DD9.7030805@soliantconsulting.com обсуждение исходный текст
Ответ на	Re: Thousands of tables versus on table? (Mark Lewis <mark.lewis@mir3.com>)
Ответы	Re: Thousands of tables versus on table? (Gregory Stark <stark@enterprisedb.com>)
Список	pgsql-performance

Дерево обсуждения

We're running 7.4 but will be upgrading to 8.2.

The responses table has 20,000,000 records.

Sometimes (but not all the time) an insert into the responses table can
take 5-6 seconds.

I guess my real question is, does it ever make sense to create thousands
of tables like this?

=thomas

Mark Lewis wrote:
> On Mon, 2007-06-04 at 13:40 -0400, Thomas Andrews wrote:
>> I have several thousand clients.  Our clients do surveys, and each survey
>> has two tables for the client data,
>>
>>    responders
>>    responses
>>
>> Frequent inserts into both table.
>>
>> Right now, we are seeing significant time during inserts to these two
>> tables.
>
> Can you provide some concrete numbers here?  Perhaps an EXPLAIN ANALYZE
> for the insert, sizes of tables, stuff like that?
>
>> Some of the indices in tableA and tableB do not index on the client ID
>> first.
>
> What reason do you have to think that this matters?
>
>> So, we are considering two possible solutions.
>>
>>  (1) Create separate responders and responses tables for each client.
>>
>>  (2) Make sure all indices on responders and responses start with the
>>   client id (excepting, possibly, the primary keys for these fields) and
>>   have all normal operation queries always include an id_client.
>>
>> Right now, for example, given a responder and a survey question, we do a
>> query in responses by the id_responder and id_survey.  This gives us a
>> unique record, but I'm wondering if maintaining the index on
>> (id_responder,id_survey) is more costly on inserts than maintaining the
>> index (id_client,id_responder,id_survey) given that we also have other
>> indices on (id_client,...).
>>
>> Option (1) makes me very nervous.  I don't like the idea of the same sorts
>> of data being stored in lots of different tables, in part for long-term
>> maintenance reasons.  We don't really need cross-client reporting, however.
>
> What version of PG is this?  What is your vacuuming strategy?  Have you
> tried a REINDEX to see if that helps?
>
> -- Mark Lewis
>

Вложения

tandrews.vcf

В списке pgsql-performance по дате отправления:

Предыдущее

От: Markus Schiltknecht
Дата: 04 июня 2007 г., 19:02:24
Сообщение: Re: dbt2 NOTPM numbers

Следующее

От: Gregory Stark
Дата: 04 июня 2007 г., 19:44:35
Сообщение: Re: Thousands of tables versus on table?

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Thousands of tables versus on table?

Вложения

Предыдущее

Следующее