Re: Estimating space required for indexes

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Estimating space required for indexes
Дата
Msg-id 18084.1051539021@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Estimating space required for indexes  (Shridhar Daithankar <shridhar_daithankar@persistent.co.in>)
Ответы Re: Estimating space required for indexes  (Shridhar Daithankar <shridhar_daithankar@persistent.co.in>)
Re: Estimating space required for indexes  (Oleg Bartunov <oleg@sai.msu.su>)
Список pgsql-general
Shridhar Daithankar <shridhar_daithankar@persistent.co.in> writes:
> Yesterday I was trying to upload a medium size terrain data dump to a
> postgresql database. (350 files, of sizes between 8-13MB each.. 3.3GB all in
> all). The load was done in 45 minutes.(7.3.2 is real fast for data loading is
> my feeling, anyway..). It took 4GB of disk space after upload.

> A tuple consists of 3 float values, x,y,z. I had to create a compound index on
> x and y. I started indexing it and killed it 1.5 hours later as it filled
> rest of the 5GB free disk upto point of choking.

AFAIK, a CREATE INDEX should require no more than twice the finished
index's size on disk.  I'm surprised that you were able to build the
index one way and not the other.

> How can I predict reasonably how much disk space I am going to need for such
> kind of indexing operation?

Assuming your "float"s were float4s, the heap tuple size is

    28 bytes overhead + 3 * 4 bytes data = 40 bytes/row

(assuming WITHOUT OIDS, no nulls, Intel-like alignment rules) while the
index tuple size is

    12 bytes overhead + 2 * 4 bytes data = 20 bytes/row

But this is not the whole story because heap pages are normally crammed
full while btree index pages are normally only filled 2/3rds full during
initial creation.  (Plus you have to allow for upper b-tree levels, but
with such small index entries that won't be much.)  So I'd have expected
the finished index to be about 3/4ths the size of the table proper.
I'm surprised you could fit it at all.

> This data is just a small sample of things and
> more data is coming.

Better buy more disk ...

            regards, tom lane


В списке pgsql-general по дате отправления:

Предыдущее
От: Jeremiah Jahn
Дата:
Сообщение: Re: > 16TB worth of data question
Следующее
От: David Link
Дата:
Сообщение: Re: pq_recvbuf: unexpected EOF