Re: [HACKERS] Brain-Dead Sort Algorithm??
From: Tom Lane
Subject: Re: [HACKERS] Brain-Dead Sort Algorithm??
Date:
Msg-id: 1451.944233399@sss.pgh.pa.us
In reply to: Brain-Dead Sort Algorithm?? ("Tim Perdue" <archiver@db.geocrawler.com>)
List: pgsql-hackers
"Tim Perdue" <archiver@db.geocrawler.com> writes: > serial_half is a 1-column list of 10-digit > numbers. I'm doing a select distinct because I > believe there may be duplicates in that column. > The misunderstanding on my end came because > serial_half was a 60MB text file, but when it was > inserted into postgres, it became 345MB (6.8 > million rows has a lot of bloat apparently). The overhead per tuple is forty-something bytes, IIRC. So when the only useful data in a tuple is an int, the expansion factor is unpleasantly large. Little to be done about it though. All the overhead fields appear to be necessary if you want proper transaction semantics. > So the temp-sort space for 345MB could easily surpass the 1GB I had on > my hard disk. Yes, the merge algorithm used up through 6.5.* seems to have typical space usage of about 4X the actual data volume. I'm trying to reduce this to just 1X for 7.0, although some folks are complaining that the result is slower than before :-(. > Actually what was stated is that it is retarded to fill up a hard disk > and then hang instead of bowing out gracefully, Yup, that was a bug --- failure to check for write errors on the sort temp files. I believe it's fixed in current sources too. regards, tom lane