Re: Performance on inserts
From | Jules Bean |
---|---|
Subject | Re: Performance on inserts |
Date |  |
Msg-id | 20000826175736.A3611@pear.presence.net.uk |
In reply to | Re: Performance on inserts (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Sat, Aug 26, 2000 at 12:32:20PM -0400, Tom Lane wrote:
> Jules Bean <jules@jellybean.co.uk> writes:
> > Is there any chance you could generate a patch against released 7.0.2
> > to add just this functionality... It would be the kiss of life for my
> > code!
>
> Will look at it.  Are you brave enough to want to try the rest of the
> 7.1 rewrite of the btree code, or do you just want this one hack?

Hmm. I don't know :-) because I don't really know the scope of the
changes... I do need this database to be stable, but OTOH, changes to
the index code are unlikely to corrupt my data.

> I don't think anyone was envisioning "automagic" drop of most common
> values.  The partial-index support that's partially there ;-) is
> designed around manual specification of a predicate, ie, you'd say
>
>     CREATE INDEX mypartialindex ON table (column)
>     WHERE column != 42 AND column != 1066

Fair enough.

> > For the short term, if I can get a working version of the above
> > randomisation patch, I think I shall 'fake' a partial index by
> > manually setting 'enable_seqscan=off' for all but the 4 or 5 most
> > common categories.  Those two factors combined will speed up my bulk
> > inserts a lot.
>
> Uh, enable_seqscan has nothing to do with how inserts are handled...

Um, that was a thinko! What I was trying to say is: the randomisation
will speed up my bulk inserts, whereas enable_seqscan=off will speed up
the other slow queries in my application, namely "SELECT * WHERE
category='foo'" type queries.

> > Specifically, their effect on the index files.
>
> This particular problem couldn't be cured by batching inserts anyway.
> The performance problem was coming from the actual act of inserting
> a key (or more specifically, making room for the key) and that's just
> got to be done for each key AFAICS.

In principle, it could be helpful to know you're inserting 10000 tuples
with category='xyz'. Then you only make the left-to-right scan once,
filling up every hole you see, before finally splitting the last page
and inserting approximately 10000/num_per_page new pages all at once.
This is surely quicker than doing the left-to-right scan for each tuple
(even though the difference would be far less noticeable in the
presence of your probabilistic patch).

Jules

-- 
Jules Bean            | Any sufficiently advanced
jules@debian.org      | technology is indistinguishable
jules@jellybean.co.uk | from a perl script
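
A minimal SQL sketch of the two approaches discussed in the message above. The table and column names (`events`, `category`) and all literal values are hypothetical, not from the thread, and the `SET`/`RESET` spelling shown is the modern one; the 7.0-era syntax may differ:

```sql
-- Tom's 7.1-style partial index: exclude the few very common values,
-- so bulk inserts of rows with those values skip index maintenance
-- entirely (names and values here are illustrative only).
CREATE INDEX events_category_partial ON events (category)
    WHERE category != 'common-1' AND category != 'common-2';

-- Jules's 7.0 workaround with an ordinary full index: for queries on
-- the *rare* categories, stop the planner from picking a sequential
-- scan so the index is actually used.
CREATE INDEX events_category_idx ON events (category);

SET enable_seqscan = off;
SELECT * FROM events WHERE category = 'rare-category';
RESET enable_seqscan;  -- the common categories are best served by a seqscan
```

Note the division of labour, which matches Tom's point in the thread: the partial index helps inserts, because rows carrying the common values never touch the index at all, while the `enable_seqscan` toggle only influences plan choice for SELECTs and has no effect on insert cost.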