Re: [PERFORM] CLUSTER command
From | Stephan Szabo |
---|---|
Subject | Re: [PERFORM] CLUSTER command |
Date | |
Msg-id | 20021212175208.B15052-100000@megazone23.bigpanda.com |
In reply to | Re: [PERFORM] CLUSTER command (Alvaro Herrera <alvherre@dcc.uchile.cl>) |
List | pgsql-general |
On Thu, 12 Dec 2002, Alvaro Herrera wrote:

> On Thu, Dec 12, 2002 at 04:03:47PM -0800, Stephan Szabo wrote:
> > On Thu, 12 Dec 2002, johnnnnnn wrote:
> >
> > > I think the code changes would be complicated. Just at a 30-second
> > > consideration, this would need to touch:
> > > - all sql (selects, inserts, updates, deletes)
> > > - vacuuming
> > > - indexing
> > > - statistics gathering
> > > - existing clustering
> >
> > I think his idea was to treat it similarly to the way that the
> > system treats tables >2G with .N files. The only thing is that
> > I believe the code that deals with that wouldn't be particularly
> > easy to change to do it though, but I've only taken a cursory look at
> > what I think is the place that does that(storage/smgr/md.c). Some sort of
> > good partitioning system would be nice though.
>
> I don't think this is doable without a huge amount of work. The storage
> manager doesn't know anything about what is in a page, let alone a
> tuple. And it shouldn't, IMHO. Upper levels don't know how are pages
> organized in disk; they don't know about .1 segments and so on, and they
> shouldn't.

Which is part of why I said it wouldn't be easy to change to do that, there's no good way to communicate that information. Like I said, I didn't look deeply, but I had to look though, because you can never tell with bits of old university code to do mostly what you want that haven't been exercised in years floating around.

> I think this kind of partition doesn't buy too much. I would really
> like to have some kind of auto-clustering, but it should be implemented
> in some upper level; e.g., by leaving some empty space in pages for
> future tuples, and arranging the whole heap again when it runs out of
> free space somewhere. Note that this is very far from the storage
> manager.

Auto clustering would be nice. I think Jean-Luc's suggested partitioning mechanism has certain usage patterns that it's a win for and most others that it's not.
Since the usage pattern I can think of (a very large table with a small number of breakdowns, where your conditions are primarily on those breakdowns) isn't even remotely in the domain of things I've worked with, I can't say whether it'd really end up being a win to avoid the index reads for the table.