Re: [HACKERS] mdnblocks is an amazing time sink in huge relations
From | Tom Lane |
---|---|
Subject | Re: [HACKERS] mdnblocks is an amazing time sink in huge relations |
Date | |
Msg-id | 20878.940204380@sss.pgh.pa.us |
In response to | RE: [HACKERS] mdnblocks is an amazing time sink in huge relations ("Hiroshi Inoue" <Inoue@tpf.co.jp>) |
Responses | RE: [HACKERS] mdnblocks is an amazing time sink in huge relations |
List | pgsql-hackers |
(Sorry for slow response, I've been off chasing psort problems...)

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> I have been suspicious about current implementation of md.c.
> It relies so much on information about existent physical files.

Yes, but on the other hand we rely completely on those same physical
files to hold our data ;-). I don't see anything fundamentally wrong
with using the existence and size of a data file as useful information.
It's not a substitute for a lock, of course, and there may be places
where we need cross-backend interlocks that we haven't got now.

> How do you think about the following ?
>
> 1. Partial blocks(As you know,I have changed the handling of this
>    kind of blocks recently).

Yes. I think your fix was good.

> 2. If a backend was killed or crashed in the middle of execution of
>    mdunlink()/mdtruncate(),half of segments wouldn't be unlink/
>    truncated.

That's bothered me too. A possible answer would be to do the unlinking
back-to-front (zap the last file first); that'd require a few more lines
of code in md.c, but a crash midway through would then leave a legal
file configuration that another backend could still do something with.

> 3. In cygwin port,mdunlink()/mdtruncate() may leave segments of 0
>    length.

I don't understand what causes this. Can you explain?

BTW, I think that having the last segment be 0 length is OK and indeed
expected --- mdnblocks will create the next segment as soon as it
notices the currently last segment has reached RELSEG_SIZE, even if
there's not yet a disk page to put in the next segment. This seems OK
to me, although it's not really necessary.

> 4. We couldn't mdcreate() existent files and couldn't mdopen()/
>    mdunlink() non-existent files. So there are some cases that we
>    could neither CREATE TABLE nor DROP TABLE.

True, but I think this is probably the best thing for safety's sake.
It seems to me there is too much risk of losing or overwriting valid
data if md.c bulls ahead when it finds an unexpected file configuration.
I'd rather rely on manual cleanup if things have gotten that seriously
out of whack... (but that's just my opinion, perhaps I'm in the
minority?)

			regards, tom lane
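
To make the back-to-front idea from point 2 concrete, here is a rough C sketch. It is not code from md.c: the function names, the fixed-size path buffer, and the assumption that segments are named `name`, `name.1`, `name.2`, ... are all made up for illustration. The point is only that deleting the highest-numbered segment first means a crash partway through leaves segments 0..k, which is still a contiguous, legal configuration.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from md.c): build the path of segment `segno`.
 * Assumed naming: segment 0 is "base", later ones are "base.1", "base.2", ...
 * Returns 0 on success, -1 if the path does not fit in the buffer.
 */
static int
segment_path(char *buf, size_t buflen, const char *base, int segno)
{
    int n;

    if (segno == 0)
        n = snprintf(buf, buflen, "%s", base);
    else
        n = snprintf(buf, buflen, "%s.%d", base, segno);

    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}

/*
 * Hypothetical sketch: remove every segment of a relation, last one first.
 * Returns the number of segments removed, or -1 on error.
 */
int
unlink_segments_back_to_front(const char *base)
{
    char path[1024];
    int  nsegs = 0;
    int  segno;

    /* Count the contiguous run of existing segments, starting at 0. */
    for (;;)
    {
        if (segment_path(path, sizeof(path), base, nsegs) != 0)
            return -1;
        if (access(path, F_OK) != 0)
            break;
        nsegs++;
    }

    /*
     * Zap the last file first. If we die after removing segment k+1,
     * segments 0..k remain and still form a valid relation.
     */
    for (segno = nsegs - 1; segno >= 0; segno--)
    {
        if (segment_path(path, sizeof(path), base, segno) != 0)
            return -1;
        if (unlink(path) != 0)
            return -1;
    }

    return nsegs;
}
```

Calling `unlink_segments_back_to_front("somerel")` on a relation with three segments would remove `somerel.2`, then `somerel.1`, then `somerel`. The same ordering argument would apply to truncation, though this sketch does not cover it.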