Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
From | Jameison Martin |
---|---|
Subject | Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap |
Date | |
Msg-id | 1344527774.12166.YahooMailNeo@web39404.mail.mud.yahoo.com |
In response to | Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap (Jim Nasby <jim@nasby.net>)
Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Simon, Tom is correct: the patch doesn't change the existing row format contract or the format of the null bitmap. The change only affects how new rows are written out. And it uses the same supported format that has always been there (which is why alter table add col null works the way it does). And it keeps to the same MAXALIGN boundaries that are there today.

One could argue that different row formats could make sense in different circumstances, and I'm certainly open to that kind of discussion, but this change is far more modest and perhaps can be made on its own, since it doesn't perturb the code base much, improves performance (marginally) and reduces the size of rows with lots of trailing nulls.

[separate topic: pluggable heap manager]

I'm quite interested in pursuing more aggressive compression strategies, and I'd like to do so in the context of the heap manager. I'm exploring having a pluggable heap manager implementation and would be interested in feedback on that as a general approach. My thinking is that I'd like to be able to have PostgreSQL support multiple heap implementations along the lines of how multiple index types are supported, though probably only the existing heap manager implementation would be part of the actual codeline. I've done a little exploratory work of looking at the heap interface. I was planning on doing a little prototyping before suggesting anything concrete, but, assuming the concept of a layered heap manager is not inherently objectionable, I was thinking of cleaning up the heap interface a little (e.g. some HOT stuff has bled across a little), then taking a whack at formalizing the interface along the lines of the index layering. So ideally I'd make a few separate submissions, and if all goes according to plan I'd be able to have a pluggable heap manager implementation that I could work on independently and which could in theory use the same hooks as the existing heap implementation. And if it turns out that my implementation is deemed to be general enough, it could be released to the community.

If I do decide to pursue this, can anyone suggest the best way to solicit feedback? I see that some proposals get shared on the postgres wiki. I could put something up there to frame the issue and encourage some back and forth dialog. Or is email the way that this kind of exchange tends to happen? Ultimately I'd like to get into a bit of detail about what the actual heap manager contract is and so forth.

Note that I'm a ways from really knowing if this is feasible on my end, so this is quite speculative at this point. But I'd like to introduce the topic and get some feedback on the right way to communicate as early as possible.

Thanks.

-Jamie

From: Tom Lane <tgl@sss.pgh.pa.us>
To: Simon Riggs <simon@2ndQuadrant.com>
Cc: Jameison Martin <jameisonb@yahoo.com>; "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Sent: Thursday, August 9, 2012 7:27 AM
Subject: Re: [HACKERS] patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap

Simon Riggs <simon@2ndQuadrant.com> writes:
> On 17 April 2012 17:22, Jameison Martin <jameisonb@yahoo.com> wrote:
>> The following patch truncates trailing null attributes from heap rows to
>> reduce the size of the row bitmap.

> This is an interesting patch, but it has had various comments made about it.

> When I look at this I see that it would change the NULL bitmap for all
> existing rows, which means it forces a complete unload/reload of data.

Huh? I thought it would only change how *new* tuples were stored.
Old tuples ought to continue to work fine.

I'm not really convinced that it's a good idea in the larger scheme
of things --- your point in a nearby thread that micro-optimizing
storage space at the expense of all else is not good engineering
applies here. But I don't see that it forces data reload. Or if
it does, that should be easily fixable.

> ... Have another flag which indicates
> when a partial trailing col trimmed NULL bitmap is in use.

That might be useful for forensic purposes, but on the whole I suspect
it's just added complexity (and eating up a valuable infomask bit)
for relatively little gain.

> ... decide whether a table will benefit from full or partial bitmap and
> set that in the tupledesc. That way the tupledesc will show
> heap_form_tuple which kind of null bitmap is preferred for new tuples.
> That preference might be settable by user on or off, but the default
> would be for postgres to decide that for us based upon null stats etc,
> which we would decide at ANALYZE time.

And that seems like huge overcomplication. I think we could probably
do fine with some very simple fixed policy, like "don't bother with
this for tables of less than N columns", where N is maybe 64 or so
and chosen to match the MAXALIGN boundary where there actually could
be some savings from trimming the null bitmap.

(Note: I've not read the patch, so maybe Jameison already did something
of the sort.)

            regards, tom lane
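
The MAXALIGN point raised in this message can be illustrated with a small calculation. What follows is only a rough sketch, not the patch itself: it assumes a 23-byte fixed tuple header ahead of the null bitmap, one bitmap bit per stored attribute, 8-byte MAXALIGN, no OID column, and a bitmap that is present both before and after trimming. The function names are hypothetical and chosen for illustration.

```python
# Sketch of header-size arithmetic for trimming trailing NULLs.
# Assumptions (not taken from the thread): 23-byte fixed header before the
# null bitmap, 8-byte MAXALIGN, no OID column, bitmap always present.

FIXED_HEADER_BYTES = 23  # assumed offset of the null bitmap in the header
MAXALIGN = 8             # assumed alignment; some platforms use 4


def bitmap_bytes(natts: int) -> int:
    """Bytes needed for a null bitmap covering natts attributes."""
    return (natts + 7) // 8


def header_size(natts: int) -> int:
    """Header plus null bitmap, rounded up to the next MAXALIGN boundary."""
    raw = FIXED_HEADER_BYTES + bitmap_bytes(natts)
    return (raw + MAXALIGN - 1) // MAXALIGN * MAXALIGN


def stored_natts(values: list) -> int:
    """Attribute count left after dropping trailing None (NULL) values."""
    n = len(values)
    while n > 0 and values[n - 1] is None:
        n -= 1
    return n


def savings(values: list) -> int:
    """Header bytes saved for this row by truncating its trailing NULLs."""
    return header_size(len(values)) - header_size(stored_natts(values))
```

Under these assumptions the header only shrinks when trimming moves the bitmap across an alignment boundary (here after 8 and after 72 attributes), so `savings([1] + [None] * 79)` gives 16 bytes while a row of 8 attributes or fewer saves nothing. That is the kind of threshold effect behind the suggestion of a simple fixed policy based on column count.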