Обсуждение: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

Поиск
Список
Период
Сортировка

Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Peter Geoghegan
Дата:
My recent commit 44fa8488 made vacuumlazy.c always scan the last page
of every heap relation, even when it's all-visible, regardless of any
other factor. The general idea is to always examine the last page in
the relation when that might prevent us from unsuccessfully attempting
to truncate the relation -- since even the attempt is costly (in
particular, we have to get an AccessExclusiveLock, which is painful
with a Hot Standby replica). This created a new problem: if we
manually VACUUM a smaller, static table many times,
vac_estimate_reltuples() will consider the same scanned_page to be a
random sample each time. Of course this page is the further possible
thing from a random sample, because it's the same heap page each time,
and because it's generally more likely that the last page will have
fewer tuples.

Every time you run VACUUM (without changing the table),
pg_class.reltuples shrinks, by just a small amount. Run VACUUM like
that enough times, and pg_class.reltuples will eventually become quite
distorted. It's easy to spot this effect, just by running "VACUUM
VERBOSE" in a loop against a static table, and noting how "tuples:
.... ??? remain" tends to shrink over time, even though the contents
of the table won't have changed.

The "always scan last page" behavior isn't really new, actually. It's
closely related to an earlier, slightly different behavior with the
same goal -- though one that conditioned scanning the last page on
certain other factors that were protective (which was probably never
considered by the design). This closely related behavior (essentially
the same behavior) was added by commit e8429082 from 2015. I judged
that it wasn't worth worrying about scanning an extra page
unnecessarily (better to keep things simple), since it's not at all
uncommon for VACUUM to scan a few all-visible pages anyway (due to the
SKIP_PAGES_THRESHOLD stuff). And I haven't changed my mind about that
-- I still think that we should just scan the last page in all cases,
to keep things simple. But I definitely don't think that this
"pg_class.reltuples gets distorted over time" business is okay -- that
has to be fixed.

There has been at least one very similar bug in the past: the commit
message from commit b4b6923e (from 2011) talks about a similar issue
with over-sampling from the beginning of the table (not the end), due
to a similar bias involving visibility map implementation details that
lead to an extremely nonrandom scanned_pages sample. To me that
suggests that the true problem here isn't really in vacuumlazy.c --
going back to the old approach (i.e. scanning the last page
conditionally) seems like a case of the tail wagging the dog. IMV the
real problem is that vac_estimate_reltuples() is totally unconcerned
about these kinds of systematic sampling problems.

Attached draft patch fixes the issue by making
vac_estimate_reltuples() return the old/original reltuples from
pg_class (and not care at all about the scanned_tuples value from its
vacuumlazy.c caller) when:

1. The total_pages for the table hasn't changed (by even one page)
since the last VACUUM (or last ANALYZE).

and,

2. The total number of scanned_pages is less than 2% of total_pages.

This fixes the observed problem directly. It also makes the code
robust against other similar problems that might arise in the future.
The risk that comes from trusting that scanned_pages is a truly random
sample (given these conditions) is generally very large, while the
risk that comes from disbelieving it (given these same conditions) is
vanishingly small.

-- 
Peter Geoghegan

Вложения

Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Peter Geoghegan
Дата:
On Sun, Feb 13, 2022 at 1:43 PM Peter Geoghegan <pg@bowt.ie> wrote:
> This fixes the observed problem directly. It also makes the code
> robust against other similar problems that might arise in the future.
> The risk that comes from trusting that scanned_pages is a truly random
> sample (given these conditions) is generally very large, while the
> risk that comes from disbelieving it (given these same conditions) is
> vanishingly small.

Pushed.

-- 
Peter Geoghegan



Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Andres Freund
Дата:
Hi,

On 2022-02-16 17:16:13 -0800, Peter Geoghegan wrote:
> On Sun, Feb 13, 2022 at 1:43 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > This fixes the observed problem directly. It also makes the code
> > robust against other similar problems that might arise in the future.
> > The risk that comes from trusting that scanned_pages is a truly random
> > sample (given these conditions) is generally very large, while the
> > risk that comes from disbelieving it (given these same conditions) is
> > vanishingly small.
>
> Pushed.

I'm quite worried about the pace and style of the vacuum changes. To me it
doesn't look like that there was design consensus before 44fa8488 was
committed, quite the opposite (while 44fa8488 probably isn't the center of
contention, it's not just off to the side either).

You didn't really give a heads up that you're about to commit either. There's
a note at the bottom of [1], a very long email, talking about committing in a
couple of weeks. After which there was substantial discussion of the patchset.

It doesn't look to me like there was a lot of review for 44fa8488, despite it
touching very critical parts of the code. I'm listed as a reviewer, that was a
few months ago, and rereading my review I don't think it can be read to agree
with the state of the code back then.

I also have a hard time making heads or tails out of the commit message of
44fa84881ff. It's quite long without being particularly descriptive. The
commit just changes a lot of things at once, making it hard to precisely
describe and very hard to review and understand.


Then you just committed 74388a1ac36, three days after raising it. Without any
sort of heads up. You talked about a "draft patch".


It'd be one thing if this was happening in isolation or that nobody complained
about this before. But see among others [2], [3], [4], [5]


And yes, it'd have been nice for 44fa8488' thread to have attracted more
reviews. But as far as complicated patchsets go, it has actually gotten more
attention than most of a similar vintage. Engaging with you around vacuum is
time and energy intensive, there tend to be lots of very long
emails. Reviewers have their own work as well.


I'm miffed.

- Andres

[1] https://www.postgresql.org/message-id/CAH2-Wz%3DiLnf%2B0CsaB37efXCGMRJO1DyJw5HMzm7tp1AxG1NR2g%40mail.gmail.com
[2] https://www.postgresql.org/message-id/20210414193310.4yk7wriswhqhcxvq%40alap3.anarazel.de
[3] https://www.postgresql.org/message-id/CA%2BTgmobyUxq2Ms3g5YMPgqJzNOi7BmStcFGwCNd-W7z8nxbjUA%40mail.gmail.com
[4] https://www.postgresql.org/message-id/CA+TgmoYdp4qy2LFW7yCi6yFJ1m-RCH+npctJQd29k6d6uCx0ww@mail.gmail.com
[5] https://www.postgresql.org/message-id/CA+TgmobkCExv7Zo+pArRCsJ16hYVJJB_VuZKwvk1cCP=rS1Qbw@mail.gmail.com



Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Peter Geoghegan
Дата:
On Wed, Feb 16, 2022 at 7:08 PM Andres Freund <andres@anarazel.de> wrote:
> I'm quite worried about the pace and style of the vacuum changes. To me it
> doesn't look like that there was design consensus before 44fa8488 was
> committed, quite the opposite (while 44fa8488 probably isn't the center of
> contention, it's not just off to the side either).

I'm not aware of any disagreement whatsoever that is directly related
to 44fa8488. It's true that there was a lot of discussion (even heated
discussion) between myself and Robert, concerning related work on
freezing. But the work in commit 44fa8488 can clearly be justified as
refactoring work. It makes the code in vacuumlazy.c much cleaner. In
fact, that's how commit 44fa8488 started off -- purely as refactoring
work. All of the stuff with freezing (not committed) was started after
the general shape of 44fa8488 was established. One thing followed from
the other.

> You didn't really give a heads up that you're about to commit either. There's
> a note at the bottom of [1], a very long email, talking about committing in a
> couple of weeks. After which there was substantial discussion of the patchset.

How can you be surprised that I committed 44fa8488? It's essentially
the same patch as the first version, posted November 22 -- almost 3
months ago. And it's certainly not a big patch (though it is
complicated).

How was 44fa8488 substantively different to v1 of the patch, posted
all the way back on November 22? Literally the only thing that changed
is that it became more polished, based in part on your feedback.
That's it!

> It doesn't look to me like there was a lot of review for 44fa8488, despite it
> touching very critical parts of the code. I'm listed as a reviewer, that was a
> few months ago, and rereading my review I don't think it can be read to agree
> with the state of the code back then.

Can you be more specific?

> I also have a hard time making heads or tails out of the commit message of
> 44fa84881ff. It's quite long without being particularly descriptive. The
> commit just changes a lot of things at once, making it hard to precisely
> describe and very hard to review and understand.

The commit message is high level. The important point is that we can
more or less treat all scanned_pages the same, without generally
caring about whether or not a cleanup lock could be acquired (since
the exceptions where that isn't quite true are narrow and rare). That
is the common theme, for everything.

I do recall that you said something about the patch that became commit
44fa8488 being too much all at once. In particular, something about
removing FORCE_CHECK_PAGE() in a separate commit. But I don't think
that that made sense -- which I said at the time. There were subtle
interactions between that and between the other stuff -- which is
actually what led to this 74388a1ac36 fix-up. But as I said in the
commit message to 74388a1ac36, I think that this issue might have been
latent. If it wasn't there all along, then it might have been. Sort of
like the issue reference in 2011 commit b4b6923e.

-- 
Peter Geoghegan



Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Justin Pryzby
Дата:
On Wed, Feb 16, 2022 at 07:43:09PM -0800, Peter Geoghegan wrote:
> > I also have a hard time making heads or tails out of the commit message of
> > 44fa84881ff. It's quite long without being particularly descriptive. The
> > commit just changes a lot of things at once, making it hard to precisely
> > describe and very hard to review and understand.
> 
> The commit message is high level. The important point is that we can
> more or less treat all scanned_pages the same, without generally
> caring about whether or not a cleanup lock could be acquired (since
> the exceptions where that isn't quite true are narrow and rare). That
> is the common theme, for everything.

As for myself, I particularly appreciated the clarity and detail of this commit
message.  (I have looked at the associated change a bit, but not deeply).

Thanks,
-- 
Justin



Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Robert Haas
Дата:
On Wed, Feb 16, 2022 at 10:43 PM Peter Geoghegan <pg@bowt.ie> wrote:
> How can you be surprised that I committed 44fa8488? It's essentially
> the same patch as the first version, posted November 22 -- almost 3
> months ago. And it's certainly not a big patch (though it is
> complicated).

Let's back up a minute and talk about the commit of $SUBJECT. The
commit message contains a Discussion link to this thread. This thread,
at the time you put that link in there, had exactly one post: from
you. That's not much of a discussion, although I do acknowledge that
sometimes we commit things that have bugs, and those bugs need to be
fixed even if nobody has responded.

That brings us to the question of whether 44fa8488 was improvidently
committed. I don't know the answer to that question, and here's why:

> The commit message is high level.

I would say it differently: I think the commit message does a poor job
describing what the commit actually does. For example, it says nothing
about changing VACUUM to always scan the last page of every heap
relation. This whole thread is about fixing a problem that was caused
by a significant behavior change that was *not even mentioned* in the
original commit message. If it had been mentioned, I likely would have
complained, because it's very similar to behavior that Tom eliminated
in b503da135ab0bdd97ac3d3f720c35854e084e525, which he did because it
was distorting reltuples estimates.

Commit messages need to describe what the commit actually changes.
Theoretical ideas are fine, but if I, as a committer who have done
significant work in this area in the past, can't read the commit
message and understand what is actually different, it's not a good
commit message. I think you *really* need to put more effort into
making your patches, and the emails about your patches, and the commit
messages for your patches understandable to other people. Otherwise,
waiting 3 months between when you post the patch and when you commit
it means nothing. You can wait 10 years to commit and still get
objections, if other people don't understand what you're doing.

I would guess that's really the root of Andres's concern here. I
believe that both Andres and I are in favor of the kinds of things you
want to do here *in principle*. But in practice I feel like it's not
working well, and thereby putting the project at risk. What if some
day one of us needs to fix a bug in your code? It's not like VACUUM is
some peripheral system where bugs aren't that critical -- and it's
also not the case that the risk of introducing new bugs is low.
Historically, it's anything but.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Andres Freund
Дата:
Hi,

On 2022-02-17 09:17:04 -0500, Robert Haas wrote:
> Commit messages need to describe what the commit actually changes.
> Theoretical ideas are fine, but if I, as a committer who have done
> significant work in this area in the past, can't read the commit
> message and understand what is actually different, it's not a good
> commit message. I think you *really* need to put more effort into
> making your patches, and the emails about your patches, and the commit
> messages for your patches understandable to other people. Otherwise,
> waiting 3 months between when you post the patch and when you commit
> it means nothing. You can wait 10 years to commit and still get
> objections, if other people don't understand what you're doing.
>
> I would guess that's really the root of Andres's concern here.

Yes.


> I believe that both Andres and I are in favor of the kinds of things you
> want to do here *in principle*.

Yea. And I might even agree more with Peter than with you on some of the
contended design bits. But it's too hard to know right now, because too many
things are changed at once and the descriptions are high level. And often
about some vague ideal principles that are nearly impossible to concretely
disagree with.


> But in practice I feel like it's not working well, and thereby putting the
> project at risk. What if some day one of us needs to fix a bug in your code?
> It's not like VACUUM is some peripheral system where bugs aren't that
> critical -- and it's also not the case that the risk of introducing new bugs
> is low.  Historically, it's anything but.

+1

Greetings,

Andres Freund



Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Andres Freund
Дата:
Hi,

On 2022-02-16 19:43:09 -0800, Peter Geoghegan wrote:
> It makes the code in vacuumlazy.c much cleaner. In fact, that's how commit
> 44fa8488 started off -- purely as refactoring work.

The problem is that it didn't end up as that. You combined refactoring with
substantial changes. And described it large and generic terms.


> > You didn't really give a heads up that you're about to commit either. There's
> > a note at the bottom of [1], a very long email, talking about committing in a
> > couple of weeks. After which there was substantial discussion of the patchset.
>
> How can you be surprised that I committed 44fa8488? It's essentially
> the same patch as the first version, posted November 22 -- almost 3
> months ago.

It's a contentious thread. I asked for things to be split. In that context,
it's imo common curtesy for non-trivial patches to write something like

  While the rest of the patchset is contentious, I think 0001 is ready to
  go. I'd like to commit it tomorrow, unless somebody protests.


> And it's certainly not a big patch (though it is complicated).

For me a vacuum change with the following diffstat is large:
3 files changed, 515 insertions(+), 297 deletions(-)


> > It doesn't look to me like there was a lot of review for 44fa8488, despite it
> > touching very critical parts of the code. I'm listed as a reviewer, that was a
> > few months ago, and rereading my review I don't think it can be read to agree
> > with the state of the code back then.
>
> Can you be more specific?

Most importantly:

On 2021-11-22 11:29:56 -0800, Andres Freund wrote:
> I think this is a change mostly in the right direction. But as formulated this
> commit does *WAY* too much at once.

I do not know how to state more clearly that I think this is not a patch in a
committable shape.

but also:

On 2021-11-22 11:29:56 -0800, Andres Freund wrote:
> I think it should be doable to add an isolation test for this path. There have
> been quite a few bugs around the wider topic...


Greetings,

Andres Freund



Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Peter Geoghegan
Дата:
On Thu, Feb 17, 2022 at 6:17 AM Robert Haas <robertmhaas@gmail.com> wrote:
> Let's back up a minute and talk about the commit of $SUBJECT. The
> commit message contains a Discussion link to this thread. This thread,
> at the time you put that link in there, had exactly one post: from
> you. That's not much of a discussion, although I do acknowledge that
> sometimes we commit things that have bugs, and those bugs need to be
> fixed even if nobody has responded.

All true. I don't like to leave bugs unfixed for very long. I'm happy
to revise the fix in light of new information, or opinions.

> I would say it differently: I think the commit message does a poor job
> describing what the commit actually does. For example, it says nothing
> about changing VACUUM to always scan the last page of every heap
> relation. This whole thread is about fixing a problem that was caused
> by a significant behavior change that was *not even mentioned* in the
> original commit message.

It's not that simple. As I said in the fix-up commit message, and in
the opening email to this thread, it basically isn't a new behavior at
all. It would be much more accurate to describe it as a behavior that
originated with commit e8429082, from late 2015. There was a subtle
interaction with that existing behavior, and the refactoring work,
that caused the reltuples problem.

> If it had been mentioned, I likely would have
> complained, because it's very similar to behavior that Tom eliminated
> in b503da135ab0bdd97ac3d3f720c35854e084e525, which he did because it
> was distorting reltuples estimates.

I cited that commit myself. I think that it's a good idea to not be
completely unconcerned about how the visibility map skipping behavior
tends to affect the reltuples estimate -- especially if it's easy to
do it in a way that doesn't cause problems anyway. And so I agree that
the "looks ahead rather than behind" behavior added by that commit
makes sense. Even still, the whole idea that scanned_pages is a random
sample from the table is kinda bogus, and always has been. The
consequences of this are usually not too bad, but even still: why
wouldn't it be possible for there to be all kinds of distortion to
reltuples, since the tuple density logic is built on an assumption
that's self evidently wrong?

It's not reasonable to expect vacuumlazy.c to truly contort itself,
just to conceal that shaky assumption. At the same time, we need the
model used by vac_estimate_reltuples() to work as well as it can, with
the limited information that it has close at hand. That doesn't seem
to leave a lot of options, as far as addressing the bug goes. Happy to
discuss it further if you have any ideas, though.

> I think you *really* need to put more effort into
> making your patches, and the emails about your patches, and the commit
> messages for your patches understandable to other people. Otherwise,
> waiting 3 months between when you post the patch and when you commit
> it means nothing. You can wait 10 years to commit and still get
> objections, if other people don't understand what you're doing.

I don't think it's fair to just suppose that much of what I write is
incomprehensible, just like that. Justin expressed the exact opposite
view about the commit message of 44fa8488, for one. I'm not saying
that there is no room at all for improvement, or that my reputation
for being hard to follow is completely undeserved. Just that it's
frustrating to be accused of not having put enough effort into
explaining what's going on with commit 44fa8488 -- I actually put a
great deal of effort into it.

I think that you'd acknowledge that overall, on balance, I must be
doing something right. Have you considered that that might have
happened because of my conceptual style, not in spite of it? I have a
tendency to define things in abstract terms, because it really works
for me. I try to abstract away inessential details, and emphasize
invariance, so that the concepts are easier to manipulate with in many
different contexts -- something that is almost essential when working
on B-Tree stuff. It's harder at first, but much easier later on (at
least for me). If I ever seem to go overboard with it, I ask that you
consider this perspective -- though you should also push back when
that seems appropriate.

Separately, I think that you're missing how hard it would have been to
"just say what you did" while making much sense, given the specifics
here. The complexity in commit 44fa8488 is more or less all due to
historical factors -- it's quite a messy, winding story. For example,
commit 1914c5ea from 2017 added the tupcount_pages field (that I just
removed) to fix a reltuples estimation related bug, that could be
thought of as a bug in the aforementioned commit e8429082 (from 2015).
Now, commit 1914c5ea didn't actually argue that it was fixing a bug in
that earlier commit directly -- whether or not you'd call it that is a
matter of perspective.

This is one of those situations where it's far easier to start with
the ideal, and work forwards, rather than the other way around (the
other way around is often the better approach, but not always). To
some degree it's a return to an earlier era for vacuumlazy.c, when
scanned_pages had a simple and unambiguous definition.

> I would guess that's really the root of Andres's concern here. I
> believe that both Andres and I are in favor of the kinds of things you
> want to do here *in principle*.

I don't doubt that.

> But in practice I feel like it's not
> working well, and thereby putting the project at risk. What if some
> day one of us needs to fix a bug in your code?

Although I hope and expect to be able to fix any bugs that turn up in
my code, and have a good track record there, anything is possible. I
don't think that you'd have a problem fixing any bug in my code,
though. I genuinely think that you'd be able to figure it out pretty
quickly, if it really came down to it.

> It's not like VACUUM is
> some peripheral system where bugs aren't that critical -- and it's
> also not the case that the risk of introducing new bugs is low.
> Historically, it's anything but.

Very true. I'm totally dependent on you and Andres here, as fellow
experts on VACUUM. That's hard for everybody, myself included. I
greatly appreciate your working with me on this, and on the earlier
VACUUM projects. I'm doing my best.

-- 
Peter Geoghegan



Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Robert Haas
Дата:
On Thu, Feb 17, 2022 at 4:10 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > I would say it differently: I think the commit message does a poor job
> > describing what the commit actually does. For example, it says nothing
> > about changing VACUUM to always scan the last page of every heap
> > relation. This whole thread is about fixing a problem that was caused
> > by a significant behavior change that was *not even mentioned* in the
> > original commit message.
>
> It's not that simple. As I said in the fix-up commit message, and in
> the opening email to this thread, it basically isn't a new behavior at
> all. It would be much more accurate to describe it as a behavior that
> originated with commit e8429082, from late 2015. There was a subtle
> interaction with that existing behavior, and the refactoring work,
> that caused the reltuples problem.

Honestly, I really think it's that simple. It really is possible to
describe what changes a commit is making in the commit message -- and,
in my opinion, you're not doing it.

> > I think you *really* need to put more effort into
> > making your patches, and the emails about your patches, and the commit
> > messages for your patches understandable to other people. Otherwise,
> > waiting 3 months between when you post the patch and when you commit
> > it means nothing. You can wait 10 years to commit and still get
> > objections, if other people don't understand what you're doing.
>
> I don't think it's fair to just suppose that much of what I write is
> incomprehensible, just like that.

I'm not supposing anything. I'm telling you what I can understand, and
what I can't. Unless you intend to accuse me of pretending not to
understand things that I actually do understand, I feel like my word
on that topic should be treated as pretty much conclusive.

I do think that you are doing some things right, for sure. But I don't
think that you are following the community process particularly well.
It doesn't really feel like you feel the need to convince other people
that your changes are in good shape before committing them; and it is
really hard for me to understand in detail what is being changed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Peter Geoghegan
Дата:
On Thu, Feb 17, 2022 at 1:36 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > It's not that simple. As I said in the fix-up commit message, and in
> > the opening email to this thread, it basically isn't a new behavior at
> > all. It would be much more accurate to describe it as a behavior that
> > originated with commit e8429082, from late 2015. There was a subtle
> > interaction with that existing behavior, and the refactoring work,
> > that caused the reltuples problem.
>
> Honestly, I really think it's that simple. It really is possible to
> describe what changes a commit is making in the commit message -- and,
> in my opinion, you're not doing it.

The text of the commit message from commit e8429082 (Tom's 2015
commit) talks about "force examination of the table's last page...".
And on that basis I'm saying that I didn't invent the behavior
involving scanning the last page specifically -- which is clearly
true. I *did* overlook an issue that led to this nonrandom
scanned_pages bug, of course, but what should the take away be now?
What does that have to do with the commit message? The significance of
this oversight only became apparent after I committed the patch.

> > > I think you *really* need to put more effort into
> > > making your patches, and the emails about your patches, and the commit
> > > messages for your patches understandable to other people. Otherwise,
> > > waiting 3 months between when you post the patch and when you commit
> > > it means nothing. You can wait 10 years to commit and still get
> > > objections, if other people don't understand what you're doing.
> >
> > I don't think it's fair to just suppose that much of what I write is
> > incomprehensible, just like that.
>
> I'm not supposing anything. I'm telling you what I can understand, and
> what I can't. Unless you intend to accuse me of pretending not to
> understand things that I actually do understand, I feel like my word
> on that topic should be treated as pretty much conclusive.

I'm not disputing that (how could I?). What I'm saying is that from my
perspective, the commit message of 44fa8488 was quite descriptive. I'm
certainly not asking you to agree with that, and I even think that the
fact that you disagree with it should be something that specifically
concerns me. But, there might be some specific instance in the future,
where you find some merit in the way I frame things, having first been
skeptical.

Doesn't that at least seem possible? And aren't you more or less
asking me to give you the same consideration? It's something to
consider -- that's all.

> I do think that you are doing some things right, for sure. But I don't
> think that you are following the community process particularly well.
> It doesn't really feel like you feel the need to convince other people
> that your changes are in good shape before committing them; and it is
> really hard for me to understand in detail what is being changed.

There is almost always ambiguity about these things, which I navigate
to the best of my ability, knowing that I will be judged in the future
based on my track record. It feels as if this discussion has been
muddied by the later work in the patch series, which really does make
pretty big, relatively risky changes to how we do certain things in
vacuumlazy.c. Maybe I shouldn't have referenced that work in the
commit message.

--
Peter Geoghegan



Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Peter Geoghegan
Дата:
On Thu, Feb 17, 2022 at 10:33 AM Andres Freund <andres@anarazel.de> wrote:
> On 2022-02-16 19:43:09 -0800, Peter Geoghegan wrote:
> > It makes the code in vacuumlazy.c much cleaner. In fact, that's how commit
> > 44fa8488 started off -- purely as refactoring work.
>
> The problem is that it didn't end up as that. You combined refactoring with
> substantial changes. And described it large and generic terms.

What substantial changes are you referring to? The one thing that did
change was the commit message, which framed everything in terms of the
later work. It really is true that the patch that I committed was
essentially the same patch as the one posted on November 22, in both
style and substance. Before I really even began to think about the
freezing stuff. This is a matter of record.

> It's a contentious thread. I asked for things to be split. In that context,
> it's imo common curtesy for non-trivial patches to write something like

I didn't see a way to usefully split up 0001 any further (having
already split it up into 0001 and 0002). That's not particularly
obvious, but it's true.

>   While the rest of the patchset is contentious, I think 0001 is ready to
>   go. I'd like to commit it tomorrow, unless somebody protests.

Okay. I'll be more explicit about it in this way in the future.

> On 2021-11-22 11:29:56 -0800, Andres Freund wrote:
> > I think this is a change mostly in the right direction. But as formulated this
> > commit does *WAY* too much at once.
>
> I do not know how to state more clearly that I think this is not a patch in a
> committable shape.

As I said, I dispute the idea that it could have been split up even
further. My reasons are complicated, and I can get into it if you'd
like.

> but also:
>
> On 2021-11-22 11:29:56 -0800, Andres Freund wrote:
> > I think it should be doable to add an isolation test for this path. There have
> > been quite a few bugs around the wider topic...

Yeah, I really should have included that in 0001 -- I accept that.
That didn't happen in the initial commit, but was high on my todo
list. I can take care of it in the next few days.

--
Peter Geoghegan



Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Andres Freund
Дата:
Hi,

On 2022-02-17 14:23:51 -0800, Peter Geoghegan wrote:
> On Thu, Feb 17, 2022 at 10:33 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2022-02-16 19:43:09 -0800, Peter Geoghegan wrote:
> > > It makes the code in vacuumlazy.c much cleaner. In fact, that's how commit
> > > 44fa8488 started off -- purely as refactoring work.
> >
> > The problem is that it didn't end up as that. You combined refactoring with
> > substantial changes. And described it large and generic terms.
> 
> What substantial changes are you referring to? The one thing that did
> change was the commit message, which framed everything in terms of the
> later work. It really is true that the patch that I committed was
> essentially the same patch as the one posted on November 22, in both
> style and substance. Before I really even began to think about the
> freezing stuff. This is a matter of record.

Here I was referencing your description of how the patch started ("purely as
refactoring work"), and then evolved into something not just a refactoring.


> > It's a contentious thread. I asked for things to be split. In that context,
> > it's imo common curtesy for non-trivial patches to write something like
> 
> I didn't see a way to usefully split up 0001 any further (having
> already split it up into 0001 and 0002). That's not particularly
> obvious, but it's true.

If helpful I can give a go at showing how I think it could be split up. Or
perhaps more productively, do that on a not-yet-committed larger patch.

Greetings,

Andres Freund



Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

От
Peter Geoghegan
Дата:
On Thu, Feb 17, 2022 at 4:18 PM Andres Freund <andres@anarazel.de> wrote:
> > What substantial changes are you referring to? The one thing that did
> > change was the commit message, which framed everything in terms of the
> > later work. It really is true that the patch that I committed was
> > essentially the same patch as the one posted on November 22, in both
> > style and substance. Before I really even began to think about the
> > freezing stuff. This is a matter of record.
>
> Here I was referencing your description of how the patch started ("purely as
> refactoring work"), and then evolved into something not just a refactoring.

Of course.

> If helpful I can give a go at showing how I think it could be split up. Or
> perhaps more productively, do that on a not-yet-committed larger patch.

Any help is appreciated.

-- 
Peter Geoghegan