Discussion: Having postgresql.org link to cgit instead of gitweb

Having postgresql.org link to cgit instead of gitweb

From
"Jonathan S. Katz"
Date:
Hi,

While prepping the website for the PG18 GA, I stumbled on the inability 
to access parts of commits through the gitweb links, specifically 
hitting 429 status code errors (this seems to be intermittent). After 
some briefing on why it's disabled and how this isn't an issue with 
cgit, I prepped a patch for postgresql.org (the main website) that would 
update the git.postgresql.org reference to use cgit instead of gitweb.

However, as this could impact some hacker workflows (e.g. the commit 
search page), I wanted to run this patch by -hackers before committing. 
Basically, the patch:

* Moves any web links to git.postgresql.org repos to use the cgit 
interface instead of gitweb (e.g. [1])
* Updates the commit search[2] to use cgit instead of gitweb

Please note that this doesn't impact the availability of gitweb, rather 
the main parts of the postgresql.org website will link to cgit first, 
and people will have a more consistent experience overall (e.g. no 429 
errors).

Thoughts?

Thanks,

Jonathan

[1] https://www.postgresql.org/developer/related-projects/
[2] https://www.postgresql.org/developer/coding/
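
To make the first item concrete, here is a minimal sketch of what rewriting a gitweb link to its cgit counterpart might look like. It assumes gitweb links take the form `/gitweb/?p=<repo>;a=<action>;h=<id>` and cgit links the form `/cgit/<repo>/<action>/?id=<id>`; both layouts, the action mapping, and the function name `gitweb_to_cgit` are assumptions for illustration and are not taken from the actual pgweb patch.

```python
from urllib.parse import urlsplit

# Assumed mapping from gitweb actions to cgit path segments (illustrative only).
ACTION_MAP = {
    "summary": "",
    "commit": "commit",
    "commitdiff": "commit",
    "log": "log",
    "tree": "tree",
}


def gitweb_to_cgit(url: str) -> str:
    """Rewrite a gitweb-style URL into an (assumed) cgit-style URL."""
    parts = urlsplit(url)
    # gitweb separates query parameters with ';' rather than '&'.
    params = dict(
        item.split("=", 1) for item in parts.query.split(";") if "=" in item
    )
    repo = params["p"]
    action = ACTION_MAP.get(params.get("a", "summary"), "")
    path = f"/cgit/{repo}/"
    if action:
        path += f"{action}/"
    query = f"?id={params['h']}" if "h" in params else ""
    return f"{parts.scheme}://{parts.netloc}{path}{query}"


if __name__ == "__main__":
    print(gitweb_to_cgit(
        "https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ed1aad15"
    ))
```

Unknown gitweb actions fall back to the repository summary page in this sketch; a real migration would need to decide how to handle them.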

Attachments

Re: Having postgresql.org link to cgit instead of gitweb

From
David Rowley
Date:
On Fri, 19 Sept 2025 at 13:12, Jonathan S. Katz <jkatz@postgresql.org> wrote:
> While prepping the website for the PG18 GA, I stumbled on the inability
> to access parts of commits through the gitweb links, specifically
> hitting 429 status code errors (this seems to be intermittent). After
> some briefing on why it's disabled and how this isn't an issue with
> cgit, I prepped a patch for postgresql.org (the main website) that would
> update the git.postgresql.org reference to use cgit instead of gitweb.

> Please note that this doesn't impact the availability of gitweb, rather
> the main parts of the postgresql.org website will link to cgit first,
> and people will have a more consistent experience overall (e.g. no 429
> errors).

You didn't mention the cause of the specific issues, but it has been
mentioned on www lists before, so I don't think it's a secret with the
bot traffic.  Have you considered if switching these links to cgit
wouldn't just cause the traffic to migrate to cgit, over time? If so,
would you just be moving the problem from one place to another? I
mean, the bots are getting the links from somewhere. I'd imagine
release notes and the likes to be a popular source of links.

Perhaps someone with more knowledge than I have on the problem can
comment to give insight into if the same issue could occur with cgit.

David



Re: Having postgresql.org link to cgit instead of gitweb

From
Álvaro Herrera
Date:
On 2025-Sep-19, David Rowley wrote:

> You didn't mention the cause of the specific issues, but it has been
> mentioned on www lists before, so I don't think it's a secret with the
> bot traffic.  Have you considered if switching these links to cgit
> wouldn't just cause the traffic to migrate to cgit, over time?

I think this will happen, yes.  There are two problems here actually:
the first one is that the old gitweb program, implemented in Perl, is
awfully slow itself.  Git itself is fast enough for most things and I
don't think serving its output efficiently, as cgit does, is going to be
a performance problem.  So for the `blob` objects, which is what this is
mostly used for, we should be fine with cgit.

The other problem is `git blame`, which can be slow also with pure git,
so if (when) the bots move to run blame with cgit, then we'll be in
trouble just as well, and we're going to need some gating in order to
prevent trouble.  However, `blame` hasn't been as much of a problem as
`blob` has, so we can take this more leisurely.

There are two things we could do.  One is to simply restrict `git blame`
to authenticated users; this shouldn't be _too_ bad.  But if we don't
want that, we could put the bot checker javascript tricks in front of
`blame`.  In fact maybe we could have the best of both worlds: you get
the javascript check if you're not authenticated, but nothing if you
are.  I'm not sure how easy it is to implement this though.
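
The "best of both worlds" policy above can be sketched as a small decision function. This is only an illustration of the logic: the names `Decision`, `EXPENSIVE_ACTIONS`, and `gate` are hypothetical, and nothing here reflects how cgit or the pginfra front end would actually enforce it.

```python
from enum import Enum


class Decision(Enum):
    SERVE = "serve"                # answer the request directly
    JS_CHALLENGE = "js_challenge"  # send the bot-checker JavaScript first


# Hypothetical set of expensive views; the discussion only names `blame`.
EXPENSIVE_ACTIONS = {"blame"}


def gate(action: str, authenticated: bool) -> Decision:
    """Challenge unauthenticated requests for expensive views, serve the rest."""
    if action in EXPENSIVE_ACTIONS and not authenticated:
        return Decision.JS_CHALLENGE
    return Decision.SERVE


if __name__ == "__main__":
    for action, authed in [("blame", False), ("blame", True), ("blob", False)]:
        print(action, authed, gate(action, authed).value)
```

Restricting `blame` to authenticated users outright would correspond to returning a refusal instead of `JS_CHALLENGE` in the first branch.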

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: Having postgresql.org link to cgit instead of gitweb

From
Peter Eisentraut
Date:
On 19.09.25 03:12, Jonathan S. Katz wrote:
> * Moves any web links to git.postgresql.org repos to use the cgit 
> interface instead of gitweb (e.g. [1])
> * Update the commit search[2] to use cgit instead of gitweb

If we're doing that -- which seems reasonable -- then perhaps also 
update the forwarder for the links sent to pgsql-committers, like

https://git.postgresql.org/pg/commitdiff/ed1aad15e09d7d523f4ef413e3c4d410497c8065

This might be related to the second item, not sure.
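
As a rough sketch of what such a forwarder does, the short `/pg/commitdiff/<sha>` path can be matched and mapped to a cgit commit page. The cgit target layout (`/cgit/postgresql.git/commit/?id=<sha>`) and the names `SHORT_LINK` and `forward_target` are assumptions for illustration; the real forwarder is a pginfra configuration file whose contents are not shown in this thread.

```python
import re
from typing import Optional

# Short links as sent to pgsql-committers: /pg/commitdiff/<hex sha>
SHORT_LINK = re.compile(r"^/pg/commitdiff/(?P<sha>[0-9a-f]{7,40})$")

# Assumed cgit URL layout for a commit page (not confirmed by the thread).
CGIT_COMMIT = "https://git.postgresql.org/cgit/postgresql.git/commit/?id={sha}"


def forward_target(path: str) -> Optional[str]:
    """Return the redirect target for a short commitdiff path, or None."""
    match = SHORT_LINK.match(path)
    if match is None:
        return None
    return CGIT_COMMIT.format(sha=match.group("sha"))


if __name__ == "__main__":
    print(forward_target("/pg/commitdiff/ed1aad15e09d7d523f4ef413e3c4d410497c8065"))
```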




Re: Having postgresql.org link to cgit instead of gitweb

From
Peter Eisentraut
Date:
On 19.09.25 10:22, Álvaro Herrera wrote:
> There are two things we could do.  One is to simply restrict `git blame`
> to authenticated users; this shouldn't be _too_ bad.  But if we don't
> want that, we could put the bot checker javascript tricks in front of
> `blame`.  In fact maybe we could have the best of both worlds: you get
> the javascript check if you're not authenticated, but nothing if you
> are.  I'm not sure how easy it is to implement this though.

Or just disable git blame.  Who needs to run that through the website?



Re: Having postgresql.org link to cgit instead of gitweb

From
Daniel Gustafsson
Date:
> On 19 Sep 2025, at 13:05, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 19.09.25 10:22, Álvaro Herrera wrote:
>> There are two things we could do.  One is to simply restrict `git blame`
>> to authenticated users; this shouldn't be _too_ bad.  But if we don't
>> want that, we could put the bot checker javascript tricks in front of
>> `blame`.  In fact maybe we could have the best of both worlds: you get
>> the javascript check if you're not authenticated, but nothing if you
>> are.  I'm not sure how easy it is to implement this though.
>
> Or just disable git blame.  Who needs to run that through the website?

We could just link to the postgres mirror on GitHub for that.

--
Daniel Gustafsson




Re: Having postgresql.org link to cgit instead of gitweb

From
David Rowley
Date:
On Fri, 19 Sept 2025 at 23:05, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 19.09.25 10:22, Álvaro Herrera wrote:
> > There are two things we could do.  One is to simply restrict `git blame`
> > to authenticated users; this shouldn't be _too_ bad.  But if we don't
> > want that, we could put the bot checker javascript tricks in front of
> > `blame`.  In fact maybe we could have the best of both worlds: you get
> > the javascript check if you're not authenticated, but nothing if you
> > are.  I'm not sure how easy it is to implement this though.
>
> Or just disable git blame.  Who needs to run that through the website?

I'd vote for getting rid of the blame if it could buy us back enough
CPU cycles to have diff working again. I personally miss having
diff. I found it convenient when following links to see what's been
changed from the pgsql-committers list.

David



Re: Having postgresql.org link to cgit instead of gitweb

From
"Jonathan S. Katz"
Date:
On 9/19/25 7:42 AM, David Rowley wrote:
> On Fri, 19 Sept 2025 at 23:05, Peter Eisentraut <peter@eisentraut.org> wrote:
>>
>> On 19.09.25 10:22, Álvaro Herrera wrote:
>>> There are two things we could do.  One is to simply restrict `git blame`
>>> to authenticated users; this shouldn't be _too_ bad.  But if we don't
>>> want that, we could put the bot checker javascript tricks in front of
>>> `blame`.  In fact maybe we could have the best of both worlds: you get
>>> the javascript check if you're not authenticated, but nothing if you
>>> are.  I'm not sure how easy it is to implement this though.
>>
>> Or just disable git blame.  Who needs to run that through the website?
> 
> I'd vote for getting rid of the blame if it could buy us back enough
> CPU cycles to have diff working again. I personally miss having
> diff. I found it convenient when following links to see what's been
> changed from the pgsql-committers list.

With the disclaimer that I'm not the target audience for this work, I've 
previously used the "git blame" web feature on git.postgresql.org to 
figure some stuff out, but these days I just use the GitHub one as 
Daniel mentioned. I do think the absence of diff is less than ideal, and 
definitely something that I use fairly frequently even if I'm not 
hacking often.

For the website/patch itself (gitweb vs. cgit), again I'm not the target 
audience, so I'll defer to what you all want and particularly want to 
ensure your lives are easier. However, with the upcoming traffic spike 
with GA, I do want to ensure that our linked things are still working, 
which is what prompted the discussion.

Jonathan

Attachments

Re: Having postgresql.org link to cgit instead of gitweb

From
Peter Geoghegan
Date:
On Thu, Sep 18, 2025 at 9:12 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:
> While prepping the website for the PG18 GA, I stumbled on the inability
> to access parts of commits through the gitweb links, specifically
> hitting 429 status code errors (this seems to be intermittent). After
> some briefing on why it's disabled and how this isn't an issue with
> cgit, I prepped a patch for postgresql.org (the main website) that would
> update the git.postgresql.org reference to use cgit instead of gitweb.

cgit messes up indentation by showing 8 space tabs (not 4 space tabs)
-- that's certainly not ideal.

I understand that the same problem was fixed within gitweb by patching
the source code.

--
Peter Geoghegan



Re: Having postgresql.org link to cgit instead of gitweb

From
Álvaro Herrera
Date:
On 2025-Sep-19, Peter Eisentraut wrote:

> On 19.09.25 03:12, Jonathan S. Katz wrote:
> > * Moves any web links to git.postgresql.org repos to use the cgit
> > interface instead of gitweb (e.g. [1])
> > * Update the commit search[2] to use cgit instead of gitweb
> 
> If we're doing that -- which seems reasonable -- then perhaps also update
> the forwarder for the links sent to pgsql-committers, like
> 
> https://git.postgresql.org/pg/commitdiff/ed1aad15e09d7d523f4ef413e3c4d410497c8065
> 
> This might be related to the second item, not sure.

No, I think Jonathan wasn't thinking of these links when he mentioned
that second item.  I do have the /pg/commitdiff/ URLs in mind, but
that's a pginfra configuration file that needs to be changed.  I'll
see about changing that as well, because I've been bitten by this
problem there too.

BTW regarding Jon's second item, I was again reminded that we have
this "backend flowchart" page there,
https://www.postgresql.org/developer/backend/
I think this is a prime example of something that we could do much
better by adding one more item to our numerous collection of diagrams in
the docbook core docs.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



Re: Having postgresql.org link to cgit instead of gitweb

From
"Jonathan S. Katz"
Date:
On 9/19/25 10:47 AM, Álvaro Herrera wrote:
> On 2025-Sep-19, Peter Eisentraut wrote:
> 
>> On 19.09.25 03:12, Jonathan S. Katz wrote:
>>> * Moves any web links to git.postgresql.org repos to use the cgit
>>> interface instead of gitweb (e.g. [1])
>>> * Update the commit search[2] to use cgit instead of gitweb
>>
>> If we're doing that -- which seems reasonable -- then perhaps also update
>> the forwarder for the links sent to pgsql-committers, like
>>
>> https://git.postgresql.org/pg/commitdiff/ed1aad15e09d7d523f4ef413e3c4d410497c8065
>>
>> This might be related to the second item, not sure.
> 
> No, I think Jonathan wasn't thinking of these links when he mentioned
> that second item.

I can confirm that I wasn't thinking about them in the second item. I was 
thinking about them in general, though, but was unsure whether they needed 
to be in this discussion, as they aren't directly in the pgweb scope. But 
holistically, I guess they do.

>  I do have the /pg/commitdiff/ URLs in mind, but
> that's a pginfra configuration file that needs to be changed.  I'll
> see about changing that as well, because I've been bitten by this
> problem there too.
> 
> BTW regarding Jon's second item, I was again reminded that we have
> this "backend flowchart" page there,
> https://www.postgresql.org/developer/backend/
> I think this is a prime example of something that we could do much
> better by adding one more item to our numerous collection of diagrams in
> the docbook core docs.

And we support images now (and for a few releases)!

Jonathan

Attachments

Re: Having postgresql.org link to cgit instead of gitweb

From
Tom Lane
Date:
Peter Geoghegan <pg@bowt.ie> writes:
> cgit messes up indentation by showing 8 space tabs (not 4 space tabs)
> -- that's certainly not ideal.

To me that seems like a complete blocker for this proposal,
if we can't find a fix.

            regards, tom lane



Re: Having postgresql.org link to cgit instead of gitweb

From
"Jonathan S. Katz"
Date:
On 9/19/25 12:17 PM, Tom Lane wrote:
> Peter Geoghegan <pg@bowt.ie> writes:
>> cgit messes up indentation by showing 8 space tabs (not 4 space tabs)
>> -- that's certainly not ideal.
> 
> To me that seems like a complete blocker for this proposal,
> if we can't find a fix.

On a quick read, I believe this is easily settable in the cgit.css file 
by setting "tab-size" to "4". I did a quick test hacking this inline, 
and it worked.

Further, it appears we already attempt to do this in a "4space.css" file 
we serve, but it needs to be edited with the updated cgit HTML/CSS.
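
The actual fix is the CSS `tab-size` declaration described above. To see why the tab width matters for tab-indented source, here is a small sketch that expands the same lines at 8-column and 4-column tab stops; the sample lines and the helper name `render` are made up for illustration.

```python
def render(source: str, tab_width: int) -> str:
    """Expand tabs the way a viewer with the given tab stops would show them."""
    return source.expandtabs(tab_width)


# Illustrative tab-indented lines: a declaration aligned with a tab and a
# statement nested two levels deep.
SAMPLE = "int\tcount;\n\t\treturn count;"

if __name__ == "__main__":
    for width in (8, 4):
        print(f"--- tab stops every {width} columns ---")
        print(render(SAMPLE, width))
```

At 8 columns the nested statement starts at column 16 instead of column 8, so deeply nested code drifts to the right and alignment laid out for 4-column tabs no longer lines up.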

Thanks,

Jonathan

Attachments

Re: Having postgresql.org link to cgit instead of gitweb

From
Tom Lane
Date:
"Jonathan S. Katz" <jkatz@postgresql.org> writes:
> On a quick read, I believe this is easily settable in the cgit.css file 
> by setting "tab-size" to "4". I did a quick test hacking this inline, 
> and it worked.

Cool, thanks for looking into it.

            regards, tom lane



Re: Having postgresql.org link to cgit instead of gitweb

From
"Jonathan S. Katz"
Date:
On 9/19/25 4:14 PM, Tom Lane wrote:
> "Jonathan S. Katz" <jkatz@postgresql.org> writes:
>> On a quick read, I believe this is easily settable in the cgit.css file
>> by setting "tab-size" to "4". I did a quick test hacking this inline,
>> and it worked.
> 
> Cool, thanks for looking into it.

Tested inline, but untested as a whole (as I don't have access to 
gitweb, nor do I really want to have access), but this is effectively 
the modification, the second line of the CSS rule.

Jonathan

Attachments

Re: Having postgresql.org link to cgit instead of gitweb

From
"Jonathan S. Katz"
Date:
On 9/19/25 4:54 PM, Jonathan S. Katz wrote:
> On 9/19/25 4:14 PM, Tom Lane wrote:
>> "Jonathan S. Katz" <jkatz@postgresql.org> writes:
>>> On a quick read, I believe this is easily settable in the cgit.css file
>>> by setting "tab-size" to "4". I did a quick test hacking this inline,
>>> and it worked.
>>
>> Cool, thanks for looking into it.
> 
> Tested inline, but untested as a whole (as I don't have access to 
> gitweb, nor do I really want to have access), but this is effectively 
> the modification, the second line of the CSS rule.

If the main concern is the lack of diff, which cgit gives us back, and the 
main objection is addressed by the tab-size patch (in the previous email)[1], 
is there any objection to moving forward with updating the URLs after this 
patch is applied (which I can't do, as I don't have privileges on that server)?

If there are objections, I'm fine to wait until after the release to 
re-open discussion.

Jonathan

[1] 
https://www.postgresql.org/message-id/38cfb119-a150-4899-8879-73e3ace66a6a%40postgresql.org

Attachments

Re: Having postgresql.org link to cgit instead of gitweb

From
Tom Lane
Date:
"Jonathan S. Katz" <jkatz@postgresql.org> writes:
> If the main concern is lack of diff - which cgit gives us back, and the 
> main objection is the tab-size patch (in previous email)[1], is there 
> any objection to moving forward with updating the URLs after this patch 
> is applied (which I can't do, as I don't have privileges to that server)?

Not here.

> If there are objections, I'm fine to wait until after the release to 
> re-open discussion.

My first thought about scheduling was "best not in the middle of the
18.0 release cycle".  However, I don't know of any actual connection
between gitweb/cgit and the release-making tasks.  My second thought
was "the point here is to cut server load, and maybe we need that to
happen before the anticipated traffic spike on Thursday".  There
might not be any connection there either, but if there is, agreed
to get it done sooner not later.

            regards, tom lane



Re: Having postgresql.org link to cgit instead of gitweb

From
Álvaro Herrera
Date:
On 2025-Sep-22, Tom Lane wrote:

> My first thought about scheduling was "best not in the middle of the
> 18.0 release cycle".  However, I don't know of any actual connection
> between gitweb/cgit and the release-making tasks.  My second thought
> was "the point here is to cut server load, and maybe we need that to
> happen before the anticipated traffic spike on Thursday".  There
> might not be any connection there either, but if there is, agreed
> to get it done sooner not later.

I think the traffic overloads are mostly caused by LLM scrapers, which
as far as I know do not correlate with spikes caused by human behavior
or even those caused by mirroring traffic during a new release or such.

I would rather wait until next week, just in case something breaks.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"This is what I like so much about PostgreSQL.  Most of the surprises
are of the "oh wow!  That's cool" Not the "oh shit!" kind.  :)"
Scott Marlowe, http://archives.postgresql.org/pgsql-admin/2008-10/msg00152.php



Re: Having postgresql.org link to cgit instead of gitweb

From
"Jonathan S. Katz"
Date:
On 9/22/25 11:27 AM, Álvaro Herrera wrote:
> On 2025-Sep-22, Tom Lane wrote:
> 
>> My first thought about scheduling was "best not in the middle of the
>> 18.0 release cycle".  However, I don't know of any actual connection
>> between gitweb/cgit and the release-making tasks.  My second thought
>> was "the point here is to cut server load, and maybe we need that to
>> happen before the anticipated traffic spike on Thursday".  There
>> might not be any connection there either, but if there is, agreed
>> to get it done sooner not later.
> 
> I think the traffic overloads are mostly caused by LLM scrapers, which
> as far as I know do not correlate with spikes caused by human behavior
> or even those caused by mirroring traffic during a new release or such.
> 
> I would rather wait until next week, just in case something breaks.

I'm fine with this approach, for the above reasons. The web patch won't 
bit shift too much between now and then.

Jonathan

Attachments