Discussion: Mirror.php performance
Well, mirror.php's performance is *far* better than it was, though there is clearly still room for improvement. However, something is not right - there are over 7000 docs pages if counting static and interactive:

Nov 04 08:53:38 mirror [info] Mirroring started
Nov 04 08:57:09 mirror [error] HTTP error 404 at page http://wwwdevel.postgresql.org/images/editorschoice2003.jpg
Nov 04 09:01:46 mirror [error] HTTP error 404 at page http://wwwdevel.postgresql.org/presskit/en/presskit74.html
Nov 04 09:02:47 mirror [error] HTTP error 404 at page http://wwwdevel.postgresql.org/pgsql-bugs@postgresql.org
Nov 04 09:16:04 mirror [info] Mirroring finished. 1027 page(s) saved, 1346 second(s) spent

It appears to have saved everything in the root directory afaict, and the 7.4 static docs, but nothing else.

Any ideas?

Regards, Dave.
Hi,

Dave Page wrote:
> Nov 04 09:16:04 mirror [info] Mirroring finished. 1027 page(s) saved,
> 1346 second(s) spent
>
> It appears to have saved everything in the root directory afaict, and
> the 7.4 static docs, but nothing else.
>
> Any ideas?

Ouch. It did the same for me, will look into this: seems as if some links are dropped / not followed.
Hi,

Alexey Borzov wrote:
>> Nov 04 09:16:04 mirror [info] Mirroring finished. 1027 page(s) saved,
>> 1346 second(s) spent
>>
>> It appears to have saved everything in the root directory afaict, and
>> the 7.4 static docs, but nothing else.
>>
>> Any ideas?
>
> Ouch. It did the same for me, will look into this: seems as if some
> links are dropped / not followed.

Fixed. Turned out the regexes to extract links from pages were broken and some of the links (including the main menu, unfortunately) were thus not crawled.
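For readers following the thread, here is a minimal sketch of how an overly strict link regex can silently skip menu links, which is the kind of failure described above. It is written in Python for illustration and is not the actual mirror.php code; the pattern `BRITTLE_LINK_RE`, the helper names, and the sample menu markup are all hypothetical. Only the base URL comes from the log in this thread.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Base URL taken from the log in this thread; everything else is illustrative.
BASE_URL = "http://wwwdevel.postgresql.org/"

# Hypothetical brittle pattern: matches only lowercase <a href="..."> where
# href is the first attribute and the value is double-quoted.
BRITTLE_LINK_RE = re.compile(r'<a href="([^"]+)">')


def extract_links_brittle(html, base_url):
    """Return absolute URLs found by the strict regex (misses many links)."""
    return [urljoin(base_url, href) for href in BRITTLE_LINK_RE.findall(html)]


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags regardless of case or quoting."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return absolute URLs for every <a href> found by the HTML parser."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up menu markup with three common variations of the same tag.
SAMPLE_PAGE = """
<ul id="menu">
  <li><a class="nav" href='/docs/'>Docs</a></li>
  <li><A HREF="/download/">Download</A></li>
  <li><a href="/about/">About</a></li>
</ul>
"""

if __name__ == "__main__":
    print("brittle:", extract_links_brittle(SAMPLE_PAGE, BASE_URL))
    print("parser: ", extract_links(SAMPLE_PAGE, BASE_URL))
```

With this sample, the strict regex finds only the last menu entry, while the parser-based extractor finds all three. A crawler using the strict pattern would still report a clean finish, just with far fewer pages saved than expected.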
> -----Original Message-----
> From: Alexey Borzov [mailto:borz_off@cs.msu.su]
> Sent: 04 November 2004 13:03
> To: Dave Page
> Cc: pgsql-www@postgresql.org
> Subject: Re: [pgsql-www] Mirror.php performance
>
> Hi,
>
> Alexey Borzov wrote:
> >> Nov 04 09:16:04 mirror [info] Mirroring finished. 1027 page(s) saved,
> >> 1346 second(s) spent
> >>
> >> It appears to have saved everything in the root directory afaict, and
> >> the 7.4 static docs, but nothing else.
> >>
> >> Any ideas?
> >
> > Ouch. It did the same for me, will look into this: seems as if some
> > links are dropped / not followed.
>
> Fixed. Turned out the regexes to extract links from pages
> were broken and some of the links (including the main menu,
> unfortunately) were thus not crawled.

Thanks, I'll give it a try.

Regards, Dave.