Re: mailing list archiver chewing patches
From | Matteo Beccati |
---|---|
Subject | Re: mailing list archiver chewing patches |
Date | |
Msg-id | 4B4CD451.200@beccati.com |
In reply to | Re: mailing list archiver chewing patches (Magnus Hagander <magnus@hagander.net>) |
List | pgsql-hackers |
On 12/01/2010 19:54, Magnus Hagander wrote:
> On Tue, Jan 12, 2010 at 18:34, Dave Page<dpage@pgadmin.org> wrote:
>> On Tue, Jan 12, 2010 at 10:24 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
>>> "Joshua D. Drake"<jd@commandprompt.com> writes:
>>>> On Tue, 2010-01-12 at 10:24 +0530, Dave Page wrote:
>>>>> So just to put this into perspective and give anyone paying attention
>>>>> an idea of the pain that lies ahead should they decide to work on
>>>>> this:
>>>>>
>>>>> - We need to import the old archives (of which there are hundreds of
>>>>> thousands of messages, the first few years of which have, umm, minimal
>>>>> headers.
>>>>> - We need to generate thread indexes
>>>>> - We need to re-generate the original URLs for backwards compatibility
>>>>>
>>>>> Now there's encouragement :-)
>>>
>>>> Or, we just leave the current infrastructure in place and use a new one
>>>> for all new messages going forward. We shouldn't limit our ability to
>>>> have a decent system due to decisions of the past.
>>>
>>> -1. What's the point of having archives? IMO the mailing list archives
>>> are nearly as critical a piece of the project infrastructure as the CVS
>>> repository. We've already established that moving to a new SCM that
>>> fails to preserve the CVS history wouldn't be acceptable. I hardly
>>> think that the bar is any lower for mailing list archives.
>>>
>>> Now I think we could possibly skip the requirement suggested above for
>>> URL compatibility, if we just leave the old archives on-line so that
>>> those URLs all still resolve. But if we can't load all the old messages
>>> into the new infrastructure, it'll basically be useless for searching
>>> purposes.
>>>
>>> (Hmm, re-reading what you said, maybe we are suggesting the same thing,
>>> but it's not clear. Anyway my point is that Dave's first two
>>> requirements are real. Only the third might not be.)
>>
>> The third actually isn't actually that hard to do in theory. The
>> message numbers are basically the zero-based position in the mbox
>> file, and the rest of the URL is obvious.
>
> The third part is trivial. The search system already does 95% of it.
> I've already implemented exactly that kind of redirect thing on top of
> the search code once just as a poc, and it was less than 30 minutes of
> hacking. Can't seem to find the script ATM though, but you get the
> idea.
>
> Let's not focus on that part, we can easily solve that.

Agreed. That's the part that worries me less.

Cheers
--
Matteo Beccati

Development & Consulting - http://www.beccati.com/
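
As an illustration of Dave's remark that the old message numbers are basically the zero-based position in the mbox file, the mapping could be sketched roughly as follows. This is only a sketch and not the archive's actual code: the function names `index_by_position` and `legacy_url`, the five-digit zero padding, and the `/list/YYYY-MM/msgNNNNN.php` path layout are all assumptions made for the example.

```python
import mailbox


def index_by_position(mbox_path):
    """Map each message's zero-based position in an mbox file to its Message-ID.

    Hypothetical helper: a redirect layer could use this table to resolve an
    old numbered URL to the message it used to point at.
    """
    box = mailbox.mbox(mbox_path, create=False)
    try:
        # mailbox.mbox yields messages in the order they appear in the file,
        # so enumerate() gives the zero-based position.
        return {pos: msg["Message-ID"] for pos, msg in enumerate(box)}
    finally:
        box.close()


def legacy_url(list_name, year_month, position):
    """Build an old-style archive path from a position (layout is assumed)."""
    return f"/{list_name}/{year_month}/msg{position:05d}.php"
```

Messages with minimal headers, as in the oldest archives, may have no Message-ID at all. In that case the table holds `None` for that position and some other key would be needed.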