Re: web archiving
From | Philip Hallstrom
Subject | Re: web archiving
Date |
Msg-id | 20020710152041.G672-100000@cypress.adhesivemedia.com
In reply to | web archiving (Matt Price <matt.price@utoronto.ca>)
Responses | Re: web archiving
List | pgsql-novice
Not to discourage you from using postgresql or writing it yourself, but you might want to take a look at wget (for downloading the web pages) and mnogosearch or htdig for searching them. mnogosearch supports postgresql and has a PHP interface so you can have fun with that...

On 10 Jul 2002, Matt Price wrote:

> Hi there,
>
> I've just moved up from non-free OSes to Debian Linux, and installed
> postgresql, with the hope of getting started on some projects I've been
> thinking about. Several of these projects involve web archives. The
> idea is, a url is entered with a bunch of bibliographic-type data in
> other fields (keywords, author, date, etc). The html (and hopefully,
> accompanying images/css/etc) are then grabbed using curl, and archived
> in a postgresql database. A web or other gui interface then provides
> fully-searchable access to the archive for later use.
>
> So my question: does anyone know of a similar tool which already
> exists? I'm a complete novice at database programming (and at php, too,
> which is what I figured I'd use as the scripting language, though I'd
> consider learning perl or java if folks think that's a much better
> idea), and I'd rather work with some pre-existing code than start from
> the ground up. Any suggestions? Is this the right list to be asking
> this question on?
>
> Thanks loads,
> Matt
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
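For readers following the thread, here is a minimal sketch of the curl-plus-PostgreSQL step Matt describes, assuming PHP's curl and pgsql extensions are installed. The table layout, column names, and connection string are placeholders made up for illustration, not anything from either message; a recursive fetcher such as wget would still be needed to pull in images and stylesheets.

<?php
// Sketch only: archive one URL plus bibliographic fields into PostgreSQL.
// Assumes a table created beforehand, e.g.:
//   CREATE TABLE archive (
//       id       serial PRIMARY KEY,
//       url      text NOT NULL,
//       author   text,
//       keywords text,
//       fetched  timestamp DEFAULT now(),
//       body     text
//   );

$url      = $argv[1];
$author   = isset($argv[2]) ? $argv[2] : '';
$keywords = isset($argv[3]) ? $argv[3] : '';

// Grab the page with curl.
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$html = curl_exec($ch);
curl_close($ch);

if ($html === false) {
    die("fetch failed: $url\n");
}

// Store the page and its bibliographic data in PostgreSQL.
$db = pg_connect("dbname=webarchive");   // adjust the connection string as needed
pg_query_params($db,
    'INSERT INTO archive (url, author, keywords, body) VALUES ($1, $2, $3, $4)',
    array($url, $author, $keywords, $html));
pg_close($db);
?>

Usage would be something like: php archive.php http://example.com/ "Some Author" "keyword1 keyword2". A search frontend (mnogosearch, htdig, or a hand-rolled PHP page) could then query the same table.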