Re: web archiving
From | Philip Hallstrom
Subject | Re: web archiving
Date |
Msg-id | 20020710152041.G672-100000@cypress.adhesivemedia.com
In reply to | web archiving (Matt Price <matt.price@utoronto.ca>)
Responses | Re: web archiving
List | pgsql-novice
Not to discourage you from using postgresql or writing it yourself, but you might want to take a look at wget (for downloading the web pages) and mnogosearch or htdig for searching them. mnogosearch supports postgresql and has a PHP interface so you can have fun with that...

On 10 Jul 2002, Matt Price wrote:

> Hi there,
>
> I've just moved up from non-free OSes to Debian Linux, and installed
> postgresql, with the hope of getting started on some projects I've been
> thinking about. Several of these projects involve web archives. The
> idea is, a url is entered with a bunch of bibliographic-type data in
> other fields (keywords, author, date, etc). The html (and hopefully,
> accompanying images/css/etc) are then grabbed using curl, and archived
> in a postgresql database. A web or other gui interface then provides
> fully-searchable access to the archive for later use.
>
> So my question: does anyone know of a similar tool which already
> exists? I'm a complete novice at database programming (and at php, too,
> which is what I figured I'd use as the scripting language, though I'd
> consider learning perl or java if folks think that's a much better
> idea), and I'd rather work with some pre-existing code than start from
> the ground up. Any suggestions? Is this the right list to be asking
> this question on?
>
> Thanks loads,
> Matt
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
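For readers following the thread, here is a minimal sketch of the curl-plus-PostgreSQL step Matt describes, assuming PHP's curl and pgsql extensions are installed. The table layout, column names, and connection string are placeholders made up for illustration, not anything from either message; a recursive fetcher such as wget would still be needed to pull in images and stylesheets.

<?php
// Sketch only: archive one URL plus bibliographic fields into PostgreSQL.
// Assumes a table created beforehand, e.g.:
//   CREATE TABLE archive (
//       id       serial PRIMARY KEY,
//       url      text NOT NULL,
//       author   text,
//       keywords text,
//       fetched  timestamp DEFAULT now(),
//       body     text
//   );

$url      = $argv[1];
$author   = isset($argv[2]) ? $argv[2] : '';
$keywords = isset($argv[3]) ? $argv[3] : '';

// Grab the page with curl.
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$html = curl_exec($ch);
curl_close($ch);

if ($html === false) {
    die("fetch failed: $url\n");
}

// Store the page and its bibliographic data in PostgreSQL.
$db = pg_connect("dbname=webarchive");   // adjust the connection string as needed
pg_query_params($db,
    'INSERT INTO archive (url, author, keywords, body) VALUES ($1, $2, $3, $4)',
    array($url, $author, $keywords, $html));
pg_close($db);
?>

Usage would be something like: php archive.php http://example.com/ "Some Author" "keyword1 keyword2". A search frontend (mnogosearch, htdig, or a hand-rolled PHP page) could then query the same table.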