Re: tsearch2 and pdf files
From | Henrik Zagerholm |
---|---|
Subject | Re: tsearch2 and pdf files |
Date | |
Msg-id | 179575E2-2F49-427F-9961-CEE966187950@mac.se |
In reply to | Re: tsearch2 and pdf files ("Philip Johnson" <philip.johnson@atempo.com>) |
Responses | Re: tsearch2 and pdf files |
List | pgsql-general |
1. Convert the PDF to a text file with e.g. xpdf.
2. Insert the parsed text into a table of your choice.
3. Make vectors from the text.

Cheers,

On 11 Dec 2006, at 18:23, Philip Johnson wrote:

> Do you know what kind of table I should use?
> Is there a shell script or a PHP script that does the work?
>
> regards
>
>> -----Original Message-----
>> From: pgsql-general-owner@postgresql.org
>> [mailto:pgsql-general-owner@postgresql.org] On behalf of Hannes Dorbath
>> Sent: Monday, 11 December 2006 12:21
>> To: pgsql-general@postgresql.org
>> Subject: Re: [GENERAL] tsearch2 and pdf files
>>
>> You just need software that extracts the text from it. Search Google
>> for pdf2txt and others. Printer drivers that try to get text from
>> anything are available as well.
>>
>>
>> On 11.12.2006 11:41, Philip Johnson wrote:
>>> I'm using PostgreSQL 8.1.5
>>>
>>> Tsearch2 is installed and runs well
>>>
>>> I'd like to use tsearch2 to index PDF files.
>>>
>>> Does someone have a detailed process to implement that?
>>
>>
>> --
>> Regards,
>> Hannes Dorbath
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 5: don't forget to increase your free space map settings
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org/
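The three steps at the top of this message could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a recipe from the thread. It assumes the `pdftotext` command-line tool that ships with xpdf is on the PATH, that the database cursor follows the Python DB-API with `%s` placeholders (as psycopg2 does), and that tsearch2 has a configuration named 'default'. The table `pdf_documents`, its columns, and all function names are hypothetical.

```python
import subprocess

# Hypothetical table: one row per PDF, the raw text plus its tsvector.
CREATE_TABLE_SQL = """
CREATE TABLE pdf_documents (
    id          serial PRIMARY KEY,
    filename    text NOT NULL,
    body        text,
    body_vector tsvector
)
"""

# GiST index so that @@ searches do not scan every row.
CREATE_INDEX_SQL = (
    "CREATE INDEX pdf_documents_vector_idx "
    "ON pdf_documents USING gist (body_vector)"
)

# Steps 2 and 3 in one statement: store the text and build the vector.
# 'default' is assumed to be the tsearch2 configuration in use.
INSERT_SQL = (
    "INSERT INTO pdf_documents (filename, body, body_vector) "
    "VALUES (%s, %s, to_tsvector('default', %s))"
)

SEARCH_SQL = (
    "SELECT filename FROM pdf_documents "
    "WHERE body_vector @@ to_tsquery('default', %s)"
)


def pdftotext_command(pdf_path):
    """Build the pdftotext call; '-' sends the extracted text to stdout."""
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]


def extract_text(pdf_path):
    """Step 1: convert the PDF to plain text with pdftotext."""
    result = subprocess.run(
        pdftotext_command(pdf_path), check=True, capture_output=True
    )
    text = result.stdout.decode("utf-8", errors="replace")
    # PostgreSQL text values cannot contain NUL bytes.
    return text.replace("\x00", "")


def index_pdf(cursor, pdf_path, extract=extract_text):
    """Steps 2 and 3: insert the text and its tsvector for one PDF."""
    text = extract(pdf_path)
    cursor.execute(INSERT_SQL, (pdf_path, text, text))


def search(cursor, query):
    """Return the filenames whose text matches a tsquery string."""
    cursor.execute(SEARCH_SQL, (query,))
    return [row[0] for row in cursor.fetchall()]
```

To use it, one would run `CREATE_TABLE_SQL` and `CREATE_INDEX_SQL` once, call `index_pdf(cursor, path)` for each file, and commit. Passing the extractor as a parameter keeps the database step independent of the conversion tool, so another converter can be swapped in.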