Re: Replacement for Oracle Text
From | Josh Berkus |
---|---|
Subject | Re: Replacement for Oracle Text |
Date | |
Msg-id | 56C750C9.4000500@agliodbs.com |
In reply to | Replacement for Oracle Text (Daniel Westermann <daniel.westermann@dbi-services.com>) |
Responses | Re: Replacement for Oracle Text |
List | pgsql-general |
On 02/19/2016 05:49 AM, s d wrote:
> On 19 February 2016 at 14:19, Bruce Momjian <bruce@momjian.us> wrote:
>
> > I wonder if PLPerl could be used to extract the words from a PDF
> > document and create a tsvector column from it.
>
> I don't know about PLPerl (I'm pretty sure it could be used for this
> purpose, though). On the other hand I've written code for this in
> Python which should be easy to adapt for PLPython, if necessary.

I'd swear someone already built something to do this. All you need is a
library which reads PDF and transforms it into text, and then you can
FTS it.

I know there's a module for OpenOffice docs somewhere as well, but heck
if I can remember where.

--
Josh Berkus
Red Hat OSAS
(any opinions are my own)
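
The approach described in the message (read the PDF, turn it into text, then index it for full-text search) could look roughly like the sketch below. This is not code from the thread. It assumes Poppler's `pdftotext` command-line tool is installed, and the `docs` table, its columns, and the function names are hypothetical, chosen only for illustration. The cursor is any Python DB-API cursor that accepts `%s` placeholders, as psycopg2 does.

```python
import subprocess

# Hypothetical table, assumed for illustration only:
#   CREATE TABLE docs (id int PRIMARY KEY, body text, body_tsv tsvector);
INSERT_SQL = (
    "INSERT INTO docs (id, body, body_tsv) "
    "VALUES (%s, %s, to_tsvector('english', %s))"
)


def pdf_to_text(path):
    """Extract plain text from a PDF with Poppler's pdftotext CLI.

    Assumes pdftotext is installed and on PATH; "-" sends output to stdout.
    """
    result = subprocess.run(
        ["pdftotext", path, "-"],
        check=True,
        capture_output=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def clean_text(text):
    """Drop NUL bytes (PostgreSQL text cannot store them) and collapse
    runs of whitespace, including page-break form feeds, to single spaces."""
    return " ".join(text.replace("\x00", "").split())


def index_pdf(cursor, doc_id, path):
    """Store the extracted text and its tsvector through a DB-API cursor
    whose driver uses %s placeholders (psycopg2, for example)."""
    body = clean_text(pdf_to_text(path))
    cursor.execute(INSERT_SQL, (doc_id, body, body))
    return body
```

Once rows are loaded, a query such as `SELECT id FROM docs WHERE body_tsv @@ plainto_tsquery('english', 'oracle text');` would search them. The same extraction step could presumably be moved into a PL/Python function, as suggested in the quoted message, provided the server can run the external tool.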