Re: Importing text file into a TEXT field
From | Thomas Kellerer |
---|---|
Subject | Re: Importing text file into a TEXT field |
Date | |
Msg-id | gf9b50$pvm$1@ger.gmane.org |
In reply to | Re: Importing text file into a TEXT field (Bruno Lavoie <bruno.lavoie@gmail.com>) |
Responses | Re: Importing text file into a TEXT field |
List | pgsql-general |
Bruno Lavoie, 07.11.2008 19:20:
> Hello,
>
> The intent is to use pdftotext and store the resulting text in database
> for full text search purposes... I'm trying to develop a mini content
> server where I'll put pdf documents to make it searchable.
>
> Generally, PDFs are in size of 500 to 3000 pages resulting in text from
> 500kb to 2megabytes...
>
> I'm also looking at open source projects like Alfresco if it can serve
> with ease to my purpose... Anyone use this one? Comments are welcome.

If you are not bound to "native" Postgres tools, you might want to take a look at my SQL Workbench/J (http://www.sql-workbench.net).

It can insert the contents of files (located on the client) into tables. You can either do this using an extended SQL syntax:

    UPDATE pdf_table
       SET text_content = {$clobfile=c:/temp/convertet.txt encoding=utf8}
     WHERE id = 42;

(of course this statement cannot be run with psql)

You could also bulk-upload several files at once using my flat-file import (http://www.sql-workbench.net/manual/command-import.html).

Assuming the table has two columns (id, text_content), the flat file would look like this:

    id|text_content
    1|content_1.txt
    2|content_2.txt
    3|content_3.txt

and the import would store the content of the files, not the literal 'content_1.txt', in the column text_content.

You can either insert or update the content, depending on your needs. You could even store the original pdf file if the table contains a bytea column for the blob data.

Contact me offline (contact information on my homepage) if you need help.

Regards
Thomas
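
With many converted documents, the flat file described above does not have to be written by hand. The following is only a minimal sketch of one way to generate it, assuming the pdftotext output files all sit in one directory and end in .txt. The function name write_index_file and its parameters are hypothetical and not part of SQL Workbench/J, and whether the import resolves the listed names relative to the flat file's location should be checked against the manual.

```python
from pathlib import Path


def write_index_file(text_dir, index_path, start_id=1, delimiter="|"):
    """Write an 'id|text_content' flat file listing every .txt file in text_dir.

    Hypothetical helper: ids are assigned sequentially in file-name order,
    starting at start_id. Returns the number of data rows written.
    """
    index_path = Path(index_path)
    # Skip the index file itself in case it lives in the same directory.
    text_files = sorted(
        p for p in Path(text_dir).glob("*.txt")
        if p.resolve() != index_path.resolve()
    )
    with index_path.open("w", encoding="utf-8", newline="\n") as out:
        # Header row: column names matching the target table.
        out.write(f"id{delimiter}text_content\n")
        for row_id, text_file in enumerate(text_files, start=start_id):
            # Only the file name is written; the import tool reads the content.
            out.write(f"{row_id}{delimiter}{text_file.name}\n")
    return len(text_files)
```

For example, write_index_file("c:/temp/converted", "c:/temp/converted/index.txt") would list every .txt file in that (assumed) directory, one row per file. The ids produced this way are arbitrary; if the rows already exist in the table and are to be updated, the ids would have to come from the table instead.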