Thread: is it possible to flush / commit / update underlying table during the COPY table FROM 'pipe'

is it possible to flush / commit / update underlying table during the COPY table FROM 'pipe'

From: Trent George
Date:
Hi,

I am experimenting with PostgreSQL 9.0 on Windows.
I am using CreateNamedPipe in Windows.

I am using the command below (with a corresponding custom Win32 console
app that opens the pipe and sends data):
copy table from e'\\\\.\\pipe\\testpipe' with delimiter e'\t';

Everything is working as expected and documented.
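
For illustration, a minimal sketch of the sending side in Python with
pywin32 (the actual app in the thread is a custom Win32 console program;
the pipe name testpipe and the tab-delimited two-column rows here are
assumptions, everything else follows the COPY command above):

    import win32pipe, win32file

    # Assumed pipe name; it must match the path used in COPY ... FROM.
    PIPE_NAME = r'\\.\pipe\testpipe'

    pipe = win32pipe.CreateNamedPipe(
        PIPE_NAME,
        win32pipe.PIPE_ACCESS_OUTBOUND,               # we only write, COPY only reads
        win32pipe.PIPE_TYPE_BYTE | win32pipe.PIPE_WAIT,
        1, 65536, 65536, 0, None)

    # Blocks until the other end (the COPY ... FROM command) opens the pipe.
    win32pipe.ConnectNamedPipe(pipe, None)

    # Each line is one tab-delimited row for COPY.
    win32file.WriteFile(pipe, b'1\tfirst row\n')
    win32file.WriteFile(pipe, b'2\tsecond row\n')
    win32file.FlushFileBuffers(pipe)

    # Only closing the pipe ends the COPY; that is when the rows become visible.
    win32file.CloseHandle(pipe)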

I am hoping to leave this connection (pipe) open for an extended
period of time, and send fresh data as needed (from a source
application)

My problem and question is that the data is not available in the
table until I end / close the pipe.
I test this using "select count(*) from table;" in another connection.
The moment the pipe is closed, all the data becomes available.
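
For illustration, a minimal way to watch this from a second session in
Python with psycopg2 (connection parameters and table name are
placeholders, not anything from the thread):

    import time
    import psycopg2

    # Placeholder DSN and table name; adjust to the real setup.
    conn = psycopg2.connect("dbname=test user=postgres host=localhost")
    conn.autocommit = True
    cur = conn.cursor()

    # While COPY ... FROM the pipe is still running, this count keeps
    # returning the old value; it only jumps once the pipe is closed
    # and the COPY's transaction commits.
    while True:
        cur.execute("SELECT count(*) FROM mytable")
        print(cur.fetchone()[0])
        time.sleep(1)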

I assume an implicit begin / commit is happening under the hood at
the start / end of the COPY FROM command.

Any amount of flushing, or any size of data (tested up to 50 MB), does not
make the data available in the table before the command finishes.
I also experimented with the end-of-data marker "\\.\n", but all this does
is break the pipe (you should not send data after it).

Would it be possible to allow some type of equivalent of the BATCHSIZE
option used in the MSSQL BULK INSERT command?
This would group and commit that number of records during the import,
while it is still running.
One could in theory use a value of 1, of 1000, or leave the default (the whole batch).

This would have multiple advantages:
1/ make data available during long imports (because of bandwidth, file
size, or other delays)
2/ not lose all data if you have a problem record
3/ not lose all data if the connection (pipe) dies
4/ allow 'open' connections so that continuous back-end import
programs can feed data as it becomes available
5/ as it is an optional, non-default setting, no existing program
would behave differently, but new ones could take advantage of it

Trent

On Mar 26, 2011, at 5:31 PM, Trent George wrote:

> Would it be possible to allow some type of equivalent of the BATCHSIZE
> option used in the MSSQL BULK INSERT command?
> This would group and commit that number of records during the import,
> while it is still running.
> One could in theory use a value of 1, of 1000, or leave the default (the whole batch).

There is nothing built-in for what you describe, but you could accomplish the desired result on your end, e.g. split
the import into batches of 1000 (or whatever) and issue individual COPY commands for each one. This could all be wrapped
into a sh/perl/whatever script to make it reusable, configurable, add error handling, etc.
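
A rough sketch of that idea in Python with psycopg2 rather than a shell
script (the table name, batch size, delimiter, and data source are
placeholders, not anything from the thread):

    import io
    import psycopg2

    BATCH_SIZE = 1000  # the client-side "BATCHSIZE" equivalent

    def copy_batches(conn, rows):
        """Group incoming tab-delimited lines and issue one COPY per batch,
        committing after each, so earlier batches stay visible (and survive)
        even if a later batch fails or the data source dies."""
        cur = conn.cursor()
        batch = []
        for line in rows:
            batch.append(line)
            if len(batch) >= BATCH_SIZE:
                cur.copy_expert("COPY mytable FROM STDIN WITH DELIMITER E'\\t'",
                                io.StringIO(''.join(batch)))
                conn.commit()
                batch = []
        if batch:  # final partial batch
            cur.copy_expert("COPY mytable FROM STDIN WITH DELIMITER E'\\t'",
                            io.StringIO(''.join(batch)))
            conn.commit()

    conn = psycopg2.connect("dbname=test user=postgres")  # placeholder DSN
    copy_batches(conn, open('incoming.tsv'))  # any iterable of tab-delimited lines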

Jon