Discussion: BUG #18775: PQgetCopyData always has an out-of-memory error if the table field stores bytea ~700 MB
BUG #18775: PQgetCopyData always has an out-of-memory error if the table field stores bytea ~700 MB
From:
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18775
Logged by:          Ilya
Email address:      knuazev@gmail.com
PostgreSQL version: 16.0
Operating system:   Windows 10
Description:

Hello, I need your help. The essence of the problem is that I am trying to download a bytea field from a table row. The size of the data in the field is about 700 MB. In response, I receive an out-of-memory error.

More details:
1) I execute the query COPY (SELECT %s FROM %s.%s WHERE %s=%s) TO STDOUT BINARY.
2) I call the PQgetCopyData function. This function works with the pg_conn.inBuffer buffer. I did not go into detail on how memory is allocated for it, but in my case about 300 MB is allocated, and the data from the database is successfully accumulated in this buffer. When the buffer is almost full, the function int pqGetCopyData3(PGconn *conn, char **buffer, int async) tries to allocate a buffer almost the same size as pg_conn.inBuffer in order to hand it to the user: *buffer = (char *) malloc(msgLength + 1); (fe-protocol3.c, ln. 1767). In my case, malloc consistently fails to do this: it returns NULL and the library reports an "out of memory" error.

P.S. It seems to me that the only way to make PQgetCopyData work correctly without changing the library interface is to somehow reduce the size of the pg_conn.inBuffer buffer to small values. Ideally, it would be good to add a PQgetCopyData2 function that accepts a caller-supplied buffer, fills it with data, and returns the number of bytes written. I also tried to work around this problem using the deprecated PQgetline function, but it does not work with binary data, and if I issue a COPY (SELECT %s FROM %s.%s WHERE %s=%s) TO STDOUT query instead, an error occurs when trying to execute it, perhaps because the library tries to read everything at once into pg_conn.inBuffer, which is too small for that. And yes, I know that for storing large data you have the "large object facility", but my approach should also work for files up to 1 GB.
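For reference, a minimal sketch of the calling pattern described above (not the reporter's actual code; table, column, and key names are placeholders):

    /* Sketch: fetch one large bytea via COPY ... TO STDOUT BINARY.
     * "public.mytable", "payload" and "id" are placeholder names. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <libpq-fe.h>

    static int fetch_bytea_via_copy(PGconn *conn)
    {
        PGresult *res = PQexec(conn,
            "COPY (SELECT payload FROM public.mytable WHERE id = 1) TO STDOUT BINARY");
        if (PQresultStatus(res) != PGRES_COPY_OUT)
        {
            fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
            PQclear(res);
            return -1;
        }
        PQclear(res);

        char *buf;
        int   len;

        /* PQgetCopyData returns one whole data row per call, so for a single
         * ~700 MB bytea value libpq has to malloc a buffer of roughly that
         * size here; per the report above, that is the allocation that fails. */
        while ((len = PQgetCopyData(conn, &buf, 0)) > 0)
        {
            /* ... consume len bytes of the COPY BINARY stream ... */
            PQfreemem(buf);
        }
        if (len == -2)
            fprintf(stderr, "PQgetCopyData failed: %s", PQerrorMessage(conn));

        PQclear(PQgetResult(conn));   /* collect the final command status */
        return 0;
    }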
PG Bug reporting form <noreply@postgresql.org> writes:
> Hello, I need your help. The essence of the problem is that I am trying to
> download a bytea field from a table row. The size of the data in the field
> is about 700 MB. In response, I receive an out-of-memory error.

I don't see this as a Postgres bug. If you want to work with values that large, you'd better have plenty of memory available.

			regards, tom lane
Thanks for the answer. I agree that if you want to work with large data, you should make sure you have enough memory: if you call PQexec with a regular SELECT, then call PQgetvalue to get the value and you get an error, that is your own fault.
But I know that there may not be enough memory, so I use the "copy" keyword in the query and the PQgetCopyData function. I thought this function was designed for chunked processing, by analogy with the PQputCopyData function, which works fine.
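(For comparison, the upload direction really does accept arbitrary-sized chunks. A sketch, assuming "public.mytable" is a placeholder and f is an open FILE * whose contents form a valid COPY data stream; PQputCopyData itself does not care where the chunk boundaries fall:)

    PGresult *res = PQexec(conn, "COPY public.mytable (payload) FROM STDIN");
    if (PQresultStatus(res) == PGRES_COPY_IN)
    {
        char   chunk[64 * 1024];
        size_t n;

        while ((n = fread(chunk, 1, sizeof chunk, f)) > 0)
            if (PQputCopyData(conn, chunk, (int) n) != 1)
                break;                      /* check PQerrorMessage(conn) */

        PQputCopyEnd(conn, NULL);           /* NULL: finish the COPY normally */
        PQclear(PQgetResult(conn));         /* final command status */
    }
    PQclear(res);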
I found a workaround: I run "SELECT substring(%s from %d for %d) AS chunk FROM %s.%s WHERE %s = %s" with PQunescapeBytea in a loop. But it looks odd and takes longer.
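A rough sketch of that workaround (chunk size and all names are placeholders; the value comes back hex-escaped in text format, hence PQunescapeBytea):

    /* Sketch: fetch the bytea in fixed-size pieces via substring().
     * "payload", "public.mytable" and "id" are placeholder names. */
    #define CHUNK (8 * 1024 * 1024)             /* 8 MB per round trip */

    long offset = 1;                            /* substring() is 1-based */
    for (;;)
    {
        char sql[256];
        snprintf(sql, sizeof sql,
                 "SELECT substring(payload from %ld for %d) "
                 "FROM public.mytable WHERE id = 1",
                 offset, CHUNK);

        PGresult *res = PQexec(conn, sql);
        if (PQresultStatus(res) != PGRES_TUPLES_OK || PQntuples(res) == 0)
        {
            PQclear(res);
            break;
        }

        size_t         outlen;
        unsigned char *data = PQunescapeBytea(
            (const unsigned char *) PQgetvalue(res, 0, 0), &outlen);
        PQclear(res);

        if (data == NULL)
            break;                              /* out of memory in the client */
        if (outlen == 0)                        /* past the end of the value */
        {
            PQfreemem(data);
            break;
        }

        /* ... append outlen bytes to the output ... */

        PQfreemem(data);
        offset += (long) outlen;
        if (outlen < CHUNK)                     /* last, short chunk */
            break;
    }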
P.S.
To clarify: my file is 300+ MB, not 700+ MB.
Thu, Jan 16, 2025 at 19:03, Tom Lane <tgl@sss.pgh.pa.us>:
> PG Bug reporting form <noreply@postgresql.org> writes:
>> Hello, I need your help. The essence of the problem is that I am trying to
>> download a bytea field from a table row. The size of the data in the field
>> is about 700 MB. In response, I receive an out-of-memory error.
>
> I don't see this as a Postgres bug. If you want to work with values
> that large, you'd better have plenty of memory available.
>
>			regards, tom lane
Best regards,
I. N. Knyazev

Ilya Knyazev <knuazev@gmail.com> writes:
> But I know that there may not be enough memory, so I use the "copy" keyword
> in the query and the PQgetCopyData function. I thought this function was
> designed for chunked processing, by analogy with the PQputCopyData function,
> which works fine.

Its documentation is fairly clear, I thought:

    Attempts to obtain another row of data from the server during a
    COPY. Data is always returned one data row at a time; if only a
    partial row is available, it is not returned.

If you need to work with data values that are large enough to risk memory problems, I think "large objects" are the best answer. Their interface is a bit clunky, but it's at least designed to let you both read and write by chunks.

			regards, tom lane
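For completeness, the chunked read path with large objects looks roughly like this (a sketch, assuming the data has already been stored as a large object and its OID is known):

    /* Sketch: read an existing large object in chunks.  "loid" is assumed to
     * be the OID of a large object created earlier (e.g. with lo_creat or
     * lo_import). */
    #include <stdio.h>
    #include <libpq-fe.h>
    #include <libpq/libpq-fs.h>        /* INV_READ */

    static int read_large_object(PGconn *conn, Oid loid)
    {
        char buf[256 * 1024];
        int  fd, n;

        /* Large-object descriptors are only valid inside a transaction. */
        PQclear(PQexec(conn, "BEGIN"));

        fd = lo_open(conn, loid, INV_READ);
        if (fd < 0)
        {
            fprintf(stderr, "lo_open failed: %s", PQerrorMessage(conn));
            PQclear(PQexec(conn, "ROLLBACK"));
            return -1;
        }

        while ((n = lo_read(conn, fd, buf, sizeof buf)) > 0)
        {
            /* ... consume n bytes ... */
        }

        lo_close(conn, fd);
        PQclear(PQexec(conn, "COMMIT"));
        return (n < 0) ? -1 : 0;
    }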