Обсуждение: PQgetvalue failed to return column value for non-text data in binary format

Поиск
Список
Период
Сортировка

PQgetvalue failed to return column value for non-text data in binary format

От
Jeff Lynn
Дата:
Hi,

I am new to PostgreSQL and tried using the libpq C library.  I can read 
a test table using PQgetvalue for text format, i.e. format = 0.  
However, when I experiment with using binary output format, I was unable 
to read anything else except result set fiend that is in char(x), 
varchar(x), i.e., character string data.

What have I done wrong?
  int *     pId; LPSTR pszName; double * pSalary;
  PGresult *pResult = PQexecParams(m_pConn, (LPCTSTR)strSelect, nParams,                 parmTypes, paramValues,
parmLengths,parmFormats, 1);
 
   pId = (int *)PQgetvalue(pResult, row, 0);   pszName = (LPSTR)PQgetvalue(pResult, row, 1);   pSalary = (double
*)PQgetvalue(pResult,row, 6);
 
   printf("id: %d name: %s salary: %f", *pId, pszName, *pSalary);     
What I got is as follow:

+ + + printing resultset, 2 rows, 8 columns
id: 167772160 name: Amy Who salary: 0.000000
id: 16777216 name: David Smith salary: 0.000000

Which supposed to return:
id: 10 name: Amy Who salary: 150000.00
id: 1  name: David Smith salary: 350000.00

Any help is greatly appreciated!

Jeff




Re: PQgetvalue failed to return column value for non-text data in binary format

От
Andrew McNamara
Дата:
>I am new to PostgreSQL and tried using the libpq C library.  I can read 
>a test table using PQgetvalue for text format, i.e. format = 0.  
>However, when I experiment with using binary output format, I was unable 
>to read anything else except result set fiend that is in char(x), 
>varchar(x), i.e., character string data.
>
>What have I done wrong?

The data returned in binary mode is in a (largely undocumented)
PostgreSQL form. Types like int4, and float8 are returned in network
byte order [1], while other types such as timestamps are in more subtle
formats. The only canonical reference to these formats is the PostgreSQL
source. 

In general, you should just use the text format, as it's more portable
and less likely to change over time, although the specifics of the text
format are also largely undocumented.

[1] http://en.wikipedia.org/wiki/Endianness

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/


Re: PQgetvalue failed to return column value for non-text data in binary format

От
Jeff Lynn
Дата:
Thanks you very much for your help and clarification.  As programmer in 
heart, one tends to go straight the specific api.  I guess once one 
become so comfortable with a particular open source software, then one 
can go to the source to understand what is happening there.  But in my 
case, I guess I am far from that level of sophistication in 
understanding PostgreSQL.

This is interesting position to be taken by PostgreSQL.  For simple 
query, particularly with regards to reporting, one can go about using 
text for all column.  But a lot of time, I deal with large dataset 
ingestions and extracts that tends to take many hours to perform.  So I 
usually prefer binary form data for performance reason.  To that end, I 
found there is a significant different in the time it takes to ingest or 
extract data from one database to another database of totally different 
vendor.  Most of our company's works are done in Java nowadays, but for 
large data ingestions/extract/conversion, a C program out performs Java 
program fair and square, no debate there (based on actual run-time 
data).  For clarification, we deal with various business partners that 
we have to import/export large amount of data between heterogeneous 
databases that the wonderful technology of database replication does not 
apply for our use.

For this kind of operations, I even found the performance gain by 
creating C program out-weight a Java program for that same purpose.

I can get by using the socket function, ntohl() and ntohs() for smallint 
and int4 for the conversion.  But there is no such thing for the float4 
and float8 as well as the date, timestamp you mentioned.  I wish there 
is such function or macro available for the C interface. 

I do not agree with the "ease of portability" issue, because if that is 
the case, one can always give up performance and use Java for that 
purpose.  Beside, it should not be that much more difficult to provide 
win32 exe for PostgreSQL with PQgetvalue() that return int2, int4, 
float4, float8, date, datetime/timestamp, etc.

Jeff,

Andrew McNamara wrote:
> The data returned in binary mode is in a (largely undocumented)
> PostgreSQL form. Types like int4, and float8 are returned in network
> byte order [1], while other types such as timestamps are in more subtle
> formats. The only canonical reference to these formats is the PostgreSQL
> source. 
>
> In general, you should just use the text format, as it's more portable
> and less likely to change over time, although the specifics of the text
> format are also largely undocumented.
>
> [1] http://en.wikipedia.org/wiki/Endianness
>
>   




Re: PQgetvalue failed to return column value for non-text data in binary format

От
Andrew McNamara
Дата:
>This is interesting position to be taken by PostgreSQL.  For simple 
>query, particularly with regards to reporting, one can go about using 
>text for all column.  But a lot of time, I deal with large dataset 
>ingestions and extracts that tends to take many hours to perform.  So I 
>usually prefer binary form data for performance reason.

The performance benefits of "binary" over "text" formats are often less
than you might think. In the real world, on-the-wire binary formats
often need parsing, just like text formats, and if they don't match your
internal binary format, still more conversion is required.

>Most of our company's works are done in Java nowadays, but for 
>large data ingestions/extract/conversion, a C program out performs Java 
>program fair and square, no debate there (based on actual run-time 
>data).  For clarification, we deal with various business partners that 
>we have to import/export large amount of data between heterogeneous 
>databases that the wonderful technology of database replication does not 
>apply for our use.

I regularly import and export millions of rows between python and
PostgreSQL - my experience is that the TCP connection to the database,
and the database itself (disk I/O) is often the limiting factor, rather
than my choice of language. Where python is the limiting factor, it's
typically the instantiation of objects representing row values that
dominates, and that would be true whether a text or binary format was
used. The remaining cases are where the processing is non-trivial, and
in those cases, I would much rather be working in a high level language
where bugs are easier to identify. It's rare indeed that I resort to
coding it in C.

>I can get by using the socket function, ntohl() and ntohs() for smallint 
>and int4 for the conversion.  But there is no such thing for the float4 
>and float8 as well as the date, timestamp you mentioned.  I wish there 
>is such function or macro available for the C interface. 

So do I. Serialising floats and doubles is a common enough activity,
and it defies belief that the standard C libraries don't provide a way
to do it.  Having every application that wants to pass floats over the
net implement it's own scheme is just asking for bugs.

I would suggest looking at the PostgreSQL source for one example of how
to do floats. Timestamps are actually composite types made up of other
primitive binary types. The only gotcha with the time types is that they
can use ints or floats, depending on how the server was built. When
using protocol 3 with a version 8 server, there is a parameter that
informs you which was chosen.

>I do not agree with the "ease of portability" issue, because if that is 
>the case, one can always give up performance and use Java for that 
>purpose.  Beside, it should not be that much more difficult to provide 
>win32 exe for PostgreSQL with PQgetvalue() that return int2, int4, 
>float4, float8, date, datetime/timestamp, etc.

Note that you're preaching to the choir - I'm just a PG user, like you,
and I find the "binary" situation less than optimal. I wouldn't have
used it if I could work out how to pass arrays as text parameters (but
now that I know how the binary types work, I suspect I could solve the
text array problem by looking at the PG source).

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/


Re: PQgetvalue failed to return column value for non-text data in binary format

От
"Jeroen T. Vermeulen"
Дата:
On Tue, May 22, 2007 04:46, Jeff Lynn wrote:

> This is interesting position to be taken by PostgreSQL.  For simple
> query, particularly with regards to reporting, one can go about using
> text for all column.  But a lot of time, I deal with large dataset
> ingestions and extracts that tends to take many hours to perform.  So I
> usually prefer binary form data for performance reason.  To that end, I
> found there is a significant different in the time it takes to ingest or
> extract data from one database to another database of totally different
> vendor.  Most of our company's works are done in Java nowadays, but for
> large data ingestions/extract/conversion, a C program out performs Java
> program fair and square, no debate there (based on actual run-time
> data).  For clarification, we deal with various business partners that
> we have to import/export large amount of data between heterogeneous
> databases that the wonderful technology of database replication does not
> apply for our use.

For that kind of thing, may I suggest you try using the COPY API?  That
may buy you more speed than transferring data in binary form
would—particularly for insertions.


Jeroen