Обсуждение: PQgetvalue failed to return column value for non-text data in binary format
Hi, I am new to PostgreSQL and tried using the libpq C library. I can read a test table using PQgetvalue for text format, i.e. format = 0. However, when I experiment with using binary output format, I was unable to read anything else except result set fiend that is in char(x), varchar(x), i.e., character string data. What have I done wrong? int * pId; LPSTR pszName; double * pSalary; PGresult *pResult = PQexecParams(m_pConn, (LPCTSTR)strSelect, nParams, parmTypes, paramValues, parmLengths,parmFormats, 1); pId = (int *)PQgetvalue(pResult, row, 0); pszName = (LPSTR)PQgetvalue(pResult, row, 1); pSalary = (double *)PQgetvalue(pResult,row, 6); printf("id: %d name: %s salary: %f", *pId, pszName, *pSalary); What I got is as follow: + + + printing resultset, 2 rows, 8 columns id: 167772160 name: Amy Who salary: 0.000000 id: 16777216 name: David Smith salary: 0.000000 Which supposed to return: id: 10 name: Amy Who salary: 150000.00 id: 1 name: David Smith salary: 350000.00 Any help is greatly appreciated! Jeff
Re: PQgetvalue failed to return column value for non-text data in binary format
От
Andrew McNamara
Дата:
>I am new to PostgreSQL and tried using the libpq C library. I can read >a test table using PQgetvalue for text format, i.e. format = 0. >However, when I experiment with using binary output format, I was unable >to read anything else except result set fiend that is in char(x), >varchar(x), i.e., character string data. > >What have I done wrong? The data returned in binary mode is in a (largely undocumented) PostgreSQL form. Types like int4, and float8 are returned in network byte order [1], while other types such as timestamps are in more subtle formats. The only canonical reference to these formats is the PostgreSQL source. In general, you should just use the text format, as it's more portable and less likely to change over time, although the specifics of the text format are also largely undocumented. [1] http://en.wikipedia.org/wiki/Endianness -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/
Thanks you very much for your help and clarification. As programmer in heart, one tends to go straight the specific api. I guess once one become so comfortable with a particular open source software, then one can go to the source to understand what is happening there. But in my case, I guess I am far from that level of sophistication in understanding PostgreSQL. This is interesting position to be taken by PostgreSQL. For simple query, particularly with regards to reporting, one can go about using text for all column. But a lot of time, I deal with large dataset ingestions and extracts that tends to take many hours to perform. So I usually prefer binary form data for performance reason. To that end, I found there is a significant different in the time it takes to ingest or extract data from one database to another database of totally different vendor. Most of our company's works are done in Java nowadays, but for large data ingestions/extract/conversion, a C program out performs Java program fair and square, no debate there (based on actual run-time data). For clarification, we deal with various business partners that we have to import/export large amount of data between heterogeneous databases that the wonderful technology of database replication does not apply for our use. For this kind of operations, I even found the performance gain by creating C program out-weight a Java program for that same purpose. I can get by using the socket function, ntohl() and ntohs() for smallint and int4 for the conversion. But there is no such thing for the float4 and float8 as well as the date, timestamp you mentioned. I wish there is such function or macro available for the C interface. I do not agree with the "ease of portability" issue, because if that is the case, one can always give up performance and use Java for that purpose. Beside, it should not be that much more difficult to provide win32 exe for PostgreSQL with PQgetvalue() that return int2, int4, float4, float8, date, datetime/timestamp, etc. Jeff, Andrew McNamara wrote: > The data returned in binary mode is in a (largely undocumented) > PostgreSQL form. Types like int4, and float8 are returned in network > byte order [1], while other types such as timestamps are in more subtle > formats. The only canonical reference to these formats is the PostgreSQL > source. > > In general, you should just use the text format, as it's more portable > and less likely to change over time, although the specifics of the text > format are also largely undocumented. > > [1] http://en.wikipedia.org/wiki/Endianness > >
Re: PQgetvalue failed to return column value for non-text data in binary format
От
Andrew McNamara
Дата:
>This is interesting position to be taken by PostgreSQL. For simple >query, particularly with regards to reporting, one can go about using >text for all column. But a lot of time, I deal with large dataset >ingestions and extracts that tends to take many hours to perform. So I >usually prefer binary form data for performance reason. The performance benefits of "binary" over "text" formats are often less than you might think. In the real world, on-the-wire binary formats often need parsing, just like text formats, and if they don't match your internal binary format, still more conversion is required. >Most of our company's works are done in Java nowadays, but for >large data ingestions/extract/conversion, a C program out performs Java >program fair and square, no debate there (based on actual run-time >data). For clarification, we deal with various business partners that >we have to import/export large amount of data between heterogeneous >databases that the wonderful technology of database replication does not >apply for our use. I regularly import and export millions of rows between python and PostgreSQL - my experience is that the TCP connection to the database, and the database itself (disk I/O) is often the limiting factor, rather than my choice of language. Where python is the limiting factor, it's typically the instantiation of objects representing row values that dominates, and that would be true whether a text or binary format was used. The remaining cases are where the processing is non-trivial, and in those cases, I would much rather be working in a high level language where bugs are easier to identify. It's rare indeed that I resort to coding it in C. >I can get by using the socket function, ntohl() and ntohs() for smallint >and int4 for the conversion. But there is no such thing for the float4 >and float8 as well as the date, timestamp you mentioned. I wish there >is such function or macro available for the C interface. So do I. Serialising floats and doubles is a common enough activity, and it defies belief that the standard C libraries don't provide a way to do it. Having every application that wants to pass floats over the net implement it's own scheme is just asking for bugs. I would suggest looking at the PostgreSQL source for one example of how to do floats. Timestamps are actually composite types made up of other primitive binary types. The only gotcha with the time types is that they can use ints or floats, depending on how the server was built. When using protocol 3 with a version 8 server, there is a parameter that informs you which was chosen. >I do not agree with the "ease of portability" issue, because if that is >the case, one can always give up performance and use Java for that >purpose. Beside, it should not be that much more difficult to provide >win32 exe for PostgreSQL with PQgetvalue() that return int2, int4, >float4, float8, date, datetime/timestamp, etc. Note that you're preaching to the choir - I'm just a PG user, like you, and I find the "binary" situation less than optimal. I wouldn't have used it if I could work out how to pass arrays as text parameters (but now that I know how the binary types work, I suspect I could solve the text array problem by looking at the PG source). -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/
Re: PQgetvalue failed to return column value for non-text data in binary format
От
"Jeroen T. Vermeulen"
Дата:
On Tue, May 22, 2007 04:46, Jeff Lynn wrote: > This is interesting position to be taken by PostgreSQL. For simple > query, particularly with regards to reporting, one can go about using > text for all column. But a lot of time, I deal with large dataset > ingestions and extracts that tends to take many hours to perform. So I > usually prefer binary form data for performance reason. To that end, I > found there is a significant different in the time it takes to ingest or > extract data from one database to another database of totally different > vendor. Most of our company's works are done in Java nowadays, but for > large data ingestions/extract/conversion, a C program out performs Java > program fair and square, no debate there (based on actual run-time > data). For clarification, we deal with various business partners that > we have to import/export large amount of data between heterogeneous > databases that the wonderful technology of database replication does not > apply for our use. For that kind of thing, may I suggest you try using the COPY API? That may buy you more speed than transferring data in binary form wouldparticularly for insertions. Jeroen