Обсуждение: tinyint and type problems

Поиск
Список
Период
Сортировка

tinyint and type problems

От
Shachar Shemesh
Дата:
Hi all,

I have sent a patch to implement tinyint, an unsigned one byte integer, 
for MS SQL compatibility. The replies I got were under two categories. 
One was "our type system is complicated enough", and the other was "it 
should be signed and we should have a general "unsigned" keyword.

Now, I'm not against neither simplifying the type system nor having a 
"unsigned" keyword. The thing is that between these two remarks, my 
patch was not applied and I don't know why.

I need this new type because the OLE DB provider I'm writing (for a 
client) works in binary mode by-definition, which means that the 
client's software, currently designed to work with MS SQL, relies on 
certain table entries to be one byte wide. Tricks such as using "char" 
don't work, as it uses "insert ... from select... ", which fails because 
the numbers are not translateable to "char". It also does explicit 
"insert", which also will not work with "char".

I would really love to know what needs to be done in order to get a one 
byte type, either signed or not (not really important at this stage) 
into postgres.

If this is impossible, would it be at least possible to reserve an OID 
for this type, and decide what it actually is later? If that would be 
possible, I can go on, in the mean time, with my development. The 
problem is that pglib really has no way of identifying the types except 
by OID.
            Shachar

-- 
Shachar Shemesh
Lingnu Open Systems Consulting
http://www.lingnu.com/



Re: tinyint and type problems

От
Tom Lane
Дата:
Shachar Shemesh <psql@shemesh.biz> writes:
> I have sent a patch to implement tinyint, an unsigned one byte integer, 
> ...
> Now, I'm not against neither simplifying the type system nor having a 
> "unsigned" keyword. The thing is that between these two remarks, my 
> patch was not applied and I don't know why.

> I need this new type because ...

Um, the reason we have an extensible type system is so that people can
make their own datatypes.  You don't have to get a type accepted into
the base system in order to use it yourself.  There wasn't any part of
your patch that couldn't have been implemented like a contrib module
(i.e., a dynamically loaded library for the C functions plus a SQL
script to install the catalog entries).

The criterion for adding new types to the base system is really "is this
of sufficient general usefulness to justify taking on a permanent
maintenance load?" ... where the size of the maintenance load depends
quite a lot on how tightly bound the new type is to the rest of the
system.  A new type that is part of the numeric hierarchy is going to be
way up there on the scale of maintenance burden.  For instance, people
would expect it to have at least as much functionality as int2 does,
such as indexing support, and so we'd be buying into (sooner or later)
doing a lot of work that your patch left unfinished.

On top of that there are some concerns about locking ourselves into
supporting a definition we might not like later.  Most obviously there
is the signed/unsigned issue, but that's not the only risk.  We are still
tweaking the numeric hierarchy to try to get the interrelationships of
the types just right.  Until we are really satisfied that that task
is done, I'm not excited about adding more functionality that might look
like a big backwards-compatibility problem later.

So the bottom line for me is that the potential downsides outweigh
the upside.  You haven't made a good enough case for the general
usefulness of this datatype.
        regards, tom lane


Re: tinyint and type problems

От
Shachar Shemesh
Дата:
Hi Tom,

Tom Lane wrote:

>>I need this new type because ...
>>    
>>
>
>Um, the reason we have an extensible type system is so that people can
>make their own datatypes.  You don't have to get a type accepted into
>the base system in order to use it yourself.
>  
>
...

>The criterion for adding new types to the base system is really "is this
>of sufficient general usefulness to justify taking on a permanent
>maintenance load?"
>
Discussions raising excellent points on why the burden is higher than 
the rewards snipped.

Ok, I'll explain why I didn't go the external datatype to begin with, 
and then suggest a compromise for discussion.

There is one difference between builtin datatypes and external datatypes 
that is usually unimportant, but is making my case extremely difficult. 
Builtin types have guarenteed OID number. This makes knowing what you 
got through switch...case over the return type of |PQftype that much 
easier. I can probably work around it by doing a select over pg_types, 
but I'm afraid of the performance penalties. If left with no choice, 
that's what I'll do.|

Since this is not some wierd type that noone has ever heard of, but a 
type that is, under one variation or another, available in any database 
system, I though it would be generally useful and thus wrote a patch for 
it to the main database.

The way I see it, there are three options. I'll list them in the order I 
prefer them to happen, and I ask that you let me know which one the 
group thinks is best.
Choice 1 - I implement the missing features my previous patch did not 
take care of. These include arithmetic operators, index support, and 
casts to all other numeric types (something which MS SQL, by the way, 
makes do without. There is no cast from tinyint to int8 there, but so be 
it). I actually suggested this on the patches list before, but I asked 
that an indication be given that these are the only restraints from 
putting the type in, so that my time is not spent in vain. Since I got 
no reply there, I started this thread here.

Choice 2 - We put in just the macros for manipulating 8 bit parameters 
from my patch, and do one more thing. I suggest that PostgreSQL should 
have, in one form or another, a one byte numerical value, and so all I 
ask is that we decide now what OID it should have. I can then go on with 
implementing everything else as an external type, and everybody is 
happy. When our type system improves, we can then resolve the other 
questions (signed unsigned etc.), and I can pull my external type.

Choice 3 - We do nothing of the above. I put everything into an external 
type lib, and work around my OLE DB driver using additional queries. I 
don't like this option because it means that PostgreSQL's OLE DB driver 
will have reduced performance.

I'll stress again - I don't mind doing all the work associated with any 
once of the above choices. All I'm asking is that we agree on which one 
will be best for this project. As far as I'm concerned, Choice 2 
involves the least amount of work, but I think Choice 1 will serve 
everyone better.

As for the signness/unsigness dillema - I found a table at 
http://www.theopensourcery.com/sqldatatypes.htm. It compares MS SQL, 
Oracle, Postgres and MySQL. Of the four, MS SQL and MySql have one byte 
int. In MS SQL, it is unsigned, while the entire rest of the type system 
is signed. In MySQL, all numbers are signed unless prepended with an 
"unsigned" modifier.

Personally, I don't think the MySQL system is worth the trouble. This is 
a database system, not a C programming language. People who need the 
extra range can use the NUMBER type. I also don't see the use for a 
signed 1 byte type - its range is too small in both direction.
            Shachar

-- 
Shachar Shemesh
Lingnu Open Systems Consulting
http://www.lingnu.com/



Re: tinyint and type problems

От
Joe Conway
Дата:
Shachar Shemesh wrote:
> I'll stress again - I don't mind doing all the work associated with any 
> once of the above choices. All I'm asking is that we agree on which one 
> will be best for this project. As far as I'm concerned, Choice 2 
> involves the least amount of work, but I think Choice 1 will serve 
> everyone better.

Sorry for jumping into this thread so late, but if you don't really need  arithmetic operators, index support, and
caststo all other numeric 
 
types, why not use "char"?

Joe


Re: tinyint and type problems

От
Shachar Shemesh
Дата:
Joe Conway wrote:

> Shachar Shemesh wrote:
>
>> I'll stress again - I don't mind doing all the work associated with 
>> any once of the above choices. All I'm asking is that we agree on 
>> which one will be best for this project. As far as I'm concerned, 
>> Choice 2 involves the least amount of work, but I think Choice 1 will 
>> serve everyone better.
>
>
> Sorry for jumping into this thread so late, but if you don't really 
> need  arithmetic operators, index support, and casts to all other 
> numeric types, why not use "char"?
>
> Joe

On a totally unrelated issue - how are the quotes around "char" 
enforced? I have not seen anything special about it's definition.

Hi Joe,

Thanks for the offer. I actually tried to use "char" before embarking on 
the quest, but it didn't work out. While my specific client (and I 
realize that others do need it) don't need many of the above, they do 
need some of it. As such, the lack of ability to type "insert into foo 
(char_col) values (5)" meant I couldn't use it.

In any case, I now have (already commited to the OLE DB sources) a 
solution to this problem. The provider has a list of special types it 
supports. When first establishing a connection to the database, I query 
the pg_type table for all of those types, and cache their OIDs. When I 
need to perform parsing, I look for all the OIDs I know in advance using 
a "switch", and in the default clause, look for all the OIDs of the non 
preknown types. It's a reasonable solution, I think.

I would still love it if the definitions for PG_GETARG_UINT8, 
PG_RETURN_UINT8 and DatumGetInt8 made it into the main tree, as I don't 
think there is any real reason to define them in my so. That aside, 
however, my specific problem is solved. I also renamed the type 
"utinyint", to leave room for a signed 1 byte value if we decide to go 
that route. 
(http://gborg.postgresql.org/project/oledb/cvs/co.php/oledb/PgSess.cpp?r=1.1).

As far as I'm concerned, the discussion just turned academic. I'll still 
love to add the missing feature to get this type in, as I do think it's 
generally useful. My implementation, however, currently works over a 
vanilla postgres without a problem.
            Shachar

-- 
Shachar Shemesh
Lingnu Open Systems Consulting
http://www.lingnu.com/