Re: custom function for converting human readable sizes to bytes

Поиск
Список
Период
Сортировка
От Vitaly Burovoy
Тема Re: custom function for converting human readable sizes to bytes
Дата
Msg-id CAKOSWN=32VE2BQ9D2YA=VwDKv7Y8pCgD5Ordnj+0rcit7v8coQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: custom function for converting human readable sizes to bytes  (Pavel Stehule <pavel.stehule@gmail.com>)
Ответы Re: custom function for converting human readable sizes to bytes  (Vitaly Burovoy <vitaly.burovoy@gmail.com>)
Re: custom function for converting human readable sizes to bytes  (Pavel Stehule <pavel.stehule@gmail.com>)
Список pgsql-hackers
On 1/4/16, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2016-01-04 18:13 GMT+01:00 Shulgin, Oleksandr <oleksandr.shulgin@zalando.de> :
>> On Mon, Jan 4, 2016 at 6:03 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> > 2016-01-04 17:48 GMT+01:00 Shulgin, Oleksandr <oleksandr.shulgin@zalando.de>:
>> >> On Mon, Jan 4, 2016 at 4:51 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> >>
>> >> I'm also inclined on dropping that explicit check for empty string
>> >> below and let numeric_in() error out on that.  Does this look OK, or can
>> >> it confuse someone:
>> >> postgres=# select pg_size_bytes('');
>> >> ERROR:  invalid input syntax for type numeric: ""
>> >
>> > both fixed
>>
>> Hm...
>>
>> > + switch (*strptr)
>> > + {
>> > + /* ignore plus symbol */
>> > + case '+':
>> > + case '-':
>> > + *bufptr++ = *strptr++;
>> > + break;
>> > + }
>>
>> Well, to that amount you don't need any special checks, I'm just not sure
>> if reported error message is not misleading if we let numeric_in() handle
>> all the errors.  At least it can cope with the leading spaces, +/- and
>> empty input quite well.
>>
>
> I don't would to catch a exception from numeric_in - so I try to solve some
> simple situations, where I can raise possible better error message.

There are several cases where your behavior gives strange errors (see below).

Next batch of notes:

src/include/catalog/pg_proc.h:
---
+ DATA(insert OID = 3317 ( pg_size_bytes...
now oid 3317 is used (by pg_stat_get_wal_receiver), 3318 is free

---
+ DESCR("convert a human readable text with size units to big int bytes");
May be the best way is to copy the first sentence from the doc?
("convert a size in human-readable format with size units into bytes")

====
src/backend/utils/adt/dbsize.c:
+ text           *arg = PG_GETARG_TEXT_PP(0);
+ char           *str = text_to_cstring(arg);
...
+     /* working buffer cannot be longer than original string */
+     buffer = (char *) palloc(VARSIZE_ANY_EXHDR(arg) + 1);
Is there any reason to get TEXT for only converting it to cstring and
call VARSIZE_ANY_EXHDR instead of strlen?

---
+     text           *arg = PG_GETARG_TEXT_PP(0);
+     char           *str = text_to_cstring(arg);
+     char    *strptr = str;
+     char           *buffer;
There are wrong offsets of variable names after their types (among all
body of the "pg_size_bytes" function).
See variable declarations in nearby functions ( >> "make the new code look like the existing code around it"
http://www.postgresql.org/docs/devel/static/source-format.html
)

---
+                      errmsg("\"%s\" is not number", str)));
s/is not number/is not a number/
(the second version can be found in a couple places besides translations)

---
+     if (*strptr != '\0')
...
+         while (*strptr && !isspace(*strptr))
Sometimes it explicitly compares to '\0', sometimes implicitly.
Common use is explicit comparison and it is preferred due to different
compilers (their conversions to boolean).

---
+     /* Skip leading spaces */
...
+         /* ignore plus symbol */
...
+     /* copy digits to working buffer */
...
+     /* allow whitespace between integer and unit */
I'm also inclined on dropping that explicit skipping spaces, checking
for +/- symbols, but copying all digits, spaces, dots and '+-' symbols
and let numeric_in() error out on that.

It allows to get correct error messages for something like:
postgres=# select pg_size_bytes('.+912');
ERROR:  invalid input syntax for type numeric: ".+912"
postgres=# select pg_size_bytes('+912+ kB');
ERROR:  invalid input syntax for type numeric: "+912+ "
postgres=# select pg_size_bytes('++123 kB');
ERROR:  invalid input syntax for type numeric: "++123 "

instead of current:
postgres=# select pg_size_bytes('.+912');
ERROR:  invalid input syntax for type numeric: "."
postgres=# select pg_size_bytes('+912+ kB');
ERROR:  invalid unit: "+ kB"
postgres=# select pg_size_bytes('++123 kB');
ERROR:  invalid input syntax for type numeric: "+"

---
+     while (isspace((unsigned char) *strptr))
...
+     while (isspace(*strptr))
...
+         while (*strptr && !isspace(*strptr))
...
+         while (isspace(*strptr))
The first occurece of isspace's parameter is casting to "unsigned
char" whereas the others are not.
Note:
"The behavior is undefined if the value of ch is not representable as
unsigned char and is not equal to EOF"
Proof:
http://en.cppreference.com/w/c/string/byte/isspace

---
+     pfree(buffer);
+     pfree(str);
pfree-s here are not necessary. See:
http://www.neilconway.org/talks/hacking/hack_slides.pdf (page 17)

-- 
Best regards,
Vitaly Burovoy



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: PATCH: postpone building buckets to the end of Hash (in HashJoin)