Re: Extract numeric filed in JSONB more effectively

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: Extract numeric filed in JSONB more effectively
Дата
Msg-id CAFj8pRAO3oEiBaJJ9=HZp6CoP2ffbwSgrKkLKjPfYZwx9wOOuQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Extract numeric filed in JSONB more effectively  (Andy Fan <zhihui.fan1213@gmail.com>)
Ответы Re: Extract numeric filed in JSONB more effectively  (Andy Fan <zhihui.fan1213@gmail.com>)
Список pgsql-hackers
Hi

čt 3. 8. 2023 v 9:53 odesílatel Andy Fan <zhihui.fan1213@gmail.com> napsal:
Hi Pavel:

Thanks for the feedback. 

I don't like this solution because it is bloating  operators and it is not extra readable.
 
If we support it with cast, could we say we are bloating CAST?  It is true
that it is not extra readable, if so how about  a->>'a'  return text?  Actually
I can't guess any meaning of the existing jsonb operations without
documentation.

yes, it can bloat CAST, but for usage we have already used syntax, and these casts are cooked already:

(2023-08-03 11:04:51) postgres=# select castfunc::regprocedure from pg_cast where castsource = 'jsonb'::regtype;
┌──────────────────┐
│     castfunc     │
╞══════════════════╡
│ -                │
│ bool(jsonb)      │
│ "numeric"(jsonb) │
│ int2(jsonb)      │
│ int4(jsonb)      │
│ int8(jsonb)      │
│ float4(jsonb)    │
│ float8(jsonb)    │
└──────────────────┘
(8 rows)

 
the operator ->> was a special case, the text type is special in postgres as the most convertible type. And when you want to visualise a value or display the value, you should convert value to text.

I can live with that because it is just one, but with your proposal opening the doors for implementing tens of similar operators, I think it is bad. Using ::target_type is common syntax and doesn't require reading documentation.

More, I believe so lot of people uses more common syntax, and then this syntax should to have good performance - for jsonb - (val->'op')::numeric works, and then there should not be performance penalty, because this syntax will be used in 99%.

Usage of cast is self documented.


For completeness you should implement cast for date, int, boolean too. Next, the same problem is with XML or hstore type (probably with any types that are containers).

I am not sure completeness is a gold rule we should obey anytime,
like we have some function like int24le to avoid the unnecessary
cast, but we just avoid casting for special types for performance
reason, but not for all. At the same time,  `int2/int4/int8` doesn't
have a binary compatibility type in jsonb. and the serialization
/deserialization for boolean is pretty cheap.

I didn't realize timetime types are binary compatible with SQL,  
so maybe we can have some similar optimization as well. 
(It is a pity that timestamp(tz) are not binary, or else we may 
just need one operator). 
 

I don't like the idea so using a special operator is 2x faster than common syntax for casting. It is a signal, so there is a space for optimization. Black magic with special operators is not user friendly for relatively common problems.

I don't think "Black magic" is a proper word here, since it is not much
different from ->> return a text.  If you argue text can be cast to 
most-of-types,  that would be a reason, but I doubt this difference
should generate a "black magic". 

I used the term black magic, because nobody without reading documentation can find this operator. It is used just for this special case, and the functionality is the same as using cast (only with different performance).

The operator ->> is more widely used. But if we have some possibility to work without it, then the usage for a lot of users will be more simple. More if the target types can be based on context

Can be nice to use some like `EXTRACT(YEAR FROM val->'field')` instead `EXTRACT(YEAR FROM (val->>'field')::date)`

 

Maybe we can introduce some *internal operator* "extract to type", and in rewrite stage we can the pattern (x->'field')::type transform to OP(x, 'field', typid)

Not sure what the OP should be?  If it is a function, what is the
return value?  It looks to me like it is hard to do in c language?

It should be internal structure - it can be similar like COALESCE or IS operator
 

After all,  if we really care about the number of operators, I'm OK
with just let users use the function directly, like

jsonb_field_as_numeric(jsonb, 'filedname') 
jsonb_field_as_timestamp(jsonb, 'filedname'); 
jsonb_field_as_timestamptz(jsonb, 'filedname'); 
jsonb_field_as_date(jsonb, 'filedname'); 

it can save an operator and sloves the readable issue. 

I don't like it too much, but it is better than introduction new operator 

We already have the jsonb_extract_path and jsonb_extract_path_text function.

I can imagine to usage "anyelement" type too. some like `jsonb_extract_path_type(jsonb, anyelement, variadic text[] )`






--
Best Regards
Andy Fan

В списке pgsql-hackers по дате отправления:

Предыдущее
От: tender wang
Дата:
Сообщение: Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Следующее
От: Matthias van de Meent
Дата:
Сообщение: Re: Extract numeric filed in JSONB more effectively