MAP syntax for arrays
От | Ildar Musin |
---|---|
Тема | MAP syntax for arrays |
Дата | |
Msg-id | f86e5d8b-20a7-36ce-acd6-ac58b290c8f0@postgrespro.ru обсуждение исходный текст |
Ответы |
Re: MAP syntax for arrays
|
Список | pgsql-hackers |
Hello hackers, Recently I was working with sql arrays in postgres and it turned out that postgres doesn't have such very convinient functional constructions as map, reduce and filter. Currently to map function over array user has to make a subquery like: select u.* from my_table, lateral ( select array_agg(lower(elem)) from unnest(arr) as elem ) as u; Which is not only inconvenient but not very efficient as well (see 'Demo' section below). When I dug into the code I found that postgres already has the needed infrastructure for implementing map for arrays; actually array coercing already works that way (it basically maps cast function). In the attached patch there is a simple map implementation which introduces new expression type and syntax: MAP(<func_name> OVER <array_expression>) For example: SELECT MAP(upper OVER array['one', 'two', 'three']::text[]); ?column? ----------------- {ONE,TWO,THREE} (1 row) This is probably not the most useful notation and it would be better to have syntax for mapping arbitrary expressions over array, not just function. I'm struggling to come up with a good idea of how it should look like. It could look something like following: MAP(<expr> FOR <placeholder> IN <array_expressin>) For instance: SELECT MAP(x*2 FOR x IN array[1, 2, 3]::int[]); Looking forward for community's suggestions! Demo ---- Here is a small comparison between map and unnest/aggregate ways for per-element processing of arrays. Given a table with 1K rows which contains single column of text[] type. Each array contains 5/10/100 elements. create table my_table (arr text[]); insert into my_table select array_agg(md5(random()::text)) from generate_series(1, 1000) as rows, generate_series(1, 10) as elements group by rows; There are two scripts for pgbench. One for 'map' syntax: select map(upper over arr) from my_table; And one for unnest/aggregate: select u.* from my_table, lateral ( select array_agg(upper(elem)) from unnest(arr) as elem ) as u; Results are: elements per array | map (tps) | unnest/aggregate (tps) --------------------+------------+------------------------ 5 | 139.105359 | 74.434010 10 | 74.089743 | 43.622554 100 | 7.693000 | 5.325805 Apparently map is more efficient for small arrays. And as the size of array increases the difference decreases. I'll be glad to any input from the community. Thanks! -- Ildar Musin i.musin@postgrespro.ru
Вложения
В списке pgsql-hackers по дате отправления: