Re: Proposal to introduce a shuffle function to intarray extension

Поиск

Список

Период

Сортировка

От	Martin Kalcher
Тема	Re: Proposal to introduce a shuffle function to intarray extension
Дата	16 июля 2022 г. 20:21:54
Msg-id	7933c4a5-0e72-889e-1be9-e3574ffe333b@aboutsource.net обсуждение исходный текст
Ответ на	Re: Proposal to introduce a shuffle function to intarray extension (Mladen Gogala <gogala.mladen@gmail.com>)
Ответы	Re: Proposal to introduce a shuffle function to intarray extension
Список	pgsql-general

Дерево обсуждения

Am 16.07.22 um 18:53 schrieb Mladen Gogala:
> On 7/15/22 04:36, Martin Kalcher wrote:
>> Dear list,
>>
>> i am dealing with an application that processes fairly large arrays of
>> integers. It makes heavy use of the intarray extension, which works
>> great in most cases. However, there are two requirements that cannot
>> be addressed by the extension and are rather slow with plain SQL. Both
>> can be met with shuffling:
>>
>> - Taking n random members from an integer array
>> - Splitting an array into n chunks, where each member is assigned to a
>> random chunk
>>
>> Shuffling is currently implemented by unnesting the array, ordering
>> the members by random() and aggregating them again.
> 
> 
> Martin, have you considered PL/Python and NumPy module?

Hey Mladen,

thank you for your advice. Unfortunately the performance of shuffling 
with NumPy is about the same as with SQL.

   create function numpy_shuffle(arr int[])
   returns int[]
   as $$
     import numpy
     numpy.random.shuffle(arr)
     return arr
   $$ language 'plpython3u';

   select arr[1:3]::text || ' ... ' || arr[3999998:4000000]::text
   from (
     select numpy_shuffle(arr) arr from numbers
   ) shuffled;

   -------------------------------------------------------
    {674026,3306457,1727170} ... {343875,3825484,1235246}

   Time: 2315.431 ms (00:02.315)

Am i doing something wrong?

Martin

В списке pgsql-general по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Proposal to introduce a shuffle function to intarray extension