Re: CUBE_MAX_DIM
From | Alastair McKinley |
---|---|
Subject | Re: CUBE_MAX_DIM |
Date | |
Msg-id | PR1PR02MB534067DDB48CCDC51456CB69E3920@PR1PR02MB5340.eurprd02.prod.outlook.com |
In response to | Re: CUBE_MAX_DIM (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |

> From: Tom Lane <tgl@sss.pgh.pa.us>
> Sent: 25 June 2020 17:43
>
> Alastair McKinley <a.mckinley@analyticsengines.com> writes:
> > I know that Cube in its current form isn't suitable for nearest-neighbour searching these vectors in their raw form (I have tried recompilation with higher CUBE_MAX_DIM myself), but conceptually kNN GiST searches using Cubes can be useful for these applications. There are other pre-processing techniques that can be used to improve the speed of the search, but it still ends up with a kNN search in a high-ish dimensional space.
>
> Is there a way to fix the numerical instability involved? If we could do
> that, then we'd definitely have a use-case justifying the work to make
> cube toastable.

I am not that familiar with the nature of the numerical instability, but it might be worth noting for additional context that for the NN use case:

- The value of each dimension is likely to be between 0 and 1.
- The L1 distance is meaningful for high numbers of dimensions, and it *possibly* suffers less from the numeric issues than Euclidean distance.

Numerical stability isn't the only issue for high-dimensional kNN: GiST search performance currently degrades towards sequential scan performance with increasing N, although maybe the two are related?

> regards, tom lane

Best regards,
Alastair
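
For readers less familiar with the two metrics compared above, here is a minimal Python sketch of L1 (taxicab) and Euclidean distance on vectors whose components lie between 0 and 1. The function names `l1_distance` and `l2_distance` are hypothetical and are not part of the cube extension, which exposes these metrics as the `<#>` and `<->` operators. The sketch only illustrates the definitions; it does not reproduce or measure the numerical instability under discussion.

```python
import math
from typing import Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Taxicab (L1) distance: sum of absolute per-dimension differences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    # math.fsum keeps the accumulated rounding error small for long vectors.
    return math.fsum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance: square root of the sum of squared differences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(math.fsum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    # Hypothetical 4-dimensional example with every component in [0, 1].
    p = [0.0, 0.25, 0.5, 1.0]
    q = [1.0, 0.25, 0.0, 0.5]
    print("L1:", l1_distance(p, q))
    print("L2:", l2_distance(p, q))
```

With per-dimension differences bounded by 1, the L1 distance grows at most linearly with the number of dimensions, while the Euclidean distance squares each difference before summing and then takes a square root.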