Re: vector search support
From | Jonathan S. Katz |
---|---|
Subject | Re: vector search support |
Date | |
Msg-id | e083ced8-83a0-9b73-156b-da968b83ac9c@postgresql.org |
In response to | Re: vector search support (Oliver Rice <oliver@oliverrice.com>) |
List | pgsql-hackers |

On 5/25/23 1:48 PM, Oliver Rice wrote:

> A nice side effect of using the float8[] to represent vectors is that it
> allows for vectors of different sizes to coexist in the same column.
>
> We most frequently see (pgvector) vector columns being used for storing
> ML embeddings. Given that different models produce embeddings with a
> different number of dimensions, the need to specify a vector’s size in
> DDL tightly couples the schema to a single model. Support for variable
> length vectors would be a great way to decouple those concepts. It would
> also be a differentiating feature from existing vector stores.

I hadn't thought of that, given most of what I've seen (or at least my personal bias in designing systems) is you keep a vector of one dimensionality in a column. But this sounds like where having native support in a variable array would help.

> One drawback is that variable length vectors complicates indexing for
> similarity search because similarity measures require vectors of
> consistent length. Partial indexes are a possible solution to that challenge

Yeah, that presents a challenge. This may also be an argument for a vector data type, since that would eliminate the need to check for consistent dimensionality on the indexing.

Jonathan
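
For readers following the thread, here is a rough sketch of the dimensionality issue discussed above. It is not PostgreSQL or pgvector code. It only mimics the partial-index idea in plain Python by splitting mixed-length vectors into one bucket per dimensionality, so that a similarity measure only ever compares vectors of equal length. The function names (`cosine_similarity`, `partition_by_dims`, `nearest`) are made up for illustration, and the sketch assumes non-zero vectors.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity; only defined for vectors of equal length."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)  # assumes neither vector is all zeros


def partition_by_dims(rows):
    """Group (id, vector) rows by vector length.

    Each bucket plays the role of one partial index restricted to a
    single dimensionality (an analogy, not how an index is built).
    """
    parts = defaultdict(list)
    for row_id, vec in rows:
        parts[len(vec)].append((row_id, vec))
    return parts


def nearest(parts, query, k=1):
    """Return ids of the k most similar rows with the query's dimensionality."""
    candidates = parts.get(len(query), [])
    ranked = sorted(
        candidates,
        key=lambda row: cosine_similarity(row[1], query),
        reverse=True,
    )
    return [row_id for row_id, _ in ranked[:k]]


# Hypothetical column holding embeddings from two different models.
rows = [(1, [1.0, 0.0]), (2, [0.0, 1.0]), (3, [1.0, 0.0, 0.0])]
parts = partition_by_dims(rows)
print(nearest(parts, [0.9, 0.1]))
```

A query only searches the bucket that matches its own length, which is the same restriction a per-dimensionality partial index would impose. A fixed-size vector type would make the length check in `cosine_similarity` unnecessary, which is the trade-off raised in the reply.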