Per-column collation, proof of concept
From | Peter Eisentraut |
---|---|
Subject | Per-column collation, proof of concept |
Date | |
Msg-id | 1279045531.32647.14.camel@vanquo.pezone.net |
Responses | Re: Per-column collation, proof of concept |
List | pgsql-hackers |
Here is a proof of concept for per-column collation support.

Here is how it works: When creating a table, an optional COLLATE clause can specify a collation name, which is stored (by OID) in pg_attribute. This becomes part of the type information and is propagated through the expression parse analysis, like typmod.

When an operator or function call is parsed (transformed), the collations of the arguments are unified, using some rules (like type analysis, but different in detail). The collations of the function/operator arguments come either from Var nodes which in turn got them from pg_attribute, or from other function and operator calls, or you can override them with explicit COLLATE clauses (not yet implemented, but will work a bit like RelabelType). At the end, each function or operator call gets one collation to use. The function call itself can then look up the collation using the fcinfo->flinfo->fn_expr field. (Works for operator calls, but doesn't work for sort operations, needs more thought.)

A collation is in this implementation defined as an lc_collate string and an lc_ctype string. The implementation of functions interested in that information, such as comparison operators, or upper and lower functions, will take the collation OID that is passed in, look up the locale string, and use the xlocale.h interface (newlocale(), strcoll_l()) to compute the result. (Note that the xlocale stuff is only 10 or so lines in this patch. It should be feasible to allow other appropriate locale libraries to be used.)
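
To illustrate the last step, here is a minimal standalone sketch of comparing two strings under a named locale with newlocale() and strcoll_l(). It is not taken from the patch: the helper name collate_compare is hypothetical, the lookup from collation OID to locale string is left out, and it assumes a POSIX 2008 system where these functions are declared in <locale.h> and <string.h> (some platforms declare them in <xlocale.h> instead).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* ask for newlocale() and strcoll_l() */
#endif
#include <locale.h>   /* newlocale, freelocale, locale_t, LC_COLLATE_MASK */
#include <string.h>   /* strcoll_l */

/*
 * Hypothetical helper, not from the patch: compare a and b using the
 * collation rules of the locale named by lc_collate.
 *
 * Returns 0 on success and stores the strcoll_l() result in *result
 * (negative, zero, or positive). Returns -1 if the locale cannot be loaded.
 */
static int
collate_compare(const char *lc_collate, const char *a, const char *b,
                int *result)
{
    /* Build a locale object that only sets the LC_COLLATE category. */
    locale_t loc = newlocale(LC_COLLATE_MASK, lc_collate, (locale_t) 0);

    if (loc == (locale_t) 0)
        return -1;

    *result = strcoll_l(a, b, loc);

    /* Freed on every call here; a real implementation would cache it. */
    freelocale(loc);
    return 0;
}
```

Calling collate_compare("C", "a", "b", &r) should succeed and leave a negative value in r, since the "C" locale compares by byte value. Results for other locale names depend on which locales are installed on the system.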

Loose ends:

- Support function calls (currently only operator calls) (easy)
- Implementation of sort clauses
- Indexing support/integration
- Domain support (should be straightforward)
- Make all expression node types deal with collation information appropriately
- Explicit COLLATE clause on expressions
- Caching and not leaking memory of locale lookups
- I have typcollatable to mark which types can accept collation information, but perhaps there should also be proicareaboutcollation to skip collation resolution when none of the functions in the expression tree care.

You can start by reading the collate.sql regression test file to see what it can do. Btw., regression tests only work with "make check MULTIBYTE=UTF8". And it (probably) only works with glibc for now.
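
For the "caching and not leaking memory of locale lookups" item above, one possible shape is a small table keyed by collation OID that creates each locale_t once and frees them all in one place. The following is only a sketch under assumptions: the names LocaleCacheEntry, cached_locale, and release_cached_locales are hypothetical, the fixed-size array stands in for whatever structure the backend would really use, and the locale string is passed in directly instead of being looked up from the catalog.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* ask for newlocale() and strcoll_l() */
#endif
#include <locale.h>   /* newlocale, freelocale, locale_t, LC_COLLATE_MASK */
#include <string.h>   /* strcoll_l */

#define LOCALE_CACHE_SIZE 8       /* arbitrary size for this sketch */

/* Hypothetical cache entry: one locale object per collation OID. */
typedef struct
{
    unsigned int collid;          /* collation OID (assumed key) */
    locale_t     loc;             /* (locale_t) 0 means the slot is unused */
} LocaleCacheEntry;

static LocaleCacheEntry locale_cache[LOCALE_CACHE_SIZE];
static int locale_cache_creations = 0;   /* counts newlocale() calls */

/*
 * Return the cached locale for collid, creating it from lc_collate on
 * first use. Returns (locale_t) 0 if the locale cannot be loaded or the
 * cache is full (a real cache would evict or grow instead).
 */
static locale_t
cached_locale(unsigned int collid, const char *lc_collate)
{
    int      i;
    int      free_slot = -1;
    locale_t loc;

    for (i = 0; i < LOCALE_CACHE_SIZE; i++)
    {
        if (locale_cache[i].loc != (locale_t) 0)
        {
            if (locale_cache[i].collid == collid)
                return locale_cache[i].loc;       /* cache hit */
        }
        else if (free_slot < 0)
            free_slot = i;                        /* remember first empty slot */
    }

    if (free_slot < 0)
        return (locale_t) 0;

    loc = newlocale(LC_COLLATE_MASK, lc_collate, (locale_t) 0);
    if (loc == (locale_t) 0)
        return (locale_t) 0;

    locale_cache[free_slot].collid = collid;
    locale_cache[free_slot].loc = loc;
    locale_cache_creations++;
    return loc;
}

/* Free every cached locale so that nothing is leaked. */
static void
release_cached_locales(void)
{
    int i;

    for (i = 0; i < LOCALE_CACHE_SIZE; i++)
    {
        if (locale_cache[i].loc != (locale_t) 0)
        {
            freelocale(locale_cache[i].loc);
            locale_cache[i].loc = (locale_t) 0;
        }
    }
}
```

With this shape, repeated comparisons under the same collation reuse one locale_t, and the caller never frees the returned object itself; release_cached_locales() is the single cleanup point.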