Re: Implementation of SASLprep for SCRAM-SHA-256
From | Heikki Linnakangas |
---|---|
Subject | Re: Implementation of SASLprep for SCRAM-SHA-256 |
Date | |
Msg-id | bcdd548d-04ce-69a2-1328-29627104d212@iki.fi |
In reply to | Re: Implementation of SASLprep for SCRAM-SHA-256 (Michael Paquier <michael.paquier@gmail.com>) |
List | pgsql-hackers |
On 04/05/2017 07:23 AM, Michael Paquier wrote:
> fore
>
> On Wed, Apr 5, 2017 at 7:05 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> I will continue tomorrow, but I wanted to report on what I've done so far.
>> Attached is a new patch version, quite heavily modified. Notable changes so
>> far:
>
> Great, thanks!
>
>> * Use Unicode codepoints, rather than UTF-8 bytes packed in a 32-bit ints.
>> IMHO this makes the tables easier to read (to a human), and they are also
>> packed slightly more tightly (see next two points), as you can fit more
>> codepoints in a 16-bit integer.
>
> Using directly codepoints is not much consistent with the existing
> backend, but for the sake of packing things more, OK.

Oh, I see, we already have similar functions in wchar.c.
unicode_to_utf8() and utf8_to_unicode(). We should probably move those
to src/common, rather than re-invent the wheel.

> pg_utf8_islegal() and pg_utf_mblen() should as well be moved in their
> own file I think, and wchar.c can use that.

Yeah..

>> * The list of characters excluded from recomposition is currently hard-coded
>> in utf_norm_generate.pl. However, that list is available in machine-readable
>> format, in file CompositionExclusions.txt. Since we're reading most of the
>> data from UnicodeData.txt, would be good to read the exclusion table from a
>> file, too.
>
> Ouch. Those are present here...
> http://www.unicode.org/reports/tr41/tr41-19.html#Exclusions
> Definitely it makes more sense to read them from a file.

Did that.

>> * SASLPrep specifies normalization form KC, but it also specifies that some
>> characters are mapped to space or nothing. Should do those mappings, too.
>
> Ah, right. Those ones are here:
> https://tools.ietf.org/html/rfc3454#appendix-B.1

Yep.

Attached is a new version. Notable changes since yesterday:

* Implemented the rest of the SASLPrep, mapping some characters to
spaces, leaving out others, and checking for prohibited characters and
bidirectional strings.

* Moved things around. There's now a separate directory,
src/common/unicode, which contains the perl scripts and the test code.
Those are not needed to build from source, as the pre-generated tables
are put in src/include/common. Similar to the scripts in
src/backend/utils/mb/Unicode, really.

* Renamed many things from utf_* to unicode_*, since they don't deal
with utf-8 input anymore.

This is starting to shape up, but still some cleanup work to do. I will
continue tomorrow..

- Heikki
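
Editorial illustration of the codepoint-versus-packed-UTF-8 point discussed in the message: the sketch below converts between a Unicode code point and its UTF-8 bytes, then packs those bytes into a 32-bit integer. The function names `cp_to_utf8`, `utf8_to_cp` and `utf8_packed` are hypothetical and are not the `unicode_to_utf8()` / `utf8_to_unicode()` functions from wchar.c. The big-endian packing layout is also an assumption, since the thread does not show how the earlier patch packed its bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one Unicode code point as UTF-8.
 * Returns the number of bytes written (1-4), or 0 for an invalid code point.
 * Hypothetical helper, not the wchar.c implementation.
 */
int
cp_to_utf8(uint32_t cp, unsigned char *out)
{
	assert(out != NULL);

	if (cp <= 0x7F)
	{
		out[0] = (unsigned char) cp;
		return 1;
	}
	if (cp <= 0x7FF)
	{
		out[0] = (unsigned char) (0xC0 | (cp >> 6));
		out[1] = (unsigned char) (0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp <= 0xFFFF)
	{
		if (cp >= 0xD800 && cp <= 0xDFFF)
			return 0;			/* surrogates cannot be encoded */
		out[0] = (unsigned char) (0xE0 | (cp >> 12));
		out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char) (0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp <= 0x10FFFF)
	{
		out[0] = (unsigned char) (0xF0 | (cp >> 18));
		out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char) (0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}

/* Decode the first UTF-8 sequence in s. Assumes well-formed input. */
uint32_t
utf8_to_cp(const unsigned char *s)
{
	if (s[0] < 0x80)
		return s[0];
	if ((s[0] & 0xE0) == 0xC0)
		return (uint32_t) (((s[0] & 0x1F) << 6) | (s[1] & 0x3F));
	if ((s[0] & 0xF0) == 0xE0)
		return (uint32_t) (((s[0] & 0x0F) << 12) |
						   ((s[1] & 0x3F) << 6) | (s[2] & 0x3F));
	return (uint32_t) (((s[0] & 0x07) << 18) | ((s[1] & 0x3F) << 12) |
					   ((s[2] & 0x3F) << 6) | (s[3] & 0x3F));
}

/* Pack the UTF-8 bytes of cp into one integer, first byte most significant. */
uint32_t
utf8_packed(uint32_t cp)
{
	unsigned char buf[4];
	int			n = cp_to_utf8(cp, buf);
	uint32_t	packed = 0;

	for (int i = 0; i < n; i++)
		packed = (packed << 8) | buf[i];
	return packed;
}
```

For U+20AC (EURO SIGN) the code point fits in 16 bits, while its three UTF-8 bytes (E2 82 AC) need 24 bits when packed this way. That is the kind of saving the message refers to when it says more codepoints fit in a 16-bit integer.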
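
Editorial illustration of reading the exclusion list from a file: data lines in CompositionExclusions.txt start with a hexadecimal code point followed by a `#` comment, and all other lines are blank or start with `#`. The patch does this in its Perl generator script. The C function below, with the hypothetical name `parse_exclusion_line`, is only a sketch of the same parsing idea and is not taken from the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse one line of CompositionExclusions.txt.
 * Returns 1 and stores the code point in *cp for a data line,
 * or returns 0 for blank lines, comment lines, and malformed input.
 */
int
parse_exclusion_line(const char *line, uint32_t *cp)
{
	char	   *end;
	unsigned long val;

	assert(line != NULL && cp != NULL);

	/* Skip leading blanks. */
	while (*line == ' ' || *line == '\t')
		line++;

	/* Data lines begin with a hex digit; '#' or end of line means no data. */
	if (!isxdigit((unsigned char) *line))
		return 0;

	val = strtoul(line, &end, 16);

	/* The hex field must be followed by whitespace, a comment, or the end. */
	if (*end != '\0' && *end != '#' && !isspace((unsigned char) *end))
		return 0;
	if (val > 0x10FFFF)
		return 0;

	*cp = (uint32_t) val;
	return 1;
}
```

A caller would feed each line of the file to this function and collect the returned code points into the exclusion table.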
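
Editorial illustration of the SASLprep mapping step mentioned in the message (some characters mapped to space, others to nothing): the sketch below applies that step to an array of code points. The range tables are a small, hand-copied subset of RFC 3454 tables C.1.2 and B.1, not the complete tables. The names `cp_range`, `in_ranges` and `saslprep_map_step` are hypothetical and do not come from the patch. Normalization to form KC and the prohibited-character and bidirectional checks are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
	uint32_t	first;
	uint32_t	last;
} cp_range;

/* Subset of RFC 3454 table C.1.2 (non-ASCII spaces): mapped to U+0020. */
static const cp_range map_to_space[] = {
	{0x00A0, 0x00A0}, {0x1680, 0x1680}, {0x2000, 0x200A},
	{0x202F, 0x202F}, {0x205F, 0x205F}, {0x3000, 0x3000}
};

/* Subset of RFC 3454 table B.1 ("commonly mapped to nothing"): dropped. */
static const cp_range map_to_nothing[] = {
	{0x00AD, 0x00AD}, {0x034F, 0x034F}, {0x180B, 0x180D},
	{0x200C, 0x200D}, {0x2060, 0x2060}, {0xFE00, 0xFE0F},
	{0xFEFF, 0xFEFF}
};

/* Linear search; a real table would likely be sorted and binary-searched. */
static int
in_ranges(uint32_t cp, const cp_range *r, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (cp >= r[i].first && cp <= r[i].last)
			return 1;
	return 0;
}

/*
 * Apply the mapping step to n code points.
 * out must have room for n entries; returns the number written.
 */
size_t
saslprep_map_step(const uint32_t *in, size_t n, uint32_t *out)
{
	size_t		j = 0;

	assert(in != NULL && out != NULL);

	for (size_t i = 0; i < n; i++)
	{
		if (in_ranges(in[i], map_to_space,
					  sizeof(map_to_space) / sizeof(map_to_space[0])))
			out[j++] = 0x0020;	/* replace with plain space */
		else if (in_ranges(in[i], map_to_nothing,
						   sizeof(map_to_nothing) / sizeof(map_to_nothing[0])))
			continue;			/* drop the character */
		else
			out[j++] = in[i];	/* keep unchanged */
	}
	return j;
}
```

The output is never longer than the input, so a caller can size the output buffer from the input length.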
Attachments