Re: CREATE TEXT SEARCH DICTIONARY segfaulting on 9.6+
From | Tomas Vondra |
---|---|
Subject | Re: CREATE TEXT SEARCH DICTIONARY segfaulting on 9.6+ |
Date | |
Msg-id | 20191013213808.wfhtvkqcvlvbdkmz@development |
In response to | CREATE TEXT SEARCH DICTIONARY segfaulting on 9.6+ (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
List | pgsql-hackers |
I spent a bit of time investigating this, and it seems the new code is
somewhat too trusting when it comes to data from the affix/dict files.
In this particular case, it boils down to this code in NISortDictionary:

    if (Conf->useFlagAliases)
    {
        for (i = 0; i < Conf->nspell; i++)
        {
            char *end;

            if (*Conf->Spell[i]->p.flag != '\0')
            {
                curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
                if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
                    ereport(ERROR,
                            (errcode(ERRCODE_CONFIG_FILE_ERROR),
                             errmsg("invalid affix alias \"%s\"",
                                    Conf->Spell[i]->p.flag)));
            }
            ...
            Conf->Spell[i]->p.d.affix = curaffix;
            ...
        }
        ...
    }

So it simply grabs whatever it finds in the dict file, parses it, and
then (later) we use it as an index into the AffixData array, even if the
value is way out of bounds.

In the failing example, hunspell_sample_long.affix contains about 10
affixes, but then we parse the hunspell_sample_num.dict file, and we
stumble upon

    book/302,301,202,303

and we parse the flags as integers, and interpret them as indexes into
the AffixData array. Clearly, 303 is way out of bounds, triggering the
segfault.

So I think we need some sort of cross-check here. We certainly need to
make NISortDictionary() check that the affix value is within AffixData
bounds, and error out when the index is nonsensical (maybe negative
and/or exceeding nAffixData).

Maybe there's a simple way to check if the affix/dict files match. The
failing affix has FLAG num while with FLAG long it works just fine. But
I'm not sure that's actually possible, because I don't see anything in
hunspell_sample_num.dict that would allow us to decide that it expects
"FLAG num" and not "FLAG long". Furthermore, we certainly can't rely on
this - we still need to check the range.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
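
For illustration only, the range check proposed above could look roughly like the following standalone sketch. This is not PostgreSQL code and not an actual patch: parse_affix_alias() and its n_affix_data parameter are hypothetical stand-ins for the parsing done in NISortDictionary() and for nAffixData, and the sketch assumes valid indexes run from 0 to n_affix_data - 1. It returns an error code instead of calling ereport().

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not PostgreSQL code: parse a decimal affix alias
 * and verify that it can be used as an index into an array holding
 * n_affix_data entries.  Returns 0 and stores the index in *out on
 * success, -1 on any error.
 *
 * Assumption: valid indexes are 0 .. n_affix_data - 1.
 */
static int
parse_affix_alias(const char *flag, int n_affix_data, int *out)
{
    char *end;
    long  val;

    errno = 0;                  /* strtol sets errno only on failure */
    val = strtol(flag, &end, 10);

    if (end == flag || errno == ERANGE)
        return -1;              /* no digits at all, or overflow */

    if (val < 0 || val >= n_affix_data)
        return -1;              /* would index outside the array */

    *out = (int) val;
    return 0;
}
```

With about 10 affixes loaded, a flag of "303" would be rejected by the second check rather than being used as an array index later on. In the real code the error path would presumably report ERRCODE_CONFIG_FILE_ERROR, as the existing "invalid affix alias" check does.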