Re: GiST support for inet datatypes

Поиск

Список

Период

Сортировка

От	Andreas Karlsson
Тема	Re: GiST support for inet datatypes
Дата	19 января 2014 г. 23:17:08
Msg-id	52DC5CF0.6070908@proxel.se обсуждение исходный текст
Ответ на	Re: GiST support for inet datatypes (Andreas Karlsson <andreas@proxel.se>)
Ответы	Re: GiST support for inet datatypes
Список	pgsql-hackers

Дерево обсуждения

Here comes part 2 of 2 of the review.

inet-gist
---------

In general the code looks good and I think your approach makes sense. 
Not an expert on GiST though so I would like a second opinion on this. 
Maybe there is some clever trick which would make the index more 
efficient, but the design I see is simple and logical. Like this much 
more than the incorrect GiST index for inet in btree_gist.

The only thing is that I think your index would do poorly on the case 
where all values share a prefix before the netmask but have different 
values after the netmask, since gist_union and gist_penalty does not 
care about the bits after the netmask. Am I correct? Have you thought 
anything about this?

To create such data:

CREATE TABLE a (ip inet);
INSERT INTO a SELECT '127.0.0.0/16'::inet + s FROM generate_series(1, 
pow(2, 16)::int) s;
CREATE INDEX aidx ON a USING gist (ip);

Saw also some minor things too.

Typo in comment, weird sentence:
 * The main question to calculate the union is that they have how many * bits in common. [...]

I do not like how you called the result key i inet_union_gist() "tmp". 
Something like "result" or "result_ip" would be better.

Typo in comment, "reverse" should be "inverse":
 * Penalty is reverse of the common IP bits of the two addresses.

Typo in comment, missing "be":
 * Values bigger than 1 are used when the common IP bits cannot * calculated.

Language can be improved in this comment. Both ways to split are by the 
union of the keys, it is more of a split by address prefix.
 * The second and the important way is to split by the union of the keys. * Union of the keys calculated same way with
theinet_gist_union function. * The first and the last biggest subnets created from the calculated * union. In this case
addressescontained by the first subnet will be put * to the left bucket, address contained by the last subnet will be
putto * the right bucket.
 

Could you not just use memcmp in inet_gist_same() instead of bitncmp() 
since it only cares about equality?

There is an extra empty line at the end of network_gist.c

inet-selfuncs
-------------

Here I have to honestly admit I am not sure if I can tell if your 
selectivity function is correct. Your explanation sounds convincing and 
the general idea of looking at the histogram is correct. I am just have 
no idea if the details are right.

DEFAULT_NETWORK_OVERLAP_SELECTIVITY should be renamed 
DEFAULT_NETWORK_OVERLAP_SEL and moved to the .c file.

Typo in comment, "constrant" -> "constant".

There is an extra empty line at the end of network_selfuncs.c

-- 
Andreas Karlsson

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: GiST support for inet datatypes