Re: Indexing a Boolean or Null column?
От | D. Dante Lorenso |
---|---|
Тема | Re: Indexing a Boolean or Null column? |
Дата | |
Msg-id | 3FF7A395.4050106@lorenso.com обсуждение исходный текст |
Ответ на | Re: Indexing a Boolean or Null column? (Christopher Kings-Lynne <chriskl@familyhealth.com.au>) |
Ответы |
Re: Indexing a Boolean or Null column?
Re: Indexing a Boolean or Null column? |
Список | pgsql-performance |
Christopher Kings-Lynne wrote: > > Will an index on the 'data_is_active' column be used or work > > as I expect? I'm assuming that I may have a million entries > > sharing the same 'data_lookup_key' and I'll be using that to > > search for the active version of the row. > An index just on a boolean column won't be 'selective enough'. > eg. The index will only be able to choose 50% of the table - > since it's faster to do a full table scan in that case, the > index won't get used. Ok, so ...evenly distributed data on small set of values forces sequential scan since that's faster. I expected that based on what I've read so far. > A multi keyed index, however will work a bit better, eg an index > over (data_lookup_key, data_is_active). > > That way, the index will first be able to find the correct > key (which is nicely selective) and then will be able to > halve the resulting ? search space to get the active ones. I'm not using the 50% TRUE / 50% FALSE model. My model will be more like only ONE value IS TRUE for 'is_active' for each 'data_lookup_key' in my table. All the rest are FALSE. In this case for 100 rows all having the same 'data_lookup_key' we are looking at a 99% FALSE / 1% TRUE model ... and what I'll be searching for is the ONE TRUE. In this case, it WILL pay off to have the index on a boolean column, yes? Will I win my debate with my collegue? ;-) I think Tom Lanes suggestion of partial indexes is what I need to look into. > BTW, you shouldn't use 'banana', 'pear', etc as the data_lookup_key, > you should make another table like this: ... And then replace the > data_lookup_key col with a column of integers that is a foreign > key to the names table - waaaaay faster to process. Gotcha, yeah, I'm targeting as close to 3NF as I get get. Was just trying to be generic for my example ... bad example, oops. Dante
В списке pgsql-performance по дате отправления: