Re: Postgres as key/value store
От | Jonathan Vanasco |
---|---|
Тема | Re: Postgres as key/value store |
Дата | |
Msg-id | 47771C82-4FC1-4FDC-834E-E6A6AE2390B2@2xlp.com обсуждение исходный текст |
Ответ на | Postgres as key/value store (snacktime <snacktime@gmail.com>) |
Список | pgsql-general |
On Sep 27, 2014, at 7:48 PM, snacktime wrote: > The schema is that a key is a string, and the value is a string or binary. I am actually storing protocol buffer messages,but the library gives me the ability to serialize to native protobuf or to json. Json is useful at times especiallyfor debugging. I don't know if this will apply to you, but i received significant speed improvements on Postgres key searches by using substringindexes. If your keys are just random hashes, this would probably work well for you. if your keys are person-readable, it's probably not going to work as well as the distribution of prefix characters will probablybe too uniform. But the general idea is twofold: 1. create an additional partial index on the key field -- CREATE INDEX _entities_id__subst_7 ON entities(substr(id,1,7)); 2. update your SELECTS to search for both the full string AND the substring - WHERE id = :id + WHERE (id = :id) AND (substr(id,1,7) = substr(:id, 1, 7)) By adding in the substring query, the planner will (usually) optimize the select by doing a first pass on the substring index. then it searches that limited set for the rest of matching criteria. on a table with 4MM+ records , introducing a substring index/query improved my searches by a few orders of magnitude. before trying this indexing strategy, we were actively looking to migrate this particular query service off of postgres --it was such a bottleneck and was not scalable. now there is no reason to leave in the foreseeable future. On Sep 27, 2014, at 8:33 PM, Gavin Flower wrote: >> This works well because keys are left prefixed with a scope, a delimiter, and then the actual key for the data. > Then I noticed that your id is actually a compound key, and probably would be better modelled as: if you're able to standardize the scope out, an index of "(scope, substring(key,1,7))" might work well. i only used 1,7 as my key arguments, because that was an optimal speed/space mix on my dataset. depending on yours, a shorteror longer index might be more appropriate
В списке pgsql-general по дате отправления: