Re: Megabytes of stats saved after every connection
От | Phil Endecott |
---|---|
Тема | Re: Megabytes of stats saved after every connection |
Дата | |
Msg-id | 42EA8049.2020904@chezphil.org обсуждение исходный текст |
Ответ на | Re: Megabytes of stats saved after every connection (Greg Stark <gsstark@mit.edu>) |
Ответы |
Re: Megabytes of stats saved after every connection
|
Список | pgsql-general |
Greg Stark wrote: > Phil Endecott wrote: >>Just to give a bit of background, in case it is useful: this is my family tree >>website, treefic.com. I have a schema for each user, each with about a dozen >>tables. In most cases the tables are small, i.e. tens of entries, but the >>users I care about are the ones with tens of thousands of people in their >>trees. The schemas are independent of each other. Example web page: > > I would strongly suggest you reconsider this design altogether. A normal (and > normalized) design would have a users table that assigns a sequential id to > each user. Then every other table would combine everybody's data but have a > user id column to indicate which user that row belonged to. > > If you don't believe there's anything wrong with your current system, consider > what it would look like to query your existing schema to find out the answer > to the question "how many users have > 1000 people in their tree". Or "how > many users have updated their tree in the last 7 days". Those aren't questions that I need to answer often. The sort of question I do need to answer is this: starting from individual X, find all the ancestors and descendants for n generations. This involves n iterations of a loop, joining the relatives found so far with the next generation. If there are p people in the tree this has something like O(n log p) complexity. On the other hand, if I stored all users' data in the same tables and I had u users, this operation would have O(n log (u*p)) complexity. My guess is that it would be about an order of magnitude slower. The individual users' sites are entirely disjoint - there are no queries that overlap them. --Phil.
В списке pgsql-general по дате отправления: