Re: Approach to Data Summary and Analysis
От | Rob Sargent |
---|---|
Тема | Re: Approach to Data Summary and Analysis |
Дата | |
Msg-id | 534D5800.1060400@gmail.com обсуждение исходный текст |
Ответ на | Re: Approach to Data Summary and Analysis (Robert DiFalco <robert.difalco@gmail.com>) |
Список | pgsql-general |
On 04/15/2014 09:53 AM, Robert DiFalco wrote: > Actually that was exactly the initial table design. There were more > fields because for my use case there were a lot more states and > certain states have additional data (for example when a call goes from > answered to connected it also gets the user_id of the person being > connected to). So that one table started getting a LOT of columns > which starting making it hard to reason about. > > The more normalized version has a couple of things going for it. > COUNT, MIN, MAX, etc are very fast because I don't have to > conditionally add null checks. Everything is inserted so for the > millions of calls that get made the normalized schema was much more > efficient for writing. It was also easier to understand. The answer > table only has calls that were answered, the error table only has > calls the resulted in an error after being connected, etc. > > I know this kind of gets into a religious area when discussing NULLs > and what level of normalization is appropriate so I don't want to > spark any of that on this thread. But only doing inserts and never > doing updates or deletes performed very well for large data sets. > > That said, I could explore a compromise between the monolithic table > approach and the completely normalized set of tables approach. Thanks > for your input! > I wonder if the "LOT of columns" are the bits that need to be parcelled off as specific to one condition of a call?
В списке pgsql-general по дате отправления: