Re: Making a tree with "millions and millions" of dynamic nodes
От | joe.celko@northface.edu (--CELKO--) |
---|---|
Тема | Re: Making a tree with "millions and millions" of dynamic nodes |
Дата | |
Msg-id | a264e7ea.0312101319.7de8adb7@posting.google.com обсуждение исходный текст |
Ответ на | Making a tree with "millions and millions" of dynamic nodes (Christian Fowler <google@NOSPAM.gravesweeper.com>) |
Список | pgsql-general |
>> The depth should not exceed 6 or 7 levels. The initial import will have about 6 million leaves, and 3 million branches. I would expect the leaves to grow significantly, in number easily tripling. << That is not too bad -- depth would worry me more than breadth >> 1. path generation speed << That is pretty easy in a nested set model: SELECT T1.node, (T1.rgt-T1.lft) AS depth FROM Tree AS T1, Tree AS T2 WHERE T2.lft BETWEEN T1.lft AND T1.rgt AND T2.node = :my_guy ORDER BY depth; The greater (T1.rgt-T1.lft) is, the closer the node is to the root You can also convert depth into a sequential number by using. SELECT T2.node, COUNT (T1.node)AS depth FROM Tree AS T1, Tree AS T2 WHERE T2.lft BETWEEN T1.lft AND T1.rgt GROUP BY T2.part; >> 2. immediate sibling speed << SELECT B.node AS boss, P.node FROM Tree AS P LEFT OUTER JOIN Tree AS B ON B.lft = (SELECT MAX(lft) FROM Tree AS S WHERE P.lft > S.lft ND P.lft <S.rgt); >> 3. children count speed << SELECT :my_guy, (rgt - lft +1)/2 AS subtree_size FROM Tree WHERE node = :my_guy; >> I have implemented a first run using Joe Celko's Nested Sets (w/ a mod to store tree level for speed). The update propigation issue is the achilles heel of this approach for me. << It is not as bad as people first think. The tree structure is in one table and the nodes are in another, so you can manipulating of two integers and one key (perhaps another integer). Since new siblings are added to the right side of the existing line of siblings under the parent, you average a bit less than half of the nodes being scanned in practice. There are some other tricks, like leaving a large gap between (lft) and (rgt) numbers so you have room to insert new nodes. >> So it seems Materialized Path is my only option, however I am concerned about LIKE performance for the right hand side of the tree, where the path is 8 digits x 6 levels = 48 chars. Should I be concerned? << It does not sound too bad, assuming an ordered index. I wonder if you could improve performance of the LIKE predicates by reversing the path string ... >> ... some reassuring words about real-time performance of a 20 million row tree, i'd love to hear << Sorry, the most I have played with is a few hundred thousand rows with the nested set model.
В списке pgsql-general по дате отправления: