Re: Deletion Challenge
От | Berend Tober |
---|---|
Тема | Re: Deletion Challenge |
Дата | |
Msg-id | 5667E296.6050005@computer.org обсуждение исходный текст |
Ответ на | Re: Deletion Challenge (Steve Crawford <scrawford@pinpointresearch.com>) |
Список | pgsql-general |
Steve Crawford wrote: > If I understand correctly the value of "click" always advances and within a "click" the > "cash_journal_id" always advances - not necessarily by single steps so within a fairian_id, ordering > by "click" plus "cash_journal_id" would return the records in order from which you want the most > recent 5 for each farian_id. > > Typing without testing and ignoring performance optimizations, something along the lines of the > following should work and covers the "last 5" issue as well. > > with stuff_to_delete as ( > select farian_id, click, cash_journal_id, > rank() over (partition by farian_id order by (click, cash_journal_id) desc) as howold) > from cash_journal) > delete from cash_journal > using stuff_to_delete > where > cash_journal.farian_id = stuff_to_delete.farian_id > and cash_journal.click = stuff_to_delete.click > and cash_journal.cash_journal_id = stuff_to_delete.cash_journal_id > and stuff_to_delete.howold > 5; > Assessing without testing, I like that. Thanks! Although the above is not the exactly the form I was using, an earlier iteration of a related problem employed window functions. But as the data set grew performance suffered, so if deletes were not done on a regular, continuing basis in order to keep the historical data set approximately "small", the process execution time using a windowing scheme eventually exceeded the extent of my patience. That "non-scalable" situation is actually what motivated the deliberate de-normalization (of retaining the "running balance" in a separate column) and the desire to delete old data. The original implementation calculated the running balance on-the-fly, employing windowing per fairian_id, and those tallies of the net balance entailed increasingly lengthy execution times as the number of rows increased, hence I was motivated to retain only a relatively constant-sized per-farian history, and I dismissed the use of windowing for the delete problem since it was so problematic for the running-balance-without-delete problem. Thanks for knocking some sense into me!
В списке pgsql-general по дате отправления: