pg_bulkload ON_DUPLICATE_MERGE
От | Benjamin Johnson |
---|---|
Тема | pg_bulkload ON_DUPLICATE_MERGE |
Дата | |
Msg-id | 4F205CBE.4000307@getcarbonblack.com обсуждение исходный текст |
Ответы |
Re: pg_bulkload ON_DUPLICATE_MERGE
|
Список | pgsql-hackers |
PG Gurus, I have a table like this: CREATE TABLE filemods ( guid BIGINT NOT NULL UNIQUE, filepath_guid BIGINT NOT NULL, createtime TIMESTAMP WITH TIME ZONE DEFAULT NULL, writetime TIMESTAMP WITH TIME ZONE DEFAULT NULL, deletetime TIMESTAMP WITH TIME ZONE DEFAULT NULL, ); One "event" might have (1, '2012-01-25 11:00:00', NULL, NULL) and another event might have (1, NULL, '2012-01-25 11:05:00', NULL) and the end result should be: (1, '2012-01-25 11:00:00', '2012-01-25 11:05:00', NULL). I'm trying to modify pg_bulkload to "merge" two rows together. The changes I have made seem to be working, although I would like input on what I am doing that is unsafe or terribly wrong. You can be brutal. I've seen incredible write speed with using pg_bulkload. If I can just get it to "consolidate" our rows based on the unique key it will remove a lot of complexity in our software. Also, I'm not entirely sure this mailing list is the correct one, but with all the internals you all know, I'm hoping you can help point out nasty flaws in my algorithm. This is the first legitimate attempt I have made at modifying PG source, so I'm not real familiar with the proper way of loading pages and tuples and updating heaps and all that pg core stuff. Here's the modifications to pg_btree.c (from pg_bulkload HEAD): http://pastebin.com/U23CapvR I also attached the patch. Thank you!! Ben -- Benjamin Johnson http://getcarbonblack.com/ | @getcarbonblack cell: 312.933.3612
Вложения
В списке pgsql-hackers по дате отправления: