Re: Incorrect number of rows inserted into partitioned table
От | Tom Lane |
---|---|
Тема | Re: Incorrect number of rows inserted into partitioned table |
Дата | |
Msg-id | 23169.1681164279@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Incorrect number of rows inserted into partitioned table (Роман Осипов <osipovromanvladimirovich@gmail.com>) |
Список | pgsql-bugs |
=?UTF-8?B?0KDQvtC80LDQvSDQntGB0LjQv9C+0LI=?= <osipovromanvladimirovich@gmail.com> writes: > There is an insertion not of the amount specified in *limit*,, but a little > more or less than it. I think you're getting burnt by the fact that a rule is a macro, combined with the fact that your query is underspecified: insert into book_inherit_test select * from book_test limit 1000000; This doesn't constrain *which* 1000000 rows of book_test get inserted. If we EXPLAIN it we get: =# explain insert into book_inherit_test select * from book_test limit 1000000; QUERY PLAN ------------------------------------------------------------------------------------------------------ Insert on book_inherit_test (cost=0.00..32353.00 rows=0 width=0) -> Subquery Scan on "*SELECT*" (cost=0.00..32353.00 rows=617176 width=31) Filter: ((("*SELECT*".releaseyear >= 1900) AND ("*SELECT*".releaseyear < 1951)) IS NOT TRUE) -> Limit (cost=0.00..17353.00 rows=1000000 width=31) -> Seq Scan on book_test (cost=0.00..34706.00 rows=2000000 width=31) Insert on book_inherit_test_before_1950 (cost=0.00..32353.00 rows=0 width=0) -> Subquery Scan on "*SELECT*" (cost=0.00..32353.00 rows=382824 width=31) Filter: (("*SELECT*".releaseyear >= 1900) AND ("*SELECT*".releaseyear < 1951)) -> Limit (cost=0.00..17353.00 rows=1000000 width=31) -> Seq Scan on book_test (cost=0.00..34706.00 rows=2000000 width=31) (11 rows) from which we can see that the insertions into book_inherit_test and those into book_inherit_test_before_1950 are driven off completely separate scans of book_test. And that table is big enough that the synchronize_seqscans feature kicks in, meaning that indeed this will scan two different million-row subsets of book_test, producing results fundamentally unlike what you expected. I get stable results after disabling synchronize_seqscans, but a more correct way would be to add ORDER BY to fully determine which rows of book_test are considered. Advice: don't use a RULE for this sort of thing. If you really want to, you can get the same effects more predictably with a trigger. But as David said, the whole thing looks like a poor reimplementation of partitioning. regards, tom lane
В списке pgsql-bugs по дате отправления: