Re: performance of IN (subquery)
От | Greg Stark |
---|---|
Тема | Re: performance of IN (subquery) |
Дата | |
Msg-id | 87r7psmlnp.fsf@stark.xeocode.com обсуждение исходный текст |
Ответ на | performance of IN (subquery) (Kevin Murphy <murphy@genome.chop.edu>) |
Ответы |
Re: performance of IN (subquery)
|
Список | pgsql-general |
Tom Lane <tgl@sss.pgh.pa.us> writes: > Greg Stark <gsstark@mit.edu> writes: > > I'm not about to run analyze in the middle of the data generation > > (which wouldn't work anyways since it's in a transaction). > > Since 7.3 or 7.4, you *can* run ANALYZE in the middle of a transaction. > The cached-plan business is a problem, I agree, but I think it's > orthogonal to this particular discussion (and you can always use EXECUTE > if you have to). It's orthogonal. My point was that I have a bigger problem, but even if I address it by switching away from plpgsql, or I guess by using EXECUTE, I would still have a problem. I didn't realize you could run analyze in a transaction, but even being able to I wouldn't really want to have to do that repeatedly during the job. This new approach would actually complete the fix, a perl or plpgsql EXECUTE implementation would gradually shift statistics during the job. Except that the first thing the job does is delete all the old records. This is inside a transaction. So an estimate based on the heap size would be off by a factor of two by the time the job is done. > but separate ANALYZE could definitely make an estimate of the fraction of > dead tuples. With analyze in a transaction I'm not clear what the semantics should be though. I suppose it should only count tuples visible to the transaction analyze? > Nope, you aren't. The above seems to me to be a recipe for degradation > of performance over time, precisely because the plans wouldn't change in > the face of changes in the situation. A gradual degradation is ok. A gradual degradation means I can schedule a nightly analyze and report on any changed plans and either automatically accept them or manually approve them individually. A sudden degradation is much more dangerous. Even if it's rare, a sudden degradation means an outage in prime time. As I said, it doesn't matter to me if every query is 10% slower than possible, as long as no query takes 1000% as long as necessary even if it's a 1 in 1000 occurrence. > I've resisted adding "planner hints" to the language for this reason, and > I'm certainly not eager to offer any hard guarantees about plans not > changing. I just want to control _when_ they change. Eventually you'll come around. I think it'll be a slow gradual change in thinking as the user-base changes though. Not something I'll change with a single argument in one day. -- greg
В списке pgsql-general по дате отправления: