Re: EXTERNAL storage and substring on long strings
От | Tom Lane |
---|---|
Тема | Re: EXTERNAL storage and substring on long strings |
Дата | |
Msg-id | 18305.1059680697@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | EXTERNAL storage and substring on long strings (Scott Cain <cain@cshl.org>) |
Список | pgsql-performance |
Scott Cain <cain@cshl.org> writes: > explain analyze select substring(residues from 1000000 for 20000) > from feature where feature_id=1; > where feature is a table with ~3 million rows, and residues is a text > column, where for the majority of the rows of feature, it is null, for a > large minority, it is shortish strings (a few thousand characters), and > for 6 rows, residues contains very long strings (~20 million characters > (it's chromosome DNA sequence from fruit flies)). I think the reason uncompressed storage loses here is that the runtime is dominated by the shortish strings, and you have to do more I/O to get at those if they're uncompressed, negating any advantage from not having to fetch all of the longish strings. Or it could be that there's a bug preventing John Gray's substring-slice optimization from getting used. The only good way to tell that I can think of is to rebuild PG with profiling enabled and try to profile the execution both ways. Are you up for that? (BTW, if you are using a multibyte database encoding, then that's your problem right there --- the optimization is practically useless unless character and byte indexes are the same.) regards, tom lane
В списке pgsql-performance по дате отправления: