Header unfolding in archived mail
From | Noah Misch |
---|---|
Subject | Header unfolding in archived mail |
Date | |
Msg-id | 20130907220745.GA188338@tornado.leadboat.com |
Responses | Re: Header unfolding in archived mail |
List | pgsql-www |
The mailing list web archives display the subject of message 20130603190727.GA360354@tornado.leadboat.com as follows:

    Partitioning performance: cache stringToNode() ofpg_constraint.ccbin

Note the lack of whitespace after "of". The original message, which you can see by downloading the mbox for June 2013, conveyed the subject this way:

    Subject: Partitioning performance: cache stringToNode() of
        pg_constraint.ccbin

Per RFC 5322, section 2.2.3:

    The process of moving from this folded multiple-line representation of a header field to its single line representation is called "unfolding". Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP. Each header field should be treated in its unfolded form for further syntactic and semantic evaluation. An unfolded header field has no length restriction and therefore may be indeterminately long.

So, the archives should present the subject like this:

    Partitioning performance: cache stringToNode() of pg_constraint.ccbin

Gmane and osdir.com do so. MARC and Gmail show a space in place of the tab, but Gmail converts every subject-line tab to a space.

I have attached a patch, against pgarchives.git, making its unfolding code conform to RFC 5322. The change also affects headers folded before a space rather than before a tab, such as 50E31370.5030405@cybertec.at. Those have been displaying fine despite the lack of unfolding because newline-space renders like a space in HTML.

I unit-tested the change, but I did not test the full archives load.

The "raw" message display feature seems to have its own set of rules, and I failed to find their implementation.
Here are the subject lines for the aforementioned messages according to "raw" display:

    Subject: Partitioning performance: cache stringToNode() of pg_constraint.ccbin

    Subject: Review of "pg_basebackup and pg_receivexlog to use non-blocking socket communication", was: Re: Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

In one case, "\n\t" from the true raw original (in the mbox file) became " ". In the other case, two instances of "\n " became "\n\t". Any ideas where that transformation is coming from?

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com
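
For reference, the unfolding rule quoted from RFC 5322 can be sketched in a few lines of Python. This is only an illustration, not the code in pgarchives.git or the attached patch. The names `unfold` and `lossy_unfold` are made up here, and accepting a bare LF in addition to CRLF is an assumption based on how mbox files are commonly stored. `lossy_unfold` is a guess at the kind of behavior that would produce the symptom described above, not a claim about what the archive code actually does.

```python
import re

# A line break (CRLF, or a bare LF as an assumption for mbox input) that is
# immediately followed by a space or tab. The lookahead leaves the whitespace
# character in place, so only the line break itself is removed.
_FOLD = re.compile(r"\r?\n(?=[ \t])")

# Hypothetical lossy variant: consumes the whitespace along with the line break.
_LOSSY_FOLD = re.compile(r"\r?\n[ \t]")


def unfold(value: str) -> str:
    """Unfold a header value per RFC 5322, section 2.2.3."""
    return _FOLD.sub("", value)


def lossy_unfold(value: str) -> str:
    """Illustrative only: drops the whitespace too, joining adjacent words."""
    return _LOSSY_FOLD.sub("", value)


# The June 2013 subject, folded before a tab as in the mbox file.
folded = "Partitioning performance: cache stringToNode() of\n\tpg_constraint.ccbin"

print(repr(unfold(folded)))        # the tab after "of" survives
print(repr(lossy_unfold(folded)))  # "of" and "pg_constraint.ccbin" run together
```

Under this rule a header folded before a space keeps its space and one folded before a tab keeps its tab; whether to then display the tab as a space is a separate presentation decision.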