Re: Website mailing list search enhancement idea - one result per thread
От | Masahiro Ikeda |
---|---|
Тема | Re: Website mailing list search enhancement idea - one result per thread |
Дата | |
Msg-id | 55df24092b8c1ffa44f382b6f3a681df@oss.nttdata.com обсуждение исходный текст |
Ответ на | Re: Website mailing list search enhancement idea - one result per thread (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Website mailing list search enhancement idea - one result per thread
(Masahiro Ikeda <ikedamsh@oss.nttdata.com>)
|
Список | pgsql-www |
"David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes: > When searching the mailing lists in advanced search mode we have list > filtering, timeframe filtering (pretty coarse though) and then sorting > mode. For moderately long running threads a search term is going to > return > many messages from the same thread. It would desirable to ask that > only > threads be shown and maybe how many individual messages in that thread > matched. +1. I always think so. On 2020-05-01 01:55, Tom Lane wrote: > Magnus Hagander <magnus@hagander.net> writes: >> On Thu, Apr 30, 2020 at 2:17 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> "David G. Johnston" <david.g.johnston@gmail.com> writes: >>>> It would desirable to ask that only >>>> threads be shown and maybe how many individual messages in that >>>> thread >>>> matched. > >>> +1, an option to do that would be handy. No idea how hard it is ... > >> Not having actually looked at it, probably not too hard. The hardest >> part >> might be bikeshedding about the color (aka user interface). > >> Should this replace the current interface, or do we want to make >> something >> like this a choice? > > No no, not replace. It should be an option, "aggregate into threads" > or something like that. Why don't you implement the following feature? (1) add an option on the advanced search page. I agree with the idea to add the option to aggregate into threads on the advanced search page. I attached the image("add_option_to_aggregate_into_threads.png"). What do you think? (2) change the message's URLs on the result pages if the option is checked. I think it's better to change the message's URLs to flatted ones on result pages if the messages are aggregated into threads because users want to access the discussion pages not per messages but per threads directly. ``` --- a/templates/search/listsearch.html +++ b/templates/search/listsearch.html @@ -53,7 +59,12 @@ <h2>Results {{firsthit}}-{{lasthit}} of {%if hitcount == 1000%}more than 1000{%else%}{{hitcount}}{%endif%}.</h2> {%if pagelinks %}Result pages: {{pagelinks|safe}}<br/><br/>{%endif%} {%for hit in hits %} - {{forloop.counter0|add:firsthit}}. <a href="https://www.postgresql.org/message-id/{{hit.messageid}}">{{hit.subject}}</a> [{{hit.rank|floatformat:2}}]<br/> + {{forloop.counter0|add:firsthit}}. + {%if aggregate%} + <a href="https://www.postgresql.org/message-id/flat/{{hit.messageid}}">{{hit.subject}}</a> [{{hit.rank|floatformat:2}}]<br/> + {%else%} + <a href="https://www.postgresql.org/message-id/{{hit.messageid}}">{{hit.subject}}</a> [{{hit.rank|floatformat:2}}]<br/> + {%endif%} From {{hit.author}} on {{hit.date}}.<br/> {{hit.abstract|safe}}<br/> <a href="https://www.postgresql.org/message-id/{{hit.messageid}}">https://www.postgresql.org/message-id/{{hit.messageid}}</a><br/> ``` If someone already works for this, it's ok to ignore the following. Although I'm not familiar with Django, I made the patches for pgarchives and pgweb. - pgweb: "0001-wip-add-an-search-option-to-aggregate-into-threads.patch" This patch changes the user interface the above (1) and (2). - pgarchives: "0001-wip-add-an-search-parameter-to-aggregate-into-thread.patch" This patch changes the search query to the Postgresql. The example query is the following. To aggregate into threads, it uses the window function to extract the highest-ranked messages in the same thread group. ``` SELECT messageid, date, subject, _from, ts_rank_cd(fti, plainto_tsquery('public.pg', 'wal writer')), ts_headline(bodytxt, plainto_tsquery('public.pg', 'wal writer'),'StartSel="[[[[[[",StopSel="]]]]]]"') FROM (SELECT messageid, date, subject, _from, fti, bodytxt, -- add a sub query RANK() OVER (PARTITION BY threadid ORDER BY ts_rank_cd(fti, plainto_tsquery('wal writer')) DESC, id) AS rank -- this line is main change FROM messages m WHERE fti @@ plainto_tsquery('public.pg', 'wal writer') AND EXISTS (SELECT 1 FROM list_threads lt WHERE lt.threadid=m.threadid AND lt.listid=ANY(ARRAY[2])) AND m.date > '2020-02-07T08:44:56.738979'::timestamp ) m1 WHERE m1.rank = 1 ORDER BY ts_rank_cd(fti, plainto_tsquery('wal writer')) DESC LIMIT 1000; ``` If you have any good ideas, please let me know. Regards, -- Masahiro Ikeda NTT DATA CORPORATION
Вложения
В списке pgsql-www по дате отправления: