On Sat, Apr 27, 2024 at 07:19:21AM GMT, Eric Wong wrote:
Correct, public-inbox currently won't index every header due to cost, false positives, and otherwise lack of usefulness (general gibberish from DKIM sigs, various UUIDs, etc).
So it doesn't currently know about "X-stable:"
I started working on making headers indexing configurable last year, but didn't hear a response from the person that potentially was interested:
https://public-inbox.org/meta/20231120032132.M610564@dcvr/
Right now, indexing new headers + validations can be maintained as a Perl module in the public-inbox codebase.
For lore, it'd make sense to be able to configure a bunch of (or all) inboxes at once instead of the per-inbox configuration in my proposed RFC.
At minimum, one would have to know:
- the mail header name (e.g. `X-stable')
- the search prefix to use (e.g. `xstable:') # can't use dash `-' AFAIK
- the type of header value (phrase, string, sortable numeric, etc...)
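To make those three pieces concrete, here is a minimal Python sketch of what one such mapping entry could look like. The `HeaderIndexRule` class, its field names, the `default_prefix` helper, and the set of allowed types are all hypothetical and are not public-inbox's actual configuration format. The lowercase-alphanumeric prefix check is an assumption that is stricter than the "can't use dash" note above.

```python
from dataclasses import dataclass

# Hypothetical value types, loosely following the list above
# (not a real public-inbox enumeration).
ALLOWED_TYPES = {"phrase", "string", "numeric"}


@dataclass(frozen=True)
class HeaderIndexRule:
    """One illustrative header-to-prefix mapping (hypothetical structure)."""
    header: str  # mail header name, e.g. "X-stable"
    prefix: str  # search prefix without the trailing colon, e.g. "xstable"
    kind: str    # how the header value would be indexed

    def __post_init__(self):
        # Assumption: prefixes are lowercase alphanumerics, which also
        # rules out the dash mentioned in the thread.
        if not (self.prefix.isalnum() and self.prefix.islower()):
            raise ValueError(f"bad search prefix: {self.prefix!r}")
        if self.kind not in ALLOWED_TYPES:
            raise ValueError(f"unknown value type: {self.kind!r}")


def default_prefix(header: str) -> str:
    """Derive a prefix by dropping dashes and lowercasing (one possible convention)."""
    return header.replace("-", "").lower()


rule = HeaderIndexRule("X-stable", default_prefix("X-stable"), "string")
```

A lore-wide setup could then be a plain list of such rules applied to every inbox, rather than one block per inbox.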
I'm whole-heartedly for this! This ties nicely to my b4 work where I'd like to be able to identify code-review trailers sent for a specific patch, even if that patch itself is not on lore. For example, this could be a patch that is part of a pull-request on a git forge, but we'd still like to be able to collect and find code-review trailers for it when a maintainer applies it.
Currently, I am using the following approach:
| Reviewed-by: Some Developer <some.dev@example.org>
| ---
| for-patch-id: abcd...1234
Then I can query 'nq:"for-patch-id: abcd...1234"', but this is probably much heavier than if I could provide this in a custom header:
| X-For-Patch-ID: abcd...1234
and query for "xforpatchid:abcd...1234"
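For comparison, here is a rough Python sketch of the two query shapes side by side, plus reading the proposed header back out of a message with the standard `email` module. The `body_query` and `header_query` helpers and the sample message are made up for illustration, the `xforpatchid:` prefix only exists if the indexer is configured for it, and the patch-id stays the elided placeholder from the example.

```python
from email import message_from_string

# Elided placeholder, kept exactly as in the example above.
PATCH_ID = "abcd...1234"


def body_query(patch_id: str) -> str:
    """Current approach: phrase search over the non-quoted body text."""
    return f'nq:"for-patch-id: {patch_id}"'


def header_query(patch_id: str) -> str:
    """Proposed approach: exact match on a dedicated (hypothetical) prefix."""
    return f"xforpatchid:{patch_id}"


# A made-up review message carrying the proposed custom header.
raw = (
    "From: Some Developer <some.dev@example.org>\n"
    "Subject: Re: [PATCH] example\n"
    f"X-For-Patch-ID: {PATCH_ID}\n"
    "\n"
    "Reviewed-by: Some Developer <some.dev@example.org>\n"
)
msg = message_from_string(raw)

# Header lookup is case-insensitive in the email module.
header_value = msg["X-For-Patch-ID"]

print(body_query(PATCH_ID))
print(header_query(header_value))
```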
I'm trying to avoid supporting sortable numeric values for this, since supporting them will cause problems if columns get repurposed with admins changing their minds. A full reindex would fix it, but those are crazy expensive.
I'm perfectly fine with it only being a string, honestly.
So probably just supporting strings and/or phrases to start...
Validation to prevent poisoning by malicious/broken senders can be useful in some cases (and the reason the RFC was a per use case Perl module). That said, I'm not sure if much validation is necessary for X-stable: headers or if just any text is fine.
I'd let the consumer clients worry about it.
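As a rough idea of what that could look like on the client side, here is a small Python sketch that filters header values after the search returns. It assumes the value is a 40-character hex patch-id of the kind `git patch-id` prints; the `usable_patch_ids` helper is hypothetical and not part of b4 or public-inbox.

```python
import re

# Assumption: a patch-id is 40 hex characters, as printed by `git patch-id`.
PATCH_ID_RE = re.compile(r"[0-9a-f]{40}")


def usable_patch_ids(header_values):
    """Keep only values that look like a patch-id; silently drop the rest."""
    good = []
    for value in header_values:
        candidate = value.strip().lower()
        if PATCH_ID_RE.fullmatch(candidate):
            good.append(candidate)
    return good
```

Anything a broken or malicious sender puts in the header simply fails the match and is ignored, so the indexer can treat the value as an opaque string.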
-K