This is going the wrong direction. If privacy from admins is important (I argue that it’s not for LW messages, but that’s a separate discussion), then breaches of privacy should be exceptions for specific purposes, not allowed unless “really secret contents”.
Don’t make this filter-in for privacy. Make it filter-out: if a message is detected as likely spam, THEN take more intrusive measures. Privacy-preserving measures include quarantining, asking a few recipients whether they consider it harmful before delivering (or not delivering) it to the rest, automated content filters, etc. This infrastructure requires a fair bit of data-handling work to get right, plus a mitigation process where a sender can find out they’re blocked and explicitly ask the moderator(s) to allow the message.
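Concretely, the flow I have in mind is something like the rough sketch below. All the names, the keyword-based score, and the thresholds are made up for illustration; a real filter would need the data-handling work mentioned above.

```python
# Rough sketch of a filter-out moderation flow (hypothetical names/thresholds).
from dataclasses import dataclass, field

@dataclass
class Message:
    sender: str
    recipients: list[str]
    body: str

@dataclass
class Moderation:
    quarantined: list[Message] = field(default_factory=list)
    blocked_senders: set[str] = field(default_factory=set)

    def spam_score(self, msg: Message) -> float:
        # Placeholder for an automated content filter; a real one would be
        # tuned on actual spam reports rather than a keyword list.
        suspicious = ["crypto giveaway", "hot singles"]
        return sum(kw in msg.body.lower() for kw in suspicious) / len(suspicious)

    def handle(self, msg: Message) -> str:
        # Filter-out: most messages are delivered with no human inspection.
        if self.spam_score(msg) < 0.5:
            return "delivered"
        # Only likely-spam gets the intrusive treatment: quarantine it and ask
        # a few recipients whether it's harmful before delivering to the rest.
        self.quarantined.append(msg)
        self.blocked_senders.add(msg.sender)
        sample = msg.recipients[:3]
        return f"quarantined; asking {sample} before delivering to the rest"

    def appeal(self, sender: str) -> str:
        # Mitigation: a blocked sender can find out they're blocked and
        # explicitly ask the moderator(s) to allow their messages.
        if sender in self.blocked_senders:
            return "appeal forwarded to moderators"
        return "sender is not blocked"
```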
The reason I suggest making it filter-in is that it seems easier to build a filter that accurately detects most sensitive stuff than one that accurately detects spam, because “spam” is kind of open-ended. Or I guess in practice spam tends to be porn bots and crypto scams? (Even on LessWrong?!) But truly sensitive talk seems disproportionately likely to involve cryptography and/or sexuality, so trying to filter for porn bots and crypto scams seems relatively likely to expose sensitive stuff.
The filter-in vs. filter-out distinction in my proposal is not so much about the degree of visibility. For example, you could guard my filter-out proposal with the other filter-in proposals, such as only showing metadata and only inspecting suspected spammers, rather than making it available to everyone.
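To make that layering concrete, a tiny sketch (again purely hypothetical names, nothing from the actual LW codebase): admins see only metadata unless the spam filter has already flagged the message.

```python
# Hypothetical: admins get metadata by default; contents only when flagged.
def admin_view(msg: dict, spam_score: float, threshold: float = 0.5) -> dict:
    view = {
        "sender": msg["sender"],                 # metadata is always visible
        "recipient_count": len(msg["recipients"]),
        "sent_at": msg["sent_at"],
    }
    if spam_score >= threshold:
        # Contents are exposed only once the filter-out step flags the message.
        view["body"] = msg["body"]
    return view

example = {"sender": "alice", "recipients": ["bob"], "sent_at": "2024-01-01", "body": "hi"}
print(admin_view(example, spam_score=0.1))  # metadata only
print(admin_view(example, spam_score=0.9))  # contents exposed for review
```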