Future outputs will at the very least include an accompanying paper-overview-in-a-post, and in general a stronger focus on self-contained papers. I see the booklet as a preliminary, highly exploratory piece of work that focused more on the conceptual and theoretical than on the applied, a purpose for which I think it was well suited (e.g. introducing an epistemological theory with direct applications to alignment).
Sounds good. I enjoyed at least 50% of the time I spent reading the epistemology :P I just wanted a go-to resource for specific technical questions.
Could I send you a DM with it?
Sure, but no promises on interesting feedback.
The connection between winning an argument and finding the truth continues to seem plenty breakable, both in humans and in AIs.
Is it obfuscated arguments and deception that make it seem so to you, or some other fundamental issue?
Deception’s not quite the right concept. It’s more like exploitation of biases and other weaknesses. This can look like deception, or it can look like incentivizing an AI to “honestly” search for arguments in a way that just so happens to be shaped by standards of the argument-evaluation process other than truth.