Because… what could you possibly be grounding this all out in, other than consequences?
In my mind, it stands as an open problem whether you can “usually” expect an intelligent system to remain “agent-like in design” under powerful self-modification. By “agent-like in design” I mean having subcomponents which transparently contribute to the overall agentiness, such as true beliefs, coherent goal systems, etc.
The argument in favor is: it becomes really difficult to self-optimize as your own mind-design becomes less modular. Eventually you're just a massive policy, with each part fine-tuned to best shape the future (a future you had some model of at some point in the past), and at that point you have to give up general-purpose learning. Therefore, agents facing complicated environments and long time horizons will stay modular.
The argument against is: it just isn't very probable that the nice clean design is optimal. Even if there's only a small incentive to do weird, screwy things with your head (i.e., a small chance you encounter Newcomblike problems where Omega cares about aspects of your ritual of cognition rather than just its output), the agent will follow that incentive where it leads. Plus, general self-optimization can lead to weird, non-modular designs. Why shouldn't it?
So, in my mind, it stands as an open problem whether purely consequentialist arguments tend to favor a separate epistemic module “in the long term”.
Therefore, I don’t think we can always ground pure epistemic talk in consequences. At least, not without further work.
However, I do think it's a coherent flag to rally around, and an important goal in the short term; I think it's particularly important for a large number of agents trying to coordinate, and it's also possible that it's something approaching a terminal goal for humans (i.e., curiosity wants to be satisfied by truth).
So I do want to defend pure epistemics as its own goal which doesn’t continuously answer to broad consequentialism. I perceive some reactions to Zack’s post as isolated demands for rigor, invoking the entire justificatory chain to consequentialism when it would not be similarly invoked for a post about, say, p-values.
(A post about p-values vs Bayesian hypothesis testing might give rise to discussions of consequences, but not to questions of whether the whole argument about Bayes vs p-values makes sense because "isn't epistemics ultimately consequentialist anyway", or similar.)
I would agree with the claim "if you're constantly checking 'hey, in this particular instance, maybe it's net positive to lie?' you end up lying all the time, and end up in a world where people can't trust each other", so it's worth treating appeals to consequences as forbidden as part of a Rule Consequentialism framework. But, why not just say that?
I would respond:
Partly for the same reason that a post on Bayes’ Law vs p-values wouldn’t usually bother to say that; it’s at least one meta level up from the chief concerns. Granted, unlike a hypothetical post about p-values, Zack’s post was about the appeal-to-consequences argument from its inception, since it responds to an inappropriate appeal to consequences. However, Zack’s primary argument is on the object level, pointing out that how you define words is of epistemic import, and therefore cannot be chosen freely without making epistemic compromises.
TAG and perhaps other critics of this post are not conceding that much; so the point you make doesn't seem sufficient to address the meta-level questions being raised.
I would concede that there is, perhaps, something funny about the way I’ve been responding to the discussion—I have a sense that I might be doing some motte/bailey thing around (motte:) this is an isolated demand for rigor, and we should be able to talk about pure epistemics as a goal without explicitly qualifying everything with “if you’re after pure epistemics”; vs (bailey:) we should pursue pure epistemics. In writing comments here, I’ve attempted to carefully argue the two separately. However, I perceive TAG as not having received these as separate arguments. And it is quite possible I’ve blurred the lines at times. They are pretty relevant to each other.