I too thought the one cruise I’ve been on was a pretty good type of holiday! A giant moving building full of nice things is so much more convenient a vehicle than the usual series of planes and cabs and subways and hauling bags along the road and stationary buildings etc.
KatjaGrace
Typology of blog posts that don’t always add anything clear and insightful
Do incoherent entities have stronger reason to become more coherent than less?
Holidaying and purpose
I wrote an AI Impacts page summary of the situation as I understand it. If anyone feels like looking, I’m interested in corrections/suggestions (either here or in the AI Impacts feedback box).
A few quick thoughts on reasons for confusion:
I think maybe one thing going on is that I already took the coherence arguments to apply only in getting you from weakly having goals to strongly having goals, so since you were arguing against their applicability, I thought you were talking about the step from weaker to stronger goal direction. (I’m not sure what arguments people use to get from 1 to 2 though, so maybe you are right that it is also something to do with coherence, at least implicitly.)
It also seems natural to think of ‘weakly has goals’ as something other than ‘goal directed’, and ‘goal directed’ as referring only to ‘strongly has goals’, so that ‘coherence arguments do not imply goal directed behavior’ (in combination with expecting coherence arguments to be in the weak->strong part of the argument) sounds like ‘coherence arguments do not get you from ‘weakly has goals’ to ‘strongly has goals’’.
I also think separating out the step from no goal direction to weak, and the step from weak to strong, might help with clarity. It sounded to me like you were considering an argument from ‘any kind of agent’ to ‘strongly goal directed’ and finding it lacking, and I was like ‘but any kind of agent includes a mix of those that this force will work on, and those it won’t, so shouldn’t it be a partial/probabilistic move toward goal direction?’ Whereas you were just meaning to talk about what fraction of existing things are weakly goal directed.
Thanks. Let me check if I understand you correctly:
You think I take the original argument to be arguing from ‘has goals’ to ‘has goals’, essentially, and agree that that holds, but don’t find it very interesting/relevant.
What you disagree with is an argument from ‘anything smart’ to ‘has goals’, which seems to be what is needed for the AI risk argument to apply to any superintelligent agent.
Is that right?
If so, I think it’s helpful to distinguish between ‘weakly has goals’ and ‘strongly has goals’:
Weakly has goals: ‘has some sort of drive toward something, at least sometimes’ (e.g. aspects of outcomes are taken into account in decisions in some way)
Strongly has goals: ‘pursues outcomes consistently and effectively’ (i.e. decisions maximize expected utility)
So that the full argument I currently take you to be responding to is closer to:
1. By hypothesis, we will have superintelligent machines
2. They will weakly have goals (for various reasons, e.g. they will do something, and maybe that means ‘weakly having goals’ in the relevant way? Probably other arguments go in here.)
3. Anything that weakly has goals has reason to reform to become an EU maximizer, i.e. to strongly have goals
4. Therefore we will have superintelligent machines that strongly have goals
In that case, my current understanding is that you are disagreeing with 2, and that you agree that if 2 holds in some case, then the argument goes through. That is, creatures that are weakly goal directed are liable to become strongly goal directed. (e.g. an agent that twitches because it has various flickering and potentially conflicting urges toward different outcomes is liable to become an agent that more systematically seeks to bring about some such outcomes) Does that sound right?
If so, I think we agree. (In my intuition I characterize the situation as ‘there is roughly a gradient of goal directedness, and a force pulling less goal directed things into being more goal directed’. This force probably doesn’t exist out at the zero goal directedness edges, but it is unclear how strong it is in the rest of the space, i.e. whether it becomes substantial as soon as you move out from zero goal directedness, or is weak until you are in a few specific places right next to ‘maximally goal directed’.)
Coherence arguments imply a force for goal-directed behavior
Good points. Though I claim that I do hold the same facial expression for long periods sometimes, if that’s what you mean by ‘not moving’. In particular, sometimes it is very hard for me not to screw up my face in a kind of disgusted frown, especially if it is morning. And sometimes I grin for so long that my face hurts, and I still can’t stop.
Animal faces
Quarantine variety
Why does Applied Divinity Studies think EA hasn’t grown since 2015?
It doesn’t seem that hard to wash your hands after putting away groceries, say. If I recall, I was not imagining getting many touches during such a trip. I’m mostly imagining that you put many of the groceries you purchase in your fridge or eat them within a couple of days, such that they are still fairly contaminated if they started out contaminated, and it is harder to not touch your face whenever you are eating recently acquired or cold food.
Sleep math: red clay blue clay
Arrow grid game
Yes—I like ‘application’ over ‘potentially useful product’ and ‘my more refined writing skills’ over ‘my more honed writing’, in its first one, for instance.
Fwiw I’m not aware of using or understanding ‘outside view’ to mean something other than basically reference class forecasting (or trend extrapolation, which I’d say is the same). In your initial example, it seems like the other person is using it fine—yes, if you had more examples of an AGI takeoff, you could do better reference class forecasting, but their point is that in the absence of any examples of the specific thing, you also lack other non-reference-class-forecasting methods (e.g. a model), and you lack them even more than you lack relevant reference classes. They might be wrong, but it seems like a valid use. I assume you’re right that some people do use the term for other stuff, because they say so in the comments, but is it actually that common?
I don’t follow your critique of doing an intuitively-weighted average of outside view and some inside view. In particular, you say ‘This is not Tetlock’s advice, nor is it the lesson from the forecasting tournaments...’. But in the blog post section that you point to, you say ‘Tetlock’s advice is to start with the outside view, and then adjust using the inside view.’, which sounds like he is endorsing something very similar, or a superset of the thing you’re citing him as disagreeing with?