Cool! I like this as an example of how difficult-to-notice problems are generally left unsolved. I’m not sure how serious this one is, though.
And it feels like becoming a winner means consistently winning.
Reminds me strongly of the difficulty of accepting commitment strategies in decision theory, as in Parfit’s Hitchhiker: one gets the impression that a win-oriented rational agent should win in every situation (being greedy); but in reality, that is not always what winning looks like (optimal policy rather than optimal actions). For it conjures obstacles that were never there.
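To make the policy-vs-action point concrete, here is a toy sketch of Parfit’s Hitchhiker with a perfect predictor (Python; the utility numbers are my own illustration, not from the post):

```python
# Toy Parfit's Hitchhiker (illustrative numbers): a perfect predictor gives
# the ride only if it predicts the agent will pay once in town.
UTILITY_ALIVE = 1000   # value of reaching town alive
COST_OF_PAYING = 100   # price the driver demands once in town
UTILITY_DEAD = 0       # value of being left in the desert

def policy_value(pays_in_town: bool) -> int:
    """Total utility of a *policy*, assuming the predictor is accurate."""
    if pays_in_town:          # predictor foresees payment -> ride happens
        return UTILITY_ALIVE - COST_OF_PAYING
    return UTILITY_DEAD       # predictor foresees refusal -> no ride

# Greedy, action-by-action reasoning refuses to pay once in town (paying looks
# like a pure loss at that node), so the greedy policy scores 0, while the
# committed policy scores 900: the optimal policy is not made of locally
# optimal actions.
print(policy_value(pays_in_town=True))    # 900
print(policy_value(pays_in_town=False))   # 0
```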
Let’s try to apply this to a more confused topic. Risky. Recently I’ve slightly updated away from the mesa paradigm, after reading the following in Reward is not the optimization target:
Stop worrying about finding “outer objectives” which are safe to maximize.[9] I think that you’re not going to get an outer-objective-maximizer (i.e. an agent which maximizes the explicitly specified reward function).
Instead, focus on building good cognition within the agent.
In my ontology, there’s only one question: How do we grow good cognition inside of the trained agent?
How does this relate to the goal/path confusion? Consider alignment-path strategies:
Outer + Inner alignment aims to be an alignment strategy, but it is only the straight-path one. Any homotopic alignment path could be safe as well, and safety is our only real concern.
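For reference, here is the standard topological notion the metaphor borrows (my gloss, not from the original comment): writing $f$ for the straight Outer + Inner path and $g$ for an alternative path, with shared endpoints $x_0$ (where we start) and $x_1$ (an aligned AI), the two paths are homotopic if there is a continuous deformation

$$H:[0,1]\times[0,1]\to X,\qquad H(s,0)=f(s),\quad H(s,1)=g(s),\quad H(0,t)=x_0,\quad H(1,t)=x_1.$$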
Surprisingly, this echoes a recent thought of mine: “frames blind us”. It comes to mind when I think of how traditional agent models (RL, game theory...) blinded us for years to their implicit assumption of no embeddedness. As with the non-Euclidean shift in geometry, the change came from relaxing the assumptions.
This post seems to complement your good old Traps of Formalization. Maybe it’s a Hegelian dialectic:
I) intuition dominates (system 1)
II) formalization dominates (system 2)
III) flexible formalization (improved system 1 / cautious system 2)
Deconfusion is reached only when one is able to go from I to III.
In my view you did indeed misunderstand JW’s ideas. His expression “far-away relevant”/“distance” is not limited to spatial, or even spatiotemporal, distance. It’s a general notion of distance which is not yet fully formalized (the work isn’t done yet).
We do indeed have concerns about inner properties (like your examples), and that is something JW is fully aware of. So (relevant) inner structures could be framed as relevant “far away” with the right formulation.
Find $\theta$ to maximize the predictive accuracy on the observed data, $\sum_z \log P(y_z \mid x_z; \theta)$, where …. Call the result $\theta^*$.
Isn’t the z in the sum on the left a typo? I think it should be n.
Is the adversarial perturbation not, in itself, a mis-specification? If not, I’d be glad to hear your intuitive explanation of why not.
Funny meta: I’m reading this just after finishing your two sequences about Abstraction, which I find very exciting! But surprise, your plan has changed! Did I read all that for nothing? Fortunately, I think it’s mostly robust, indeed :)
The difference (here) between “Heuristic” and “Cached Solutions” seems to me analogous to the difference between lazy evaluation and memoization (a small sketch follows the list):
Lazy evaluation ~ Heuristic: aims to guide the evaluation/search by reducing its space.
Memoization ~ Cached Solutions: stores in memory the values/solutions already discovered to speed up the calculation.
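A minimal sketch of the two mechanisms the analogy leans on (Python; the particular functions are my own illustration): a generator for lazy, on-demand evaluation, and functools.lru_cache for memoization.

```python
from functools import lru_cache
from itertools import islice

# Lazy evaluation ~ Heuristic: candidates are produced only when the search
# asks for them, so the full space is never materialized up front.
def lazy_candidates():
    n = 0
    while True:
        yield n * n        # each candidate is computed on demand
        n += 1

# Memoization ~ Cached Solutions: results already computed are stored and
# reused, so repeated sub-problems are answered from memory.
@lru_cache(maxsize=None)
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(list(islice(lazy_candidates(), 5)))   # only 5 candidates ever computed
print(fib(80))                              # fast: each fib(k) is computed once
```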
[Question] Choice := Anthropics uncertainty? And potential implications for agency
Yeah, I’ll be there, so I’d be glad to see you, especially Adam!
We are located in London.
Great! Is there a co-working space or something? If so, where? Also, are you planning to attend EAG London as a team?
Agreed. It’s the opposite assumption (i.e., no embeddedness) that I had in mind when I wrote this; fixed.