adamShimi (Adam Shimi)
Epistemologist specialized in the difficulties of alignment. Currently at Conjecture, and running Refine.
Oh, didn’t know him!
Thanks for the links!
Thanks for the comment!
I agree with you that there are situations where the issue comes from a cultural norm rather than from psychological problems. That’s one reason for the last part of this post, where we point to generally positive and productive norms that try to avoid these cultural problems and make it possible to discuss them. (One of the issues I see in my own life with cultural norms is that they are much harder to discuss when psychological problems compound them and make them feel sore and emotional.) But you might be right that it’s worth highlighting more.
On a more meta note, my model is that we have moved from societies where almost everything is considered “people’s fault” to societies where almost everything is considered “society’s fault”. And it strikes me that this is an overcorrection, and that many issues in day-to-day life and groups are actually people’s problem (here I’m drawing from my experience of realizing in many situations that I was the problem, and in other, less common, ones that the norms were the problem).
Oh, I definitely agree, this is a really good point. What I was highlighting was an epistemic issue (namely the confusion between ideal and necessary conditions), but there is also a distinct decision-theoretic issue that you highlighted quite well.
It’s completely possible that you’re not powerful enough to work outside the ideal conditions. But by making the epistemic clarification, we can now consider the explicit decision of taking steps to become more powerful and better able to manage non-ideal conditions.
Good point! The difference is that the case explained in this post is one of the most sensible versions of confusing the goal and the path, since there the path is actually a really good path. In the other versions (like wanting to find a simple theory simply), the path is not even a good one!
In many ways, this post is frustrating to read. It isn’t straightforward, it needlessly insults people, and it mixes irrelevant details with the key ideas.
And yet, as with many of Eliezer’s posts, its key points are right.
What this post does is uncover the main epistemological mistakes made by almost everyone trying their hand at figuring out timelines. Among others:
Taking arbitrary guesses within a set of options that you don’t have enough evidence to separate
Piling arbitrary assumption on arbitrary assumption, leading to completely uninformative outputs
Comparing biological processes to human engineering in terms of speed, without noticing that the optimization path is the key variable (and the big uncertainty)
Forcing the prediction to fit within a massively limited set of distributions, biasing it towards distributions that are easy to think about rather than representative ones (see the toy sketch below)
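To make that last mistake concrete, here is a toy sketch (my own illustration, not from Eliezer’s post; the bimodal belief, the numbers, and the choice of a lognormal family are all arbitrary assumptions) of how force-fitting a limited parametric family can misrepresent a forecaster’s actual beliefs:

```python
# Toy illustration (all numbers made up): a forecaster whose true beliefs
# are bimodal, forced to report a single lognormal, ends up with quantiles
# that represent neither mode.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical bimodal belief over "years until AGI":
# one mode around 10 years, another around 60 years.
beliefs = np.concatenate([
    rng.lognormal(mean=np.log(10), sigma=0.3, size=5000),
    rng.lognormal(mean=np.log(60), sigma=0.3, size=5000),
])

# Force-fit a single lognormal (a "massively limited" family).
shape, loc, scale = stats.lognorm.fit(beliefs, floc=0)
fitted = stats.lognorm(shape, loc=loc, scale=scale)

for q in (0.25, 0.5, 0.75):
    print(f"q={q}: beliefs={np.quantile(beliefs, q):5.1f}y, "
          f"fitted={fitted.ppf(q):5.1f}y")
# The fitted median lands in the trough between the two modes, a region
# the forecaster actually assigns low probability to.
```

The easy-to-fit unimodal distribution reports a confident-looking median in exactly the region the forecaster considers unlikely: the output is easy to think about, but not representative.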
Before reading this post I was already dubious of most timeline work, but this crystallized many of my objections and issues with this line of work.
So I got a lot out of this post. And I expect that many people would if they spent the time I took to analyze it in detail. But I don’t expect most people to do so, and so am ambivalent on whether this post should be included in the final selection.
I was mostly thinking of the efficiency assumption underlying almost all the scenarios. Critch assumes that a significant chunk of the economy always can and does make the most efficient change (everyone’s job being replaced, automated regulations replacing banks when the banks can’t move fast enough). This neglects many potential factors, like big economic actors not having to be efficient for a long time, backlash from customers, and in general all the factors that make economic actors and markets less than efficient.
I expect that most of these factors could be addressed with more work on the scenarios.
I consider this post one of the most important ever written on timelines and AI doom scenarios. Not because it’s perfect (some of its assumptions are unconvincing), but because it highlights a key aspect of AI risk and the alignment problem which is so easy to miss coming from a rationalist mindset: it doesn’t require an agent taking over the whole world. It is not about agency.
What RAAPs show instead is that even in a purely structural setting, where agency doesn’t matter, these problems still crop up!
This insight was already present in Drexler’s work, but however insightful Eric is in person, CAIS is completely unreadable, and so no one cared. This post, by contrast, is well written. Not perfectly, once again, but it gives short, somewhat minimal proofs of concept for this structural perspective on alignment. And it also manages to tie alignment to key ideas in sociology, opening ways for interdisciplinarity.
I have made every person I have ever mentored on alignment study this post. And I plan to continue doing so. Despite the fact that I’m unconvinced by most timeline and AI risk scenario posts. That’s how good and important it is.
I agree that a lot of science relies on predictive hallucinations. But some counterexamples come to mind, notably the sort of phenomenological compression pushed by Faraday and (early) Ampère in their initial exploration of electromagnetism. What they did amounted to varying the experimental conditions a lot and relating outcomes and phenomena to each other, without directly assuming any hidden entity. (See this book for more details.)
More generally, I expect most phenomenological laws not to rely heavily on predictive hallucinations, even when they integrate theoretical terms in their formulation. That’s because they are mostly strong experimental regularities (like the ideal gas law or the phenomenological laws of thermodynamics), which tend to be carried over to the next paradigm even when it posits radically different hidden entities.
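For concreteness, the ideal gas law is exactly such a regularity: it was assembled from empirical results (Boyle, Charles, Gay-Lussac) before kinetic theory came along, and it survived that change of hidden entities intact. In the usual notation,

$$PV = nRT$$

where $P$ is the pressure, $V$ the volume, $n$ the amount of substance, $T$ the temperature, and $R$ the gas constant.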
So “reification” means “the act of making real” in most English dictionaries (see here for an example). That’s the meaning we’re trying to evoke here: the reification bias amounts to first postulating some underlying entity that explains the phenomena (which is merely a modelling technique), and second ascribing reality to this entity and manipulating it as if it were real.
You use the analogy with sports betting multiple times in this post. But science and sports are disanalogous in almost all the relevant ways!
Notably, sports are incredibly limited and well-defined, with explicit rules that literally anyone can learn, quick feedback signals, and unambiguous results. Completely the opposite of science!
The only way I see for the analogy to hold is by defining “science” in a completely impoverished way that puts aside most of what science actually looks like. For example, replication is not that big a part of science; it’s just the visible “clean” part. And even then, I expect the clarification of replication issues and of the original meaning to be tricky.
So my reaction to this proposal, like my reaction to any prediction market for things other than sports and games, is that I expect it to be completely irrelevant to the progress of knowledge because of the weakness of such tools. But I would definitely be curious about attempts to explicitly address all the ambiguities of epistemology and science through betting mechanisms. Maybe you know of some posts/works on that?
Agreed! That’s definitely an important point, and one reason why it’s still interesting to try to prove P ≠ NP. The point I was making here was only that when proofs are used for the “certainty” they give, strong evidence from other means is also enough to rely on the proposition.
What are you particularly interested in? I expect I could probably write it with a bit of rereading.
Hot take: I would say that most optimization failures I’ve observed in myself and in others (in alignment and elsewhere) boil down to psychological problems.
Completely agree! The point is not that formalization or axiomatization is always good, but rather to elucidate one counterintuitive way in which it can be productive, so that we can figure out when to use it.
Thanks for your thoughtful comment!
First, I want to clarify that this is obviously not the only function of formalization. I feel like this might address a lot of the points you raise.
But first, the very idea that formalization would have helped discover non-Euclidean geometries earlier seems counter to the empirical observation that Euclid himself formalized geometry with 5 postulates; how much more formal can it get? Compared to the rest of the science of the time, it was a huge advance. He also saw that the 5th postulate did not fit neatly with the rest. Moreover, non-Euclidean geometry was right there in front of him the whole time: spheres are all around. And yet the leap from the straight line to the great circle, and the realization that his 4 postulates work just fine without the 5th, had to wait some two millennia.
So Euclid formalized our geometric intuitions, the obvious and immediate shapes that naturally make sense of the universe. This use of formalization was to make concrete and precise some concepts that we had but that were “floating around”. He did it so well that these concepts and intuitions acquired an even stronger “reality” and “obviousness”: how could you question geometry when Euclid had made so tangible the first intuitions that came to your mind?
According to Bachelard, the further formalization, or rather the axiomatization, of geometry, which reworked the apparently simple concepts of points and lines to make them algebraically manipulable, was a key part of getting out of this conceptual constraint.
That being said, I’d be interested in an alternative take, or in evidence that this claim is wrong. ;)
In general, what you (he?) call “suspension of intuition” seems to me to be more like the emergence of a different intuition after a lot of trying and failing. I think that the recently empirically discovered phenomenon of “grokking” in ML provides a better model of how breakthroughs in understanding happen. It is more of a Hegelian/Kuhnian model of phase transitions after a lot of data accumulation and processing.
This strikes me as a false comparison/dichotomy: why can’t both be part of scientific progress? Especially in physics and chemistry (the two fields Bachelard knew best), there are many examples of productive formalization/axiomatization as suspension of intuition:
Boltzmann’s work, which generally started from mathematical building blocks, built structures from them, and then interpreted them. See this book for more details on this view.
Quantum Mechanics went through that phase, where the half-baked models based on classical mechanics didn’t work well enough, and so there was an effort at formalization and axiomatization that revealed the underlying structure without as much pollution by macroscopic intuition.
The potential function came from a pure mathematical and formal effort to compress the results of classical mechanics, and ended up being incorporated into the core concepts of physics.
I’ve also found that, on inspection, models of science based on the gathering of a lot of data rarely fit the actual history. Notably, Kuhn’s model contradicts the history of science almost everywhere, and he gives a highly biased reading of the key historical events that he leverages.
That definitely feels right, with a caveat that is dear to Bachelard: this is a constant process of rectification that repeats again and again. There is no ending, or the ending is harder to find than we think.
I’m confused by your confusion, given that I’m pretty sure you understand the meaning of cognitive bias, which is quite explicitly the meaning of bias drawn upon here.
Thanks for your comment!
Actually, I don’t think we really disagree. I might have just not made my position very clear in the original post.
The point of the post is not to say that these activities are not often valuable, but instead to point out that they can easily turn into “To do science, I need to always do [activity]”. And what I’m getting from the examples is that in some cases you actually don’t need to do [activity]. There’s a shortcut, or maybe you’re just in a different phase of the problem.
Do you think there is still a disagreement after this clarification?
In a limited context, the first example that comes to mind is high performers in competitive sports and games. Because if they truly only give a shit about winning (and the best generally do), they will throw away their legacy approaches when they find a new one, however much it pains them.
Thanks for the pointer!