Some thoughts on George Hotz vs Eliezer Yudkowsky
Just watched this debate with George Hotz and Eliezer Yudkowsky. Here are a few of my thoughts.
I think GH’s claim that timelines are important was right, but he didn’t engage with EY’s point that we can’t predict the timeline. Instead, GH pulled out an appeal to ignorance, claiming that because we don’t know the timeline, it must be many decades off.
I thought the discussion of FOOM was glossed over too quickly; I think EY was wrong that it contained no disagreement. Instead, they both spent a bunch of time arguing about superhuman agents’ strategies even though they both agreed that human-level intelligence cannot predict superhuman intelligence. Or, rather, Eliezer claimed we can make statements about superhuman actions of the type “the superhuman chess robot will make moves that cause it to win the chess game,” a framing which seemed to drop out during heated speculation about superhuman AI behaviors.
It seemed to me like a lot of GH’s argument rested on the assumption that AIs will be like humans, and will therefore continue the economy and be human-friendly, which seemed really unsubstantiated. It would be interesting to hear more about why he expects that, and whether he’s read any of the theory on why we shouldn’t expect it. Though I believe a lot of that theory predates much of RL and mesa-optimizer theory and so needs revisiting.
I wasn’t impressed at all by the notion that AI won’t be able to solve the prisoner’s dilemma:
First, because I don’t think we have superhuman knowledge of which things actually are unsolvable;
Second, because I don’t think negotiation skill is capped by the prisoner’s dilemma being unsolvable (AIs could still negotiate better than humans even if they can’t form crypto-stable pacts);
And finally, because even if they don’t cooperate with each other, that still doesn’t say anything about them being human-friendly.
One of GH’s points that I don’t think got enough attention was that the orthogonality thesis applies to humans as well, although he seemed to be unaware of shard theory, which I believe is a result of mesa-optimizer theory with some influence from neuroscience (probably not enough influence). But even so, that point isn’t relevant to the discussion of AI x-risk, the context in which GH was using it.
Overall, GH did not say anything that changed my view of the AI x-risk situation. He seemed to say “because it’s possible for things to turn out ok, they definitely will,” which seems like an irresponsible view.
Eliezer also didn’t seem to argue with much skill. I’m not saying I could do better, but I don’t expect GH’s views to have been shifted significantly by the debate… I’ve got an idea, though: we just need to build a superhuman debate engine and then use it to convince the accelerationists that building misaligned AI is a bad idea. I kid; please, please, nobody build and deploy superhuman debate engines.
But actually, mapping the debate around AI x-risk seems really worthwhile, as does the idea of building software tools to make arguments easier to organize and understand. Anyone know of good projects doing that? I’d love to hear about them.
The best is probably AI risk interview perspectives, an “interactive walkthrough exploring potential risks from advanced AI” created from interviews with 97 AI researchers. The project lead, Vael Gates, posted about it here a while back.
I also like Talk to the City, “an interactive LLM tool to improve collective decision-making by finding the key viewpoints and cruxes in any discourse”.
Oh, actually I spoke too soon about “Talk to the City.” As a research project it is cool, but I really don’t like the obfuscation that occurs when talking to an LLM about the content it was trained on. I don’t know how TTTC works under the hood, but I was hoping for something more like de-duplication of posts, automatically fitting them into argument graphs. Then users could navigate to relevant points in the graph based on a text description of their current point of view, but importantly they would be interfacing with the actual human-generated text, with links back to its source, and would be able to browse the entire graph. People could then locate (visually?) important cruxes, and new cruxes wouldn’t require a writeup to disseminate, but would already be embedded in the relevant part of the argument.
(I might try to develop something like this someday if I can’t find anyone else doing it.)
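To make that concrete, here’s a minimal sketch of the kind of structure I have in mind (entirely hypothetical, and not how TTTC actually works; the class names, example URLs, and the crude word-overlap matching are placeholder assumptions for illustration): nodes store verbatim human-written text with links back to the source, edges mark support/rebuttal relations, and a reader’s free-text description of their viewpoint is matched against node text to drop them into the relevant part of the graph.

```python
# Hypothetical sketch of the argument-graph idea above; all names are my own assumptions.
from dataclasses import dataclass, field

@dataclass
class ArgumentNode:
    node_id: str
    text: str          # verbatim human-generated text, never paraphrased
    source_url: str    # link back to the original post/comment
    supports: list = field(default_factory=list)  # ids of nodes this one supports
    rebuts: list = field(default_factory=list)    # ids of nodes this one rebuts

class ArgumentGraph:
    def __init__(self):
        self.nodes = {}

    def add(self, node: ArgumentNode) -> None:
        # naive de-duplication: skip nodes whose text is already stored
        if any(n.text == node.text for n in self.nodes.values()):
            return
        self.nodes[node.node_id] = node

    def locate(self, viewpoint: str, top_k: int = 3) -> list:
        """Return the nodes whose text best matches a free-text viewpoint."""
        def overlap(a: str, b: str) -> float:
            wa, wb = set(a.lower().split()), set(b.lower().split())
            return len(wa & wb) / max(len(wa | wb), 1)
        ranked = sorted(self.nodes.values(),
                        key=lambda n: overlap(viewpoint, n.text),
                        reverse=True)
        return ranked[:top_k]

# Example: a two-node graph and a reader describing their current view.
graph = ArgumentGraph()
graph.add(ArgumentNode("n1", "Orthogonality: intelligence and goals vary independently.",
                       "https://example.org/post/1"))
graph.add(ArgumentNode("n2", "AIs trained on human data will inherit roughly human goals.",
                       "https://example.org/post/2", rebuts=["n1"]))
for node in graph.locate("I think AIs will basically have human-like goals"):
    print(node.node_id, node.source_url, "->", node.text)
```

A real version would obviously need something better than word overlap (embeddings or an argument-mining model), but the key design choice is that navigation always lands on the original text with its source link, rather than on an LLM paraphrase.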
The AI risk interview perspectives project is much closer to what I was thinking, and I’d like to study it in more detail, but it seems more like a traditional analysis / infographic than the tool I’m wishing existed.
Yesssss! These look cool : ) Thank you.