Waking up to reality. No, not that one. We’re still dreaming.
Aleksi Liimatainen
I for one found this post insightful though I wouldn’t necessarily call it a book review.
Going against the local consensus tends to go over better when it’s well-researched and carefully argued. This one unfortunately reads as little more than an expression of opinion, and an unpopular one at that.
Yeah, this seems close to the crux of the disagreement. The other side sees a relation and is absolutely puzzled why others wouldn’t, to the point where that particular disconnect may not even be in the hypothesis space.
When the true cause of a disagreement lies outside the hypothesis space, the disagreement often ends up attributed to something that is in the hypothesis space, such as value differences. I suspect this kind of attribution error is behind most of the drama I’ve seen around the topic.
Nathaniel is offering scenarios where the problem with the course of action is aesthetic in a sense he finds equivalent. Your question indicates you don’t see the equivalence (or how someone else could see it for that matter).
Trying to operate on cold logic alone would be disastrous in reality, for map-territory reasons, and there seems to be a split in perspectives: some people intuitively import non-logical considerations into thought experiments and others don’t. I don’t currently know how to bridge the gap, given how I’ve seen previous bridging efforts fail; I assume some deep cognitive prior is in play.
My suspicion is that it has to do with cultural-cognitive developments generally filed under “religion”. As it’s little more than a hunch and runs somewhat counter to my impression of LW mores, I hesitate to discuss it in more depth here.
Conditional on living in an alignment-by-default Universe, the true explanations for individual and societal human failings must be consistent with alignment-by-default. Have we been pushed off the default by some accident of history, or does alignment just look like a horrid mess somehow?
You’re describing an alignment failure scenario, not a success scenario. In this case the AI has been successfully instructed to paperclip-maximize a planned utopia (however you’d do that while still failing at alignment). Successful alignment would entail the AI being able and willing to notice and correct for an unwise wish.
I don’t think it’s possible to evaluate a model without inhabiting it. Therefore we must routinely accept (and subsequently reject) propositions.
Michael Levin’s paper The Computational Boundary of a “Self” seems quite relevant re identity fusion. The paper argues that larger selves emerge rather readily in living systems, but it’s not quite clear to me whether that would be an evolved feature of biology or somehow implicit in cognition-in-general. Disambiguating that seems like an important research topic.
As someone with unusual ideas and an allergy to organizational drift, I approve of this message.
I don’t think we currently have organizational patterns capable of fully empowering individual choice-making. Any organization comes with a narrowing of purpose, implicitly or explicitly, and this is particularly detrimental to most out-there ideas. Not all out-there ideas are good, but those that are should be pursued with full flexibility.
I think the biggest thing holding AI alignment back is the lack of a general theory of alignment. How do extant living systems align, and what to?
The Computational Boundary of a “Self” paper by Michael Levin seems to suggest one promising line of inquiry.
For some reason I find it important to consider the infrastructure metaphor applied to humans. How would you yourself fare if treated as infrastructure?
Best guess as to the origin of the feeling: I have an intuition that, carelessly applied, the infrastructure view neglects the complexity of its target and risks unfortunate unintended consequences down the line.
Seems to me that those weird power dynamics have deleterious effects even if countervailing forces prevent the group from outright imploding. It’s a tradeoff to engage with such institutions on their own terms and these days a nontrivial number of people seem to choose not to.
If such a thing existed, how could we know?
Regardless of the object level merits of such topics, it’s rational to notice that they’re inflammatory in the extreme for the culture at large and that it’s simply pragmatic (and good manners too!) to refrain from tarnishing the reputation of a forum with them.
I also suspect it’s far less practically relevant than you think and even less so on a forum whose object level mission doesn’t directly bear on the topic.
Learning networks are ubiquitous (if it can be modeled as a network and involves humans or biology, it almost certainly is one) and the ones inside our skulls are less of a special case than we think.
If the neocortex is a general-purpose learning architecture, as suggested by Numenta’s Thousand Brains Theory of Intelligence, it becomes likely that cultural evolution has accumulated significant optimizations. My suspicion is that learning on a cultural corpus progresses rapidly until somewhat above human level and then plateaus to some extent. Further progress may require compute and learning opportunities more comparable to humanity-as-a-whole than to individuals.
(copied from my tweet)
AI alignment is a wicked problem. It won’t be solved by any approach that fails to grapple with how deeply it mirrors self-alignment, child alignment, institutional alignment and many others.
I feel like this “back off and augment” is downstream of an implicit theory of intelligence that is specifically unsuited to dealing with how existing examples of intelligence seem to work. Epistemic status: the idea used to make sense to me and apparently no longer does, in a way that seems related to the ways I’ve updated my theories of cognition over the past years.
Very roughly, networking cognitive agents stacks up to cognitive agency at the next level up more easily than expected, and life has evolved to exploit this dynamic from very early on, across scales. It’s a gestalt observation and apparently very difficult to articulate into a rational argument. I could point to memory in gene regulatory networks, Michael Levin’s work on non-neural cognition, the trainability of computational ecological models (they can apparently be trained to solve sudoku), long-term trends in cultural-cognitive evolution, and theoretical difficulties with traditional models of biological evolution, but I don’t know how to make the constellation of data easily distinguishable from pareidolia.