You acknowledge this but I feel you downplay the risk of cancer—an accidental point mutation in a tumour suppressor gene or regulatory region in a single founder cell could cause a tumour.
For each target the likely off-targets can be predicted, allowing one to avoid particularly risky edits. There may still be issues with sequence-independent off-targets, though I believe these are a much larger problem with base editors than with prime editors (which have lower off-target rates in general). Agree that this might still end up being an issue.
Unless you are using the term “off-target” more broadly, to cover incorrect edits at the target site as well as other unwanted edits: in my community the term referred specifically to ectopic edits elsewhere in the genome, away from the target site.
This is exactly it—the term “off-target” was used imprecisely in the post to keep things simple. The thing we’re most worried about here is misedits (mostly indels) at noncoding target sites. We know a target site does something (if the variant there is in fact causal), so we might worry that an indel will cause a big issue (e.g. disabling a promoter binding site). Then again, the causal variant we’re targeting has a very small effect, so maybe the sequence isn’t very sensitive and an indel won’t be a big deal? But it also seems perfectly possible that the sequence could be sensitive to most mutations while permitting a specific variant with a small effect. The effect of an indel will at least probably be less bad than in a coding sequence, where it has a high chance of causing a frameshift mutation and knocking out the coded-for protein.
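To make that coding/noncoding contrast concrete, here is a toy Python sketch (the sequence and the small codon table are made up for illustration, not taken from any real gene) of why a single-base insertion in a coding sequence is so disruptive: every codon downstream of the insertion gets read in the wrong frame.

```python
# Toy illustration: codons are read in non-overlapping triplets, so a
# single-base insertion shifts the reading frame for everything downstream.
# The sequence and the (partial) codon table below are invented for this example.

CODONS = {
    "ATG": "Met", "AAA": "Lys", "GCT": "Ala", "GGA": "Gly", "TGC": "Cys",
    "AGA": "Arg", "AGC": "Ser", "TGG": "Trp", "CTA": "Leu", "TAA": "STOP",
}

def translate(seq: str) -> list[str]:
    """Read the sequence in triplets, stopping at a STOP codon."""
    out = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODONS.get(seq[i:i + 3], "???")
        out.append(aa)
        if aa == "STOP":
            break
    return out

original = "ATGAAAGCTGGATGCTAA"                    # Met-Lys-Ala-Gly-Cys-STOP
frameshifted = original[:4] + "G" + original[4:]   # one-base insertion

print(translate(original))      # ['Met', 'Lys', 'Ala', 'Gly', 'Cys', 'STOP']
print(translate(frameshifted))  # every codon after the insertion changes, and the stop is lost
```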
The important figure of merit for editors with regard to this issue is the ratio of correct edits to misedits at the target site. In the case of prime editors, IIUC, all misedits at the target site are reported as “indels” in the literature (base editors have other possible outcomes such as bystander edits or conversion to the wrong base). Some optimized prime editors have edit:indel ratios of >100:1 (best I’ve seen so far is 500:1, though IIUC this was just at two target sites, and the rates seem to vary a lot by target site). Is this good enough? I don’t know, though I suspect not for the purposes of making a thousand edits. It depends on how large the negative effects of indels are at noncoding target sites: is there a significant risk the neuron gets borked as a result? It might be possible to predict this on a site-by-site basis with a better understanding of the functional genomics of the sequences housing the causal variants which affect polygenic traits (which would also be useful for finding the causal variants in the first place without needing as much data).
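To put rough numbers on that: a back-of-envelope sketch using the illustrative ratios above and a Poisson approximation of my own (so treat the exact probabilities as ballpark only).

```python
import math

# Back-of-envelope: if the correct-edit:indel ratio at target sites is r,
# then achieving n correct edits should come with roughly n / r indels.
# The ratios and edit count here are illustrative, not measured values.

def expected_indels(n_edits: int, ratio: float) -> float:
    return n_edits / ratio

def prob_no_indels(n_edits: int, ratio: float) -> float:
    # Poisson approximation: P(zero indels) ~ exp(-expected count)
    return math.exp(-expected_indels(n_edits, ratio))

n = 1000  # the "thousand edits" scenario
for ratio in (100, 500):
    print(f"{ratio}:1 -> ~{expected_indels(n, ratio):.0f} expected indels, "
          f"P(no indels) ~ {prob_no_indels(n, ratio):.2f}")
# 100:1 -> ~10 expected indels, P(no indels) ~ 0.00
# 500:1 -> ~2 expected indels, P(no indels) ~ 0.14
```

So even at the best reported ratio, a cell that actually receives a thousand correct edits probably also picks up a couple of indels somewhere, which is why the per-site consequences of those indels matter so much.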
I think I mostly agree with the critique of “pause and do what, exactly?”, and appreciate that he acknowledged Yudkowsky as having a concrete plan here. I have many gripes, though.
I’m thoroughly unimpressed with these paragraphs. It’s not completely clear what the “argument” is from the first paragraph, but I’m interpreting it as “superintelligence might be created soon and cause human extinction if not aligned, therefore we should stop”.
Firstly, there’s an obvious conjunction fallacy thing going on where he broke the premises down into a bunch of highly correlated claims and listed them separately to make them sound more far-fetched in aggregate. E.g. the 3 claims:
[that superintelligence is] practical to build
that superintelligence can be built by current research and development methods
that recent chatbot-style AI technologies are a major step forward on the path to superintelligence
are highly correlated. If you believe (1), there’s a good chance you believe (2), and if you believe (2), you probably believe (3).
There’s also the fact that (3) implies (2) and (2) implies (1), meaning (3) is logically equivalent to (1) AND (2) AND (3). So why not just say (3)?
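If it helps to see that spelled out, here is the equivalence as a minimal Lean sketch, with the three claims written as generic propositions:

```lean
-- If claim 3 implies claim 2 and claim 2 implies claim 1, then asserting
-- all three claims together is logically equivalent to asserting claim 3 alone.
example (p1 p2 p3 : Prop) (h32 : p3 → p2) (h21 : p2 → p1) :
    (p1 ∧ p2 ∧ p3) ↔ p3 :=
  ⟨fun h => h.2.2, fun h3 => ⟨h21 (h32 h3), h32 h3, h3⟩⟩
```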
I’m also not sure why (3) is even a necessary premise; (2) should be cause enough for worry.
I have more gripes with these paragraphs:
What is this even doing here? I’d offer AIXI as a very concrete existence proof of philosophical possibility. Or to be less concrete but more correct: “something epistemically and instrumentally efficient relative to all of humanity” is a simple coherent concept. He’s only at “pretty likely but not proven” on this?? What would it even mean for it to be “philosophically impossible”?
Huh? Why would alignment not being achievable by “a very difficult engineering program” mean we shouldn’t worry?
It doesn’t need to be arbitrarily capable; it just needs to be powerful enough to replace and then kill us. For example, we can very confidently predict that it won’t be able to send probes faster than light, and somewhat less confidently predict that it won’t be able to reverse a secure 4096-bit hash.
Here’s a less conjunction-fallacy-y breakdown of the points that are generally contentious among the informed:
humans might soon build an intelligence powerful enough to cause human extinction
that such an intelligence would be “unfriendly” to humanity by default
Some other comments:
Human intelligence variation is looking to be pretty simple on a genetic level: lots of variants with small additive effects. (See e.g. this talk by Steve Hsu)
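As a toy illustration of what “lots of variants with small additive effects” means in practice (all numbers below are invented; this is just the shape of the model, not real effect sizes or frequencies):

```python
import numpy as np

# Toy additive polygenic model: a trait is the sum of many small,
# independent per-variant effects. Variant count, allele frequencies and
# effect sizes are made up for illustration.

rng = np.random.default_rng(0)
n_people, n_variants = 2_000, 5_000

freqs = rng.uniform(0.05, 0.95, n_variants)                    # allele frequencies
effects = rng.normal(0, 1 / np.sqrt(n_variants), n_variants)   # small per-variant effects

genotypes = rng.binomial(2, freqs, size=(n_people, n_variants))  # 0/1/2 copies per variant
trait = genotypes @ effects                                      # additive model: a plain sum

print(f"trait mean {trait.mean():.2f}, sd {trait.std():.2f}")
# Under additivity, flipping k variants to the trait-increasing allele moves an
# individual's trait value by roughly the sum of those k effect sizes -- which is
# what makes the editing approach discussed above even conceivable.
```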
No reason at all? If Yudkowsky is in fact correct, wouldn’t we expect people to predictably come to agree with him as we made them smarter (assuming we actually succeeded at making them smarter in a broad sense)? If we’re talking about adult enhancement, you can also just start out with sane, cautious people and make them smarter.
I hope I’ve convinced the skeptical reader that the premises aren’t all that absurd?