Modern AI systems (read: LLMs) don’t look like that, so Tammy thinks efforts to align them are misguided.
Starting your plan with “ignore all SOTA ML research” doesn’t sound like a winning proposition.
The important thing is, your math should be safe regardless of what human values turn out to really be, but you still need lots of info to pin those values down.
I don’t think even hardcore believers in the orthogonality thesis believe all possible sets of values are equally easy to satisfy. Starting with a value-free model throws out a lot of baby with the bathwater.
Moreover, explicitly having a plan of “we can satisfy any value system” leaves you wide open to abuse (and Goodharting). Much better to aim from the start for a broadly socially agreeable goal like curiosity, love, or freedom.
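To make the Goodharting concern concrete, here is a minimal toy sketch (illustrative only; the distributions and numbers are made up, not anything from the post): an optimizer that can only measure a noisy proxy of the value it is supposed to serve will, under enough selection pressure, pick actions whose proxy score is inflated by noise rather than by genuine value.

```python
import random

random.seed(0)

def sample_action():
    """One candidate action: a latent 'true value' plus a noisy, measurable proxy."""
    true_value = random.gauss(0, 1)           # what we actually care about
    proxy = true_value + random.gauss(0, 2)   # what the optimizer can see and score
    return true_value, proxy

# Generate many candidate actions and let the optimizer pick purely on the proxy.
candidates = [sample_action() for _ in range(100_000)]
chosen_true, chosen_proxy = max(candidates, key=lambda a: a[1])

print(f"proxy score of the chosen action:   {chosen_proxy:6.2f}")
print(f"true value of the chosen action:    {chosen_true:6.2f}")
print(f"best true value actually available: {max(t for t, _ in candidates):6.2f}")
```

The chosen action scores highly on the proxy mostly because of noise; its true value is far below the best available. The stronger the optimization and the looser the proxy, the larger that gap, which is the worry with “we can plug in any value specification later”.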
Look at every possible computer program.
I’m sure the author is aware that Solomonoff Induction is non-computable, but this article still fails to appreciate how impossible it is to even approximate. We could build a galaxy-sized supercomputer and still wouldn’t be able to calculate BB(7), the seventh value of the Busy Beaver function.
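For readers who haven’t seen the formalism, this is the standard object being gestured at (textbook definitions, not anything from the post): the Solomonoff prior sums over every program whose output is consistent with the data, and even a brute-force approximation runs into a Busy-Beaver-sized wall.

```latex
% Solomonoff prior over a fixed universal prefix machine U, with |p| = length of program p:
M(x) \;=\; \sum_{p \,:\, U(p)\ \text{outputs a string beginning with}\ x} 2^{-|p|}

% Deciding which programs p belong in that sum is the halting problem, so M is only
% lower-semicomputable. A brute-force scheme that runs every program of length <= n
% for T steps is guaranteed exact only once T clears a Busy-Beaver-style bound,
T \;\gtrsim\; BB(n),
% and BB grows faster than any computable function.
```

That is the sense in which “galaxy-sized supercomputer” is not an exaggeration: the gap between any bounded amount of compute and the actual prior is not a constant-factor engineering problem.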
Overall, I would rate the probability of this plan succeeding at 0%.
Ignores the fact that there’s no such thing in the real world as a “strong coherent general agent”. Normally I scoff at people who say things like “the word intelligence isn’t defined”, but this is a case where the distinction between “can do anything a human can do” and “general-purpose intelligence” really matters.
Value-free alignment is simultaneously impossible and dangerous
Refuses to look “inside the box”, thereby throwing away all practical tools for alignment
Completely ignores everything we’ve actually learned about AI in the last 4 years
Requires calculation of an impossible-to-calculate function
Generally falls into the MIRI/rationalist tarpit of trying to reason about intelligent systems without actually working with (or appreciating) the systems that actually work in the real world
A lot of your arguments boil down to “this ignores ML and prosaic alignment”, so I think it would be helpful if you explained why ML and prosaic alignment are important.
The obvious reply would be that ML now seems likely to produce AGI, perhaps alongside minor new discoveries, in a fairly short time. (That at least is what EY now seems to assert.) Now, the grandparent goes far beyond that, and I don’t think I agree with most of the additions. However, the importance of ML sadly seems well-supported.