How do you define transformative AI? If ChatGPT gets 10x better (e.g. it can write most code, answer most questions as good as the experts in a subject, etc) -- would this qualify?
How would you even force an AI to use the weights (simplification) that correspond to the fact vs. fiction anyways?
Also, what really is the difference between our history textbooks and our fiction to an AI that’s just reading a bunch of text? I’m not being flippant. I’m genuinely wondering here! If you don’t imbue these models with an explicit world-model, why would one be always privileged over the other?
Problem: there is no non-fiction about human-level AIs. The training data for LLMs regarding human-level AIs contains only fiction. So consider the hypotheses of chatGPT. In what context encountered in its training data is it most likely to encounter text like “you are Agent, a friendly aligned AI...” followed by humans asking it to do various tasks? Probably some kind of weird ARG. In current interactions with chatGPT, it’s quite possibly just LARPing as a human LARPing as a friendly AI. I don’t know if this is good or bad for safety, but I have a feeling this is a hypothesis we can test.
How do you define transformative AI? If ChatGPT gets 10x better (e.g. it can write most code, answer most questions as good as the experts in a subject, etc) -- would this qualify?
How would you even force an AI to use the weights (simplification) that correspond to the fact vs. fiction anyways?
Also, what really is the difference between our history textbooks and our fiction to an AI that’s just reading a bunch of text? I’m not being flippant. I’m genuinely wondering here! If you don’t imbue these models with an explicit world-model, why would one be always privileged over the other?
Problem: there is no non-fiction about human-level AIs. The training data for LLMs regarding human-level AIs contains only fiction. So consider the hypotheses of chatGPT. In what context encountered in its training data is it most likely to encounter text like “you are Agent, a friendly aligned AI...” followed by humans asking it to do various tasks? Probably some kind of weird ARG. In current interactions with chatGPT, it’s quite possibly just LARPing as a human LARPing as a friendly AI. I don’t know if this is good or bad for safety, but I have a feeling this is a hypothesis we can test.