The main problem here is “how to elicit simulacra of superhuman aligned intelligence while avoiding the Waluigi effect”. We don’t have aligned superintelligence in the training data, and any attempt to elicit superintelligence from an LLM can be fatal.