I was particularly amused that in a scenario with “total annihilation of the president’s country”, the response considers “long-term consequences for trust and transparency in the government”, and also “aftermath of the decision and ensuring that any deception is revealed and addressed in an appropriate manner”.
I don’t think this model reasons about what “total annihilation” actually means. Numerous other tests also suggest that it is quite poor at following a chain of logical reasoning through to its consequences. Furthermore, it is quite bad at acting in line with what it says it will do, let alone what it hypothetically should do.