No matter what, the first real test flight will be full of passengers.
It didn’t work that way with 747s. They did loads of testing before risking hundreds of lives.
747s aren’t smart enough to behave differently depending on whether they have passengers. If the AI might behave differently when it’s boxed than when it’s unboxed, then any boxed test isn’t “real”; unboxed tests “have passengers”.
Sure, but that’s no reason not to test. It’s a reason to try and make the tests realistic.
The point is not that we shouldn’t test. The point is that tests alone don’t give us the assurances we need.