I agree with this criticism. What you have done in this design is create a large bureaucracy of AI systems that essentially will not respond when anything unexpected happens (input outside the training distribution) and that are inflexible: anything other than the task assigned at the moment is “not my job / not my problem”. They can still have superintelligent subtask performance, and the training set can include all available video on Earth so they can respond to any situation they have ever seen humans perform, so it’s not as inflexible as it might sound. I think this is going to work extremely well compared to what we have now.
But yes, if this approach doesn’t let you get close to the limits of what intelligence makes possible, “unrestricted” systems might win. It depends.
As a systems engineer myself, I don’t see unrestricted systems going anywhere. The issue isn’t whether they could be cognitively capable of a lot; it’s that in the near term, when you try to use them, they will make too many mistakes to be trusted with anything that matters. And those errors are uncorrectable: without a structure like the one described, there is a lot of design coupling, so making the system better at one thing with feedback comes at a cost somewhere else.
It’s easy to talk about an AI system with some enormous, brain-like architecture of a thousand modules that learns online from all the tasks it is doing. Hell, it even has a module editor so it can add more modules whenever it chooses.
But... how do you validate or debug such a system? It’s learning from all its inputs; it’s a constantly changing technological artifact. In practice this is infeasible: when it makes a catastrophic error, there is nothing you can do to reliably fix it. Any test set you add to retrain it on the scenario where it made the mistake is not guaranteed to fix the error, because the system keeps evolving...
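To make that concrete, here is a minimal toy sketch (the learner, the numbers, and the test are all hypothetical, invented purely for illustration): a frozen regression test added after a failure passes right after targeted retraining, but because the system keeps learning online, nothing keeps it passing.

```python
# Hypothetical sketch: a toy online learner whose weights change on every
# example, plus a frozen regression test added after a past mistake.
import random

random.seed(0)

class OnlineLearner:
    """Toy linear model that updates on every example it sees."""
    def __init__(self):
        self.w = 0.0

    def predict(self, x):
        return self.w * x

    def update(self, x, y, lr=0.1):
        # One step of online gradient descent on squared error.
        err = self.predict(x) - y
        self.w -= lr * err * x

def regression_test(model):
    """Frozen test capturing the scenario the system once got wrong."""
    return abs(model.predict(2.0) - 4.0) < 0.1   # expects w ≈ 2

model = OnlineLearner()

# Targeted retraining on the failure case: the regression test now passes.
for _ in range(50):
    model.update(2.0, 4.0)
print("after targeted fix:", regression_test(model))   # True

# The system keeps learning online from whatever comes next...
for _ in range(200):
    x = random.uniform(-1, 1)
    model.update(x, 3.0 * x)   # drift toward a different input-output relation

# ...and the old guarantee silently evaporates.
print("after further online learning:", regression_test(model))  # False
```

The fix looked verified at the moment it was applied, but the property was never pinned down; with a continuously evolving system, every test result is a snapshot, not a guarantee.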
Forget alignment; getting such a system to reliably drive a garbage truck would be risky.
I can’t deny such a system might work, however...
Well, crap. It might work extremely well and outperform everything else, becoming more and more unauditable over time, because we humans would apply simulated performance tests as constraints, along with real-world KPIs. “Add whatever modules to yourself however you want; just ace these tests and do well on your KPIs, and we don’t care how you do it...”
That’s precisely how you arrive at a machine that is internally unrestricted and thus potentially an existential threat.
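A rough sketch of what that gating scheme amounts to (the gate, scores, thresholds, and KPIs below are invented for illustration, not anyone’s actual process): approval looks only at outcomes, so nothing in it ever examines what the system added or changed internally.

```python
# Hypothetical sketch: a deployment gate that checks only benchmark scores
# and business KPIs. Internal modules are carried along but never inspected,
# so every passing release is, by construction, unaudited on the inside.
from dataclasses import dataclass, field

@dataclass
class CandidateSystem:
    version: str
    benchmark_scores: dict                              # e.g. {"sim_driving": 0.98}
    kpis: dict                                          # e.g. {"routes_per_day": 44.0}
    internal_modules: list = field(default_factory=list)  # never examined below

def deployment_gate(candidate, score_thresholds, kpi_thresholds):
    """Approve purely on outcomes; internals are out of scope."""
    tests_ok = all(candidate.benchmark_scores.get(name, 0.0) >= t
                   for name, t in score_thresholds.items())
    kpis_ok = all(candidate.kpis.get(name, 0.0) >= t
                  for name, t in kpi_thresholds.items())
    return tests_ok and kpis_ok

candidate = CandidateSystem(
    version="v381",
    benchmark_scores={"sim_driving": 0.98, "sim_manipulation": 0.95},
    kpis={"routes_per_day": 44.0},
    internal_modules=["planner_v12", "self_added_module_2041"],  # opaque to the gate
)

print(deployment_gate(candidate,
                      score_thresholds={"sim_driving": 0.95, "sim_manipulation": 0.9},
                      kpi_thresholds={"routes_per_day": 40.0}))  # True: ship it
```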