I think it’s certainly an interesting attempt at the problem. GPT4 thinks it’s an interesting start—I’m a bit more skeptical; here’s what GPT4 had to say:
Your idea is interesting and has some merits, but there are also potential challenges to consider. Here are a few points to think about:
Training on accuracy, consistency, and completeness: While these are essential aspects for reasoning, focusing only on them might not be sufficient to ensure a safe AI. If an AI model is not trained on alignment issues, it may not develop an understanding of human values and ethical considerations. This could lead to the generation of logical reasoning graphs that are accurate, consistent, and complete but might still suggest actions that are harmful to humans or against our values.
Layered approach to analyzing the graph: The idea of using a series of systems to analyze the AI-generated graph and ensure that the suggested actions are safe is promising. However, this approach could face several challenges:
a. Ensuring that the systems responsible for analyzing the graph are reliable and safe, especially when dealing with unknown consequences or complex ethical dilemmas.
b. The possibility that the AI model learns to generate reasoning graphs that are difficult for the subsequent systems to analyze, which could lead to the AI finding loopholes in the analysis process.
c. The potential for adversarial attacks or other external influences that could compromise the integrity of the systems responsible for analyzing the graph.
Non-AI system processing actions: While this might help mitigate the risk of a lying AI, it could also introduce new risks. The non-AI system could be vulnerable to manipulation or human error, which could lead to undesirable outcomes. Additionally, the non-AI system might lack the flexibility and adaptability of an AI system, potentially limiting its ability to address complex or rapidly changing situations.
Scalability: As the complexity of the problems the AI is trying to solve increases, the size and complexity of the reasoning graphs might become unmanageable. Developing systems capable of analyzing and verifying large-scale reasoning graphs might be a significant challenge.
I don’t think this quite covers the key issues with the proposal, though. Where are you going to get training data that decomposes into logical actions? How are you going to verify logical correctness? Logical reasoning relies on defining a formal language that constrains valid moves to only those that preserve validity; in messy real-world environments this usually either buys you nothing or requires a prohibitively huge space of actions, which in turn needs a strong approximate model to build. As GPT4 says, you then run into significant issues processing the output of the system.
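To make the formal-language point concrete, here's a toy sketch (the setup, names, and rule set are my own illustration, not anything from the proposal): a checker that only accepts a reasoning chain if every step is an instance of a single allowed rule, modus ponens. That restriction is what makes "logical correctness" mechanically verifiable at all.

```python
# Minimal sketch, assuming a tiny propositional world where every fact and
# causal link has already been formalised as an atom or an implication.
from dataclasses import dataclass

@dataclass(frozen=True)
class Implies:
    antecedent: str
    consequent: str

def check_chain(premises: set, implications: set, steps: list) -> bool:
    """Accept a reasoning chain only if each step follows by modus ponens
    from the premises and the steps already established."""
    known = set(premises)
    for claim in steps:
        # A step is valid only if some known implication licenses it.
        if not any(imp.consequent == claim and imp.antecedent in known
                   for imp in implications):
            return False
        known.add(claim)
    return True

# Works fine in a fully formalised toy domain...
premises = {"it_is_raining"}
rules = {Implies("it_is_raining", "ground_is_wet"),
         Implies("ground_is_wet", "shoes_get_muddy")}
print(check_chain(premises, rules, ["ground_is_wet", "shoes_get_muddy"]))  # True

# ...but covering real-world plans would require an atom and a rule for every
# relevant fact and causal link, which is exactly the prohibitively huge action
# space (and the strong approximate model needed to populate it) described above.
```

The checkability comes entirely from shrinking the move set to what the formal language allows; extending that to messy real-world actions is where I expect the proposal to get stuck.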
Spent 2m writing comment. Strong agreed on Raemon’s comment.