An interesting question suggested by the post is what true statements about itself an agent can produce in that setting, assuming it can control which statements it withholds (for how long, or under what conditions), even when it knows them to be true.
Gödel’s theorems apply to effectively generated theories, which means precisely that there is a program printing out all the axioms (or, equivalently, all the provable theorems). So even if an agent is very crafty about this sort of control, it can’t become exempt from Gödel. And I don’t see why it would produce something much stronger than our existing axiomatic systems. So why bother?
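(To make “effectively generated” concrete: below is a minimal Python sketch of the kind of enumerator the definition guarantees. The formulas and the proof checker are placeholders of mine, not any real system; the point is only that the provable theorems can be listed by dovetailing proof search over a growing prefix of the axiom stream.)

```python
from itertools import count, islice

def axioms():
    """Enumerate the axioms. The formulas are placeholders; the point is
    only that *some* program prints every axiom eventually, however the
    agent schedules or withholds statements."""
    for n in count():
        yield f"phi_{n}"

def proves(axiom_prefix, candidate):
    """Stub proof checker. In a real formal system, checking a given proof
    is decidable, which is all Godel's argument needs."""
    return candidate in axiom_prefix  # trivial placeholder logic

def theorems():
    """Enumerate everything provable, by dovetailing: at stage n, test the
    first n candidate formulas against the first n axioms, so every
    theorem shows up at some finite stage."""
    emitted = set()
    for n in count(1):
        prefix = list(islice(axioms(), n))
        for candidate in prefix:
            if candidate not in emitted and proves(prefix, candidate):
                emitted.add(candidate)
                yield candidate

print(list(islice(theorems(), 5)))  # first five "theorems" of the sketch
```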
To capture some aspects of theory-building and incompleteness associated with decision-making. I agree that all of the agent’s activity could just be captured in a stronger system that anticipates all of its decisions from the start, but the agent’s own decision-making seems best modeled as using weaker theories that can’t predict the next decisions; and after each decision is made, it should become known, so that further planning can rely on it.
So the question is interesting not because it leads to stronger theories, but because it’s concerned with capturing the effect of decisions on the theories involved in making them.
If this approach doesn’t lead to stronger theories, then the game described in Paul’s post doesn’t seem to be a strong motivation for studying it. Can you think of some other game where it would be useful?
Sure, not a strong motivation; that’s why I said “question suggested by the post” (we are talking about two different questions now...). The second question seems to be mainly about interpreting the agent’s activity, not directly about the construction of agents, so I don’t expect a game that clearly requires this kind of analysis. (For example, all agents are “just programs” (perhaps with “just oracles”), so any talk of formal systems used by agents is something in addition, a method of understanding them, even if it results in useful designs.)
On the other hand, if we want the agent to remain ignorant of its own actions, adding axioms that decide the actions to the agent’s own theory from the very start is not an option, so describing a procedure for adding them later is a way of satisfying both requirements: a strong theory, and consistency of counterfactuals at each decision.
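A minimal sketch of such a procedure, assuming a theory is represented as a plain list of axiom strings and `choose` is some hypothetical decision rule that may consult the current theory but cannot derive its own upcoming action from it (both representations are mine, not anything from the post):

```python
def decide_and_extend(base_theory, situations, choose):
    """After each decision, add an axiom fixing that action, so the agent
    starts with a theory too weak to predict its choices but can rely on
    past choices when planning further."""
    theory = list(base_theory)
    for i, situation in enumerate(situations):
        action = choose(theory, situation)          # not decided by `theory` here
        theory.append(f"action_{i} = {action!r}")   # now it is an axiom
    return theory

# Usage with placeholder inputs:
final = decide_and_extend(
    base_theory=["PA"],                # e.g. Peano Arithmetic as the strong base
    situations=["s0", "s1"],
    choose=lambda th, s: "cooperate",  # stand-in decision rule
)
# final == ["PA", "action_0 = 'cooperate'", "action_1 = 'cooperate'"]
```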