Strange that they call the action space infinite instead of breaking it down further by the language’s grammar. I’d like to see them translate between formal and informal proofs.
This is indeed amusing. In reality, the action space can be taken to be of size 256 (the number of possible byte values), with the number of bytes in the solution as the episode length. Note also that 256 is an upper bound: not all byte values are valid at every point, and most of the time only the 128 ASCII values are used. Using a tokenizer, as is standard in language models, simply reduces the episode length by increasing the action space; it does not change the size of the overall state space.
This also means that, despite their claims, the search space for the example solutions shown on their website is similar to or smaller than that of board games such as Chess and Go :D
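For a rough sense of scale, here is a back-of-the-envelope comparison. The solution lengths below are hypothetical placeholders, and the Chess/Go figures are the commonly cited order-of-magnitude game-tree estimates, not exact values:

```python
import math

def log10_byte_strings(n_bytes: int, alphabet_size: int = 256) -> float:
    """Upper bound on log10 of the number of distinct byte strings of a given length."""
    return n_bytes * math.log10(alphabet_size)

# Commonly cited rough game-tree size estimates (orders of magnitude only).
CHESS_LOG10 = 120
GO_LOG10 = 360

# Hypothetical solution lengths, just to illustrate the scale of the comparison.
for n in (50, 100, 150):
    print(f"{n:4d}-byte solution: at most ~10^{log10_byte_strings(n):.0f} candidate strings")
print(f"Chess game tree:   ~10^{CHESS_LOG10}")
print(f"Go game tree:      ~10^{GO_LOG10}")
```

On that count, a solution of roughly 150 bytes sits in the same ballpark as Go, and one of roughly 50 bytes in the same ballpark as Chess.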
By the language’s grammar, it is infinite: you can introduce any of infinitely many axioms at any step of deduction.
If I understand correctly, you are saying that one can technically create an arbitrarily long axiom at any step, so the possibilities for the action “add new axiom” are infinite.
I think Gurkenglas was saying something like “you could break down creating a new axiom into the steps it takes to write it: an action like ‘start axiom’, followed by actions that each add a symbol/token (according to the available grammar)”.
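A minimal sketch of that token-by-token decomposition, with a made-up toy grammar (a real prover would derive the set of legal next tokens from its actual parser):

```python
from typing import List

# Toy transition table: which token kinds may follow the current one.
# The token kinds and transitions here are invented purely for illustration.
NEXT_TOKENS = {
    "start_axiom": ["var", "not"],
    "var":         ["and", "or", "implies", "end_axiom"],
    "not":         ["var"],
    "and":         ["var", "not"],
    "or":          ["var", "not"],
    "implies":     ["var", "not"],
}

def legal_actions(partial_axiom: List[str]) -> List[str]:
    """Finite set of grammar-legal next tokens, given what has been written so far."""
    last = partial_axiom[-1] if partial_axiom else "start_axiom"
    return NEXT_TOKENS.get(last, [])

# Writing one axiom, e.g. "var or not var", as a sequence of small actions,
# each chosen from a finite, grammar-restricted menu.
axiom: List[str] = []
for token in ["var", "or", "not", "var", "end_axiom"]:
    assert token in legal_actions(axiom)
    axiom.append(token)
print(axiom)
```

At every step the branching factor is finite (and small); the unboundedness only shows up in how long the sequence of such steps can be.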
OpenAI was possibly pointing out that the number of actions is unbounded: you could create an axiom, which then becomes part of your grammar and gives you one more available action, and so on.
Where, after “start axiom”, the next action would not add the leftmost character of the axiom’s string, but rather the topmost node of its abstract syntax tree, e.g. that it is a disjunction of 3 terms.
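Here is a minimal sketch of that top-down, AST-level decomposition; the node kinds and the example formula are made up for illustration, not anything from OpenAI’s actual setup:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    kind: str                        # e.g. "or", "and", "not", "var"
    arity: int                       # number of child slots to fill
    children: List["Node"] = field(default_factory=list)

def apply_action(frontier: List[Node], kind: str, arity: int) -> List[Node]:
    """One action: attach a node of the given kind/arity at the first open slot."""
    node = Node(kind, arity)
    if frontier:
        frontier[0].children.append(node)
        if len(frontier[0].children) == frontier[0].arity:
            frontier.pop(0)          # that parent's slots are now all filled
    if arity > 0:
        frontier.insert(0, node)     # the new node still has slots of its own
    return frontier

# "Start axiom" as a disjunction of 3 terms, then fill in each term.
root = Node("or", 3)
frontier = [root]
for kind, arity in [("var", 0), ("not", 1), ("var", 0), ("var", 0)]:
    frontier = apply_action(frontier, kind, arity)

print(root)          # or(var, not(var), var), printed as the dataclass repr
assert not frontier  # all slots filled, the axiom is complete
```

Each action is then drawn from a small, finite menu of node kinds with their arities, and the episode for one axiom ends once no open slots remain.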