Concretely, I have seen this style of test (for want of better terms, natural language code emulation) used as a screening test by firms looking to find non-CS undergraduates who would be well suited to develop code.
In as much as this test targets indirection, it is comparatively easy to write tests which target data driven flow control or understanding state machines. In such a case you read from a fixed sequence and emit a string of outputs. For a plausible improvement, get the user to log the full sequence of writes, so that you can see on which instruction things go wrong.
There also seem to be aspects of coding which are not simply being technically careful about the formal function of code. The most salient to me would be taking an informally specified natural language problem and reducing it to operations one can actually do. Algorithmic / architectural thinking seems at least as rare as fastidiousness about code.
Concretely, I have seen this style of test (for want of better terms, natural language code emulation) used as a screening test by firms looking to find non-CS undergraduates who would be well suited to develop code.
In as much as this test targets indirection, it is comparatively easy to write tests which target data driven flow control or understanding state machines. In such a case you read from a fixed sequence and emit a string of outputs. For a plausible improvement, get the user to log the full sequence of writes, so that you can see on which instruction things go wrong.
There also seem to be aspects of coding which are not simply being technically careful about the formal function of code. The most salient to me would be taking an informally specified natural language problem and reducing it to operations one can actually do. Algorithmic / architectural thinking seems at least as rare as fastidiousness about code.