The craziness it produced was not code; it merely looked like code. It’s a neat example, but in that particular case not much better than an N-gram Markov chain.
How much understanding should we expect from even a powerful AI, though? All it’s being fed is a long stream of C text, with no other information than that: it gets no runtime output, no binary equivalents, no library definitions, no feedback on its own compression output… I’m not sure what a human with no knowledge of programming would learn in this context either, other than to write C-looking gibberish (which, unlike generated images or music, we are not much interested in the esthetics of). The RNN might be doing extremely well; it’s hard to say.
It would be a better criticism if, working on parse trees or something, RNNs could be shown to be unable to learn to write programs which satisfy specified properties. (Something like the neural TM work, but less low-level.) Or anything, really, that involves asking the RNN to do something, rather than basically making the RNN hallucinate and debating how realistic its hallucinations look.
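To make that concrete, one shape such a test could take is: sample candidate programs from the model and score them against a specified property, instead of eyeballing how realistic they look. The sketch below is purely illustrative; `sample_program` is a toy stand-in for a trained RNN’s actual sampling routine, and the spec ("`f` sorts a list") is just one example of a checkable property.

```python
import random

def sample_program(rng: random.Random) -> str:
    """Toy stand-in for an RNN's sampling routine; a real test would
    sample the program text token-by-token from the trained model."""
    return rng.choice([
        "def f(xs): return xs",
        "def f(xs): return sorted(xs)",
        "def f(xs): return list(reversed(xs))",
    ])

def satisfies_spec(source: str) -> bool:
    """Specified property: `f` must sort arbitrary integer lists."""
    env = {}
    try:
        exec(source, env)
        return all(env["f"](list(case)) == sorted(case)
                   for case in ([3, 1, 2], [], [5, 5, 0]))
    except Exception:
        return False  # doesn't even run: automatic failure

rng = random.Random(0)
hits = sum(satisfies_spec(sample_program(rng)) for _ in range(100))
print(f"{hits}/100 sampled programs satisfied the spec")
```

The point is that pass/fail against a spec gives an objective success rate, which is the kind of number the hallucination demo can’t provide.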
Indeed, parse trees would be the way to go. There is already a whole field of genetic algorithms, so one could look at how those methods work and combine them with RNNs. Humans rarely write code that runs correctly, or even compiles, the first time; similarly, the RNN could improve its program iteratively.
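A minimal sketch of that kind of loop, under the assumption that fitness is just "does it run, and how close is the output" — all names, the tiny expression language, and the target value are illustrative, not taken from any existing system:

```python
import random

TARGET = 42  # illustrative goal: evolve an expression of x equal to 42 at x = 5

def fitness(expr: str) -> float:
    """Score a candidate; a crash stands in for 'does not compile'."""
    try:
        return -abs(eval(expr, {"x": 5}) - TARGET)
    except Exception:
        return float("-inf")

def mutate(expr: str, rng: random.Random) -> str:
    """Grow the expression by one term; a crude stand-in for tree mutation."""
    return expr + " + " + rng.choice(["1", "2", "x"])

rng = random.Random(0)
population = [rng.choice(["x + 1", "x * 2", "x - 3"]) for _ in range(20)]
for _ in range(50):
    population.sort(key=fitness, reverse=True)
    survivors = population[:10]  # keep the fittest half
    population = survivors + [mutate(rng.choice(survivors), rng)
                              for _ in range(10)]
best = max(population, key=fitness)
print(best, "->", eval(best, {"x": 5}))
```

In the real setting the mutations would act on parse trees of C (perhaps proposed by the RNN) and fitness would be "does it compile, does it pass tests", but the iterate-and-select structure is the same.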
I’d say the RNN is doing well to produce pretend code of this quality, as Antisuji says below.
Syntactically it’s quite a bit better than an N-gram Markov chain: it gets indentation exactly right, it balances parentheses, braces, and comment start/end markers, it delimits strings with quotation marks, and so on. You’re right that it’s no better than a Markov chain at understanding the “code” it’s producing, at least at the level a human programmer does.
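For reference, the baseline being compared against is roughly the following: a character-level N-gram Markov chain conditions only on the last n characters, so it has no mechanism for tracking nesting depth, which is exactly why balancing braces over long spans is beyond it. This is a minimal sketch; the tiny inline corpus is just for illustration, standing in for the real C corpus:

```python
import random
from collections import defaultdict

def train(text: str, n: int = 5) -> dict:
    """Map each n-character context to the characters that followed it."""
    model = defaultdict(list)
    for i in range(len(text) - n):
        model[text[i:i + n]].append(text[i + n])
    return model

def generate(model: dict, seed: str, n: int, length: int,
             rng: random.Random) -> str:
    out = seed
    for _ in range(length):
        followers = model.get(out[-n:])
        if not followers:  # unseen context: the chain dead-ends
            break
        out += rng.choice(followers)
    return out

corpus = "int main(void) { int i; for (i = 0; i < 3; i++) { f(i); } return 0; }\n"
model = train(corpus * 50, n=5)
print(generate(model, "int m", n=5, length=200, rng=random.Random(0)))
```

With a 5-character window the chain can reproduce local idioms like `i++) {`, but any brace opened more than a few characters back is invisible to it; the RNN’s hidden state is what lets it do better than this.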