I think OP just wanted some declarative code (I don’t think Python is the ideal choice of language, but basically anything that’s not a Turing tarpit is fine) that could speak fairly coherent English. I suspect if you had a functional transformer decompiler the results aof appling it to a Tiny Stories-size model are going to be tens to hundreds of megabytes of spaghetti, so understanding that in detail is going to be huge slog, but on the other hand, this is an actual operationalization of the Chinese Room argument (or in this case, English Room)! I agree it would be fascinating, if we can get a significant fraction of the model’s perplexity score. If it is, as people seem to suspect, mostly or entirely a pile of spaghetti, understanding even a representative (frequency-of-importance biased) statistical sample of it (say, enough for generating a few specific sentences) would still be fascinating.
I think OP just wanted some declarative code (I don’t think Python is the ideal choice of language, but basically anything that’s not a Turing tarpit is fine) that could speak fairly coherent English. I suspect if you had a functional transformer decompiler the results aof appling it to a Tiny Stories-size model are going to be tens to hundreds of megabytes of spaghetti, so understanding that in detail is going to be huge slog, but on the other hand, this is an actual operationalization of the Chinese Room argument (or in this case, English Room)! I agree it would be fascinating, if we can get a significant fraction of the model’s perplexity score. If it is, as people seem to suspect, mostly or entirely a pile of spaghetti, understanding even a representative (frequency-of-importance biased) statistical sample of it (say, enough for generating a few specific sentences) would still be fascinating.