Idea for experiment: take a set of coding problems which have at least two solutions, say, recursive and non-recursive. Prompt LLM to solve them. Is it possible to predict which solution LLM will generate from activations due to first token generation?
If it is possible, it is the evidence against “weak forward pass”.
Idea for experiment: take a set of coding problems which have at least two solutions, say, recursive and non-recursive. Prompt LLM to solve them. Is it possible to predict which solution LLM will generate from activations due to first token generation?
If it is possible, it is the evidence against “weak forward pass”.
(I am genuinely curious about reasons behind downvotes)