Any plans on evaluating RETRO (the retrieval-augmented transformer from DeepMind) on TruthfulQA? I’m guessing it should perform similarly to WebGPT, but it would be nice to get a concrete number.
It would be interesting to evaluate RETRO as it works differently from all the models we’ve evaluated. WebGPT is finetuned to use a search engine and it uses this (at inference time) to answer questions. This seems more powerful than the retrieval system for RETRO (based on a simple nearest neighbor lookup). So my speculation is that WebGPT would do better.
We don’t have plans to evaluate it but are open to the possibility (if the RETRO team were interested).