Ankesh Anand comments on How do new models from OpenAI, DeepMind and Anthropic perform on TruthfulQA?

Ankesh Anand 27 Feb 2022 10:40 UTC
3 points
Any plans to evaluate RETRO (the retrieval-augmented transformer from DeepMind) on TruthfulQA? I’m guessing it should perform similarly to WebGPT, but it would be nice to get a concrete number.
- Owain_Evans 27 Feb 2022 18:51 UTC
  4 points
  It would be interesting to evaluate RETRO as it works differently from all the models we’ve evaluated. WebGPT is finetuned to use a search engine and it uses this (at inference time) to answer questions. This seems more powerful than the retrieval system for RETRO (based on a simple nearest neighbor lookup). So my speculation is that WebGPT would do better.
  
  We don’t have plans to evaluate it but are open to the possibility (if the RETRO team were interested).