The AI wasn’t trained to translate the literal semantics of questions into a query to its own internal world model and then translate the result back to human language; humans have no clue how to train such a thing.
This sounds pretty close to what ELK is for. And if a solution to ELK is found, I do expect people to actually use it. Do you? (We can argue separately about whether a solution is likely to be found.)
Indeed, ELK is very much asking the right questions, and I do expect people would use it if a robust and reasonably-performant solution were found. (Alignment is in fact economically valuable; it would be worth a lot.)