edit: i think i’ve received enough expressions of interest (more would have diminishing value but you’re still welcome to), thanks everyone!
i recall reading in one of the MIRI posts that Eliezer believed a ‘world model violation’ would be needed for success to be likely.
i believe i may be in possession of such a model violation and am working to formalize it, where by ‘formalize’ i mean writing it not as ‘hard-to-understand intuitions’ but as ‘very clear text that leaves little possibility for disagreement once understood’. it wouldn’t solve the problem, but i think it would make the problem simpler, so that maybe the community could solve it.
if you’d be interested in providing feedback on such a ‘clearly written version’, please let me know as a comment or message.[1] (you’re not committing to anything by doing so, just saying “i’m the kind of person who would be interested in this if your claim is true”). to me, the ideal feedback is from someone who can look at the idea under ‘hard’ assumptions (of the type MIRI holds) about the difficulty of pointing an ASI, and see whether the idea seems promising (or ‘like a relevant model violation’) from that perspective.
[1] i don’t have many contacts in the alignment community.
I’m game! We should be looking for new ideas, so I’m happy to look at yours and provide feedback.
Consider me in
Historically I’ve been able to understand others’ vague ideas & use them in ways they endorse. I can’t promise I’ll read what you send me, but I am interested.
Maybe you can say a bit about what background someone should have to be able to evaluate your idea.