When I read this post, I feel like I’m seeing four different strands bundled together:
1. Truth-of-beliefs as fuzzy or not
2. Models versus propositions
3. Bayesianism as not providing an account of how you generate new hypotheses/models
4. How people can (fail to) communicate with each other
I think you hit the nail on the head with (2), and I’m mostly sold on (4), but I’m sceptical of (1). Similar to what several others have said, it seems to me that these problems don’t appear when your beliefs are about expected observations; they only appear when you start invoking categories that you can’t ground as clusters in a hierarchical model.
That leaves me with mixed feelings about (3):
- It definitely seems true and significant that you can get into a mess by communicating specific predictions relative to your own categories/definitions/contexts without making those sufficiently precise
- I am inclined to agree that this is a particularly important part of why talking about AI/x-risk is hard
- It’s not obvious to me that what you’ve said above actually justifies Knightian uncertainty (as opposed to infrabayesianism or something), or the claim that you can’t be confident about superintelligence (although it might be true for other reasons)
Could you expand on what you mean by ‘less automation’? I’m taking it to mean some combination of ‘bounding the space of controller actions more’, ‘automating fewer levels of optimisation’, ‘more of the work being done by humans’, and maybe ‘only automating easier tasks’, but I can’t quite tell which of these you intend or how they fit together.
(Also, am I correctly reading an implicit assumption here that any attempts to do automated research would be classed as ‘automated AI safety’?)