Philosopher working in Melbourne, Australia. My book Meaning and Metaphysical Necessity is forthcoming in June 2022 with Routledge.
tristanhaze
It would be good to see some more references to, and discussion of, illusionism as a view in its own right. For my money, the recent work of Wolfgang Schwarz on imaginary foundations and sensor variables gives a powerful explanation of why we might have this illusion.
I’d be interested to hear how this compares with Wolfgang Schwarz’s ideas in ‘Imaginary Foundations’ and ‘From Sensor Variables to Phenomenal Facts’. Sounds like there’s some overlap, and Schwarz has a kind of explanation for why the hard problem might arise that you might be able to draw on.
Link to the second of the papers mentioned: https://www.umsu.de/papers/sensorfacts.pdf
Very interesting. I’m stuck on the argument about truthfulness being hard because the concept of truth is somehow fraught or too complicated. I’m envisaging an objection based on the T-schema (‘<p> is true iff p’).
Nate writes:

>Now, in real life, building a truthful AGI is much harder than building a diamond optimizer, because ‘truth’ is a concept that’s much more fraught than ‘diamond’. (To see this, observe that the definition of “truth” routes through tricky concepts like “ways the AI communicated with the operators” and “the mental state of the operators”, and involves grappling with tricky questions like “what ways of translating the AI’s foreign concepts into human concepts count as manipulative?” and “what can be honestly elided?”, and so on, whereas diamond is just carbon atoms bound covalently in tetrahedral lattices.)
But this reference to “the definition of ‘truth’” seems to presuppose some view of truth; I’m not sure what that view is, but I do know it will be philosophically controversial.
Some think that ‘true’ can be defined by taking all the instances of the T-schema, or a (perhaps restricted) universal generalisation of it.
And this seems not totally crazy or irrelevant from an AI design perspective, at least at first blush. I feel I can sort of imagine an AI obeying a rule which says to assert <p> only if p.
Trying to envisage problems and responses, I hit the idea that the AI would have degrees of belief or credences, and not simply a list of things it thinks are true simpliciter. But perhaps it can have both. And perhaps obeying the T-schema based truthfulness rule would just lead it to confine most of its statements to statements about its own credences or something like that.
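To make the idea above a bit more vivid, here is a toy sketch (purely illustrative, not a real alignment proposal; the threshold and function names are my own hypothetical choices) of an agent that follows a T-schema-style truthfulness rule, retreating to statements about its own credences when it isn’t near-certain:

```python
# Toy sketch of a T-schema-based truthfulness rule. The agent asserts <p>
# outright only when its credence in p is very high; otherwise it confines
# itself to a statement about its own credence, which it can assert
# truthfully regardless of whether p holds.

THRESHOLD = 0.99  # hypothetical cutoff for flat-out assertion

def report(claim: str, credence: float) -> str:
    """Return the statement the agent is licensed to make about `claim`."""
    if credence >= THRESHOLD:
        return claim  # assert <p> only if (near-certain that) p
    # Retreat to a credence report rather than asserting <p> itself.
    return f"My credence that {claim} is {credence:.2f}"

print(report("the battery is charged", 0.999))
print(report("it will rain tomorrow", 0.6))
```

As the second call illustrates, most of such an agent’s statements would indeed end up being statements about its own credences.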
I think I see a separate problem about ensuring the AI does not (modify itself in order to) violate the T-schema based truthfulness rule. But that seems different at least from the supposed problem in the OP about the definition of ‘true’ being fraught or complicated or something.
If it wasn’t already clear I’m a philosophy person, not an alignment expert, but I follow alignment with some interest.
This is an instance of an arc that clever people have been going through for ages, so I’d like to see more teasing apart of the broader phenomenon from the particular historical episode of the Sequences etc.
A lot of the mixed feelings, and the lack of identification as rationalists, on the part of many people who found the Sequences interesting reading is to be explained by their perceiving the vibe you describe and being aware of its pitfalls.
Interesting to read, here are a couple of comments on parts of what you say:
>the claim that all possibilities exist (ie. that counterfactuals are ontologically real)
‘counterfactuals are ontologically real’ seems like a bad way of re-expressing ‘all possibilities exist’. Counterfactuals themselves are sentences or propositions, and even people who think there’s e.g. no fact of the matter with many counterfactuals should agree that they themselves are real.
Secondly, most philosophers who would be comfortable with talking seriously about possibilities or possible worlds as real things would not go along with Lewis in holding them to be concrete. The view that possibilities really truly exist is quite mainstream and doesn’t commit you to modal realism.
>what worlds should we conceive of as being possible? Again, we can make this concrete by asking what would happen if we were to choose a crazy set of possible worlds—say a world just like this one and then a world with unicorns and fountains of gold—and no other worlds
I think it’s crucial to note that it’s not the presence of the unicorns world that makes trouble here; it’s the absence of all the others. So what you’re gesturing at is, I think, the need for a kind of plenitude in the possibilities one believes in.
Ramsey could be on the list too but I guess his tragically short life makes it hard to do some of the cells.
Maybe a bit off-colour to call the fact that three of Wittgenstein’s brothers committed suicide ‘delicious’...
Wittgenstein had so many ideas and is such a difficult thinker that I think one ought to read him before secondary sources. Also he’s a wonderful writer.
I think there’s a potentially confusing fact which you’re neglecting in this post, namely the reality of literature as territory not map. If you’re interested in literature, then when you read it you get lots of knowledge of what e.g. certain books contain, what certain authors wrote, and that can be very instructive not just within literature. I’d like to see you and others with this kind of viewpoint wrestle more with this kind of consideration.
Mill?! When are you from, John David Galt?!
I just want to say that the title of this post is fantastic, and in a deep sort of mathy way, beautiful. It’s probably usually not possible, but I love it when an appropriate title—especially a nice not-too-long one—manages to contain, by itself, so much intellectual interest. Even just seeing that title listed somewhere could plant an important seed in someone’s mind.
I don’t see any reason to think he’s trying to convey that scientists in general, or good ones, or anything like that, believe in fake reductionism. Some people do, and it’s more charitable to Keats to presume he was just alluding to them.
I agree with Robin that that indeed seems the weak point. It is far from clear to me, and I suspect it is not the case, that Keats here is doing something along the lines of actually trying to convey that, oh, there’s nothing special about rainbows, science has explained them, or whatever. Rather, he’s invoking and playing with that sort of trope, for a sophisticated poetic purpose.
I think the main point or points of Eliezer’s post here are sound, but even suggesting that that sort of thing could be pinned on Keats is a needless distraction. Obviously serious poetry isn’t Eliezer’s strong point, as I’m sure he’d be the first to agree. The introductory quote could still be used to good effect though.
I think you’re probably right about this (not based on first-hand experience of having a child, mind—I haven’t), but I can’t quite see what it’s doing here. Is this meant to be some sort of objection to the comment you’re replying to? It isn’t obviously in tension with it.
Yep, what The Ancient Geek said. Sorry I didn’t reply in a timely way—I’m not a regular user. I’m glad you basically agree, and pardon me for using such a recherché word (did I just do it again?) needlessly. Philosophical training can do that to you; you get a bit blind to how certain words, though they could be part of the general intellectual culture, are actually only used in very specific circles. (I think ‘precisification’ is another example of this. I used it with an intelligent nerd friend recently and, while of course he understood it—it’s self-explanatory—he thought it was terrible, and probably thought I just made it up.)
Hope you look at Wittgenstein!
Filled in. This is a good idea. I would be interested in getting some feedback on the feedback, or seeing a writeup of some of the lessons or issues that come out of this.
>How specifically could being “definite” be a problem for language? Take any specific thing, apply an arbitrary label, and you are done.
This remark seems to flow from an oversimplified view of how language works. In the context of, for example, a person or a chair, this paradigm seems pretty solid… at least, it gets you a lot. You can ostend the thing (‘take’ it, as it were) and then apply the label. But in the case of lots of “objects” there is nothing analogous to such ‘taking’ as a prior, discrete step before talking. For example, “objects” like happiness, or vagueness and definiteness themselves.
I think you may benefit from reading Wittgenstein, but maybe you’d just hate it. I think you need it though!
For my part, I’ve found the economic notions of opportunity cost and marginal utility to be like this.
‘my writing is more enthusiastic than the evidence would call for, but alas I must excite my readers and get the pageviews’
For my money, that’s just contemptible. And there’s no ‘must’ about it: you can, and probably should, stop doing that, even if it means you get fewer pageviews.
Why is it OK to use the deduction theorem, though? In standard modal logics like K and S5 the deduction theorem doesn’t hold (otherwise you could assume P, use necessitation to get []P, and then use the deduction theorem to get P → []P as a theorem).
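Spelled out, the worry is a three-line derivation (a standard observation about normal modal logics):

```latex
\begin{align*}
P &\vdash P && \text{(assumption)}\\
P &\vdash \Box P && \text{(necessitation, applied under an assumption)}\\
&\vdash P \to \Box P && \text{(deduction theorem)}
\end{align*}
```

Since $P \to \Box P$ is not a theorem of K (or of S5), one of the two rules must be restricted: the usual diagnosis is that necessitation may be applied only to theorems, not to undischarged assumptions, and that the deduction theorem fails for derivations which use necessitation in that way.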