Unrelated to vagueness, they can also just change the framework again at any time.
Reminds me of Schopenhauer’s posthumously published manuscript The Art of Being Right: 38 Ways to Win an Argument.
In Richard Jeffrey’s utility theory there is actually a very natural distinction between positive and negative motivations/desires. A plausible axiom is U(⊤) = 0 (the tautology has zero desirability: you already know it’s true). Together with the main axiom[1] this implies that the negation of any proposition with positive utility has negative utility, and vice versa. Which is intuitive: if something is good, its negation is bad, and the other way round. In particular, if P(A) = P(¬A) = 1/2 (probabilistic indifference between A and ¬A), then U(¬A) = −U(A).
More generally, U(¬A) = −U(A) · P(A)/P(¬A). Which means that the positive and negative utilities of a proposition and its negation are scaled according to their relative odds. For example, while your lottery ticket winning the jackpot is obviously very good (large positive utility), having a losing ticket is clearly not very bad (small negative utility). Why? Because losing the lottery is very likely, far more likely than winning. Which means losing was already “priced in” to a large degree. If you learned that you indeed lost, that wouldn’t be a big update, so the “news value” is negative but not large in magnitude.
Which means this utility theory has a natural zero point. Utility functions are therefore not invariant under adding an arbitrary constant. So the theory actually allows you to say that one prospect is “twice as good” as another, “three times as bad”, “much better” etc. It’s a ratio scale.
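To make the lottery example concrete, here is a minimal sketch with made-up numbers (the probability and utility values are purely illustrative):

```python
# Jeffrey's axioms give U(not A) = -U(A) * P(A) / P(not A).
# All numbers below are invented for illustration.

p_win = 1e-7          # probability that the ticket wins the jackpot
u_win = 1_000_000.0   # desirability ("news value") of winning

p_lose = 1 - p_win
u_lose = -u_win * p_win / p_lose

print(u_lose)  # about -0.1: losing is bad, but only slightly, since it was mostly "priced in"
```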
1. If P(A ∧ B) = 0 and P(A ∨ B) > 0, then U(A ∨ B) = (P(A)·U(A) + P(B)·U(B)) / (P(A) + P(B)). ↩︎
conducive to well-being
That in itself isn’t a good definition, because it doesn’t distinguish ethics from, e.g., medicine... and it doesn’t tell you whose well-being. De facto, people are ethically obliged to do things which go against their own well-being and to refrain from doing some things which promote their own well-being... I can’t rob people to pay my medical bills.
Promoting your own well-being only would be egoism, while ethics seems to be more similar to altruism.
Whose desires?
I guess of all beings that are conscious. Perhaps relative to their degree of consciousness. Though those are all questions which actual theories in normative ethics try to answer.
Why?
Not sure what this is asking for, but if it is “why is this analysis correct rather than another, or none?”—because of the meaning of the involved term. (Compare “why not count bushes as “trees” as well?”—“because that would be talking about something else”)
The various forms of theories in normative ethics (e.g. the numerous theories of utilitarianism, or Extrapolated Volition) can be viewed as attempts to analyze what terms like “ethics” or “good” mean exactly.
They could also be seen as attempts to find different denotations of a term with shared connotation.
This doesn’t reflect the actual methodology, where theories are judged in thought experiments on whether they satisfy our intuitive, pre-theoretical concepts. That’s the same as in other areas of philosophy where conceptual analysis is performed.
Related: zettelkasten, a different note-taking method, where each card gets an address.
Many attempts at establishing an objective morality try to argue from considerations of human well-being. OK, but who decided that human well-being is what is important? We did!
That’s a rather minimal amount of subjectivism. Everything downstream of that can be objective, so it’s really a compromise position.
It’s also possible (and I think very probable) that “ethical” means something like “conducive to well-being”. Similar to how “tree” means something like “plant with a central wooden trunk”. Imagine someone objecting: “OK, but who decided that trees need to have a wooden trunk? We did!” That’s true in some weak sense (though nobody really “decided” that “tree” refers to trees), but it doesn’t mean it’s subjective whether or not trees have a wooden trunk.
Though I think the meaning of “ethical” is a bit different, as it doesn’t just take well-being into account but also desires. The various forms of theories in normative ethics (e.g. the numerous theories of utilitarianism, or Extrapolated Volition) can be viewed as attempts to analyze what terms like “ethics” or “good” mean exactly.
That’s some careful analysis!
Two remarks:
1
“Can” is the opposite of “unable”. “Unable” means that the change involves granting ability to they who would act, i.e. teaching a technique, providing a tool, fixing the body, or altering the environment.
That’s a good characterization, though arguably not a definition, as it relies on “ability”, which is circular. I can do something = I have the ability to do something. I can = I’m able to.
But we can use the initial principle (it really needs a name) which doesn’t mention ability:
You do a thing iff you can do it and you want to do it.
“Iff” behaves similarly to an equation, so we can solve for “can”, similar to algebra in arithmetic. I don’t know the exact algebra of “iff”, but solving for “can” arguably yields:
“I can do X” iff “If I wanted to do X, I would do X”
Which uses wanting and a counterfactual to define “can”. We could also define:
“I want to do X” iff “If I could do X, I would do X”
Though those two definitions together are circular. Maybe it is better to regard one concept as more basic than the other, and only define the less basic one in terms of the more basic one. It seems to me that “want” is more basic than “can”, so I would only define “can” in terms of “want”, and leave the definition for “want” open (for now).
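As a small consistency check (assuming, counter to the intended counterfactual reading, that both “iff” and “if... then” are classical material conditionals), we can enumerate the cases:

```python
from itertools import product

# Initial principle: do(X) iff can(X) and want(X).
# Candidate: can(X) iff "if I wanted to do X, I would do X",
# read here as the material conditional want(X) -> do(X).
for can, want in product([True, False], repeat=2):
    do = can and want                 # fixed by the initial principle
    candidate_can = (not want) or do  # material reading of the conditional
    print(can, want, do, candidate_can == can)

# The candidate agrees with "can" in three of four cases, but not when the agent
# neither wants nor can (the conditional is then vacuously true). So the definition
# really needs a counterfactual "would", not a material "if ... then".
```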
2
Regarding the Confusing Cases. There are at least two canonical classes: akrasia (weakness of will) and addiction. “Can” the addict quit smoking? “Can” the person suffering from akrasia just do The Thing?
A better question perhaps: In which sense is the answer to the above “yes” and in which sense is it “no”?
One possibility is to analyze these cases in terms of first-order vs. second-order desires. The first-order desire would be to smoke or to be lazy, the second-order desire would be not to have the (first-order) desire to smoke or to be lazy. Second-order desires seem to be more important or rational than first-order desires. The second-order and the first-order desire are “inconsistent” here. If the first-order desire to smoke is stronger than the second-order desire not to have the first-order desire, I don’t quit smoking. (Or don’t overcome my laziness in the case of akrasia.) Otherwise I do.
Here, the “can” definition as “If I wanted to do X, I would do X” is ambiguous. If it means “If I had the first-order desire to quit smoking, I would quit smoking”, then the sentence is true, and I “can” overcome the addiction (or the akrasia). If it means “If I had the second-order desire not to have the first-order desire to smoke, I would quit smoking”, then the sentence is false, as I do in fact have the second-order desire but still don’t quit. So in this sense it’s not true that I “can” quit.
A similar but different analysis wouldn’t phrase it in terms of first- and second-order desires, but in terms of rational wishes and a-rational urges. I wish to quit smoking, but I have the urge to smoke. I wish to get to work, but I have the urge to be lazy. If we count, in the definition of “can”, urges as “wanting”, I can stop smoking; if we count wishes as “wanting”, I can’t. By a similar argument as above.
I think these cases are actually not a major problem of the “can” definition. After all, it seems in fact ambiguous to ask whether someone “can” overcome some case of akrasia or addiction. The definition captures that ambiguity.
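For illustration, a toy model of the two readings (the numbers and the threshold rule are made up, not part of the analysis above):

```python
# Toy model: the agent acts on whichever motivation is stronger.
first_order_urge_to_smoke = 0.9    # first-order desire (the urge)
second_order_desire_to_quit = 0.6  # second-order desire not to have that urge

def quits(desire_to_quit: float, urge_to_smoke: float) -> bool:
    return desire_to_quit > urge_to_smoke

# Reading 1: "If I had the first-order desire to quit, I would quit."
# In that counterfactual the urge is replaced by a first-order desire to quit.
reading_1 = quits(desire_to_quit=0.9, urge_to_smoke=0.0)  # True -> "can" quit

# Reading 2: "If I had the second-order desire to quit, I would quit."
# The agent already has that desire, yet the urge still wins.
reading_2 = quits(second_order_desire_to_quit, first_order_urge_to_smoke)  # False -> "can't" quit

print(reading_1, reading_2)  # True False
```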
Your headline overstates the results. The last common ancestor of birds and mammals probably wasn’t exactly unintelligent. (In contrast to our last common ancestor with the octopus, as the article discusses.)
“the” supposes there’s exactly one canonical choice for what object in the context is indicated by the predicate. When you say “the cat” there’s basically always a specific cat from context you’re talking about. “The cat is in the garden” is different from “There’s exactly one cat in the garden”.
Yes, we have a presupposition that there is exactly one cat. But that presupposition is the same regardless of the actual number of cats (regardless of the context), because the “context” here is a feature of the external world (“territory”), while the belief is a part of the “map”/world model/mind. So when we want to formalize the meaning of “The cat is in the garden”, that formalization has to be independent of the territory, that is, the same for any possible way the world is. So we can’t use individual constants. Because those can’t be used for cases where there is no cat or more than one. The mental content of a belief (the semantic content of a statement) is internal, so it doesn’t depend on what the external world is like.
I mean there has to be some possibility for revising your world model if you notice that there are actually 2 objects for something where you previously thought there’s only one.
The important part is that your world model doesn’t need to depend on what the world is like. If you believe that the cat is in the garden, that belief is the same independently of whether the presuppositions it makes are true. Therefore we cannot inject parts of the territory into the map. Or rather: there is no such injection, and if our formalization of beliefs (map/world model) assumes otherwise, that formalization is wrong.
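For concreteness, a sketch of how the two readings might be rendered without any individual constants (plain first-order notation; a Russellian treatment of “the”, with the uniqueness presupposition made explicit, is one assumption here):

```
"There's exactly one cat in the garden":
  exists x: cat(x) AND in_garden(x) AND (forall z: (cat(z) AND in_garden(z)) IMPLIES z = x)

"The cat is in the garden":
  exists x: cat(x) AND (forall z: cat(z) IMPLIES z = x) AND in_garden(x)
```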
Yeah, I didn’t mean this as a formal statement. Formal would be:
{exists x: apple(x) AND location(x, on=Table342)} CAUSES {exists x: apple(x) AND see(SelfPerson, x)}
Here you use two individual constants: Table342 and SelfPerson. Individual constants can only be used for direct reference, where unique reference can’t fail. So they can only be used for internal (mental) objects. “SelfPerson” is okay, because you know a priori that you exist uniquely. If you didn’t have a body, you could still refer to yourself, and it’s not possible that you accidentally refer to more than one person, like a copy of you. You are part of your mind, your internal state. But “Table342” is an external object. It might not exist, or multiple such tables might exist even though you presupposed there was only one. “Table342” is an individual constant, and individual constants are incompatible with presupposition failure. So it can’t be used here. That formalization is incompatible with possible worlds where the table doesn’t exist uniquely. But you have the same belief whether or not your presupposition is satisfied. So the formalization is faulty. We have to use one in which no constants are used to refer to external things like tables.
What I was saying was that we can, from our subjective perspective, only “point” to or “refer” to objects in a certain way. In terms of predicate logic the two ways of referring are via a) individual constants and b) variable quantification. The first corresponds to direct reference, where the reference always points to exactly one object. Mental objects can presumably be referred to directly. For other objects, like physical ones, quantifiers have to be used. Like “at least one” or “the” (the latter only presupposes there is exactly one object satisfying some predicate). E.g. “the cat in the garden”. Perhaps there is no cat in the garden or there are several. So it (the cat) cannot be logically represented with a constant. “I” can be, but “this” again cannot. Even ordinary proper names of people cannot, because they aren’t guaranteed to refer to exactly one object. Maybe “Superman” is actually two people with the same dress, or he doesn’t exist, being the result of a hallucination. This case can be easily solved by treating those names as predicates. Compare:
The woman believes the superhero can fly.
The superhero is the colleague.
The above only has quantifiers and predicates, no constants. The original can be handled analogously:
(The) Mia believes (the) Superman can fly.
(The) Superman is (the) Clark Kent.
The names are also logical predicates here. In English you wouldn’t pronounce the definite articles before the proper nouns here, but in other languages you would.
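In that style, the identity sentence might look something like this (again assuming a Russellian reading of “the”; superman and clark_kent are predicates, not constants):

```
exists x: superman(x) AND (forall z: superman(z) IMPLIES z = x) AND
exists y: clark_kent(y) AND (forall z: clark_kent(z) IMPLIES z = y) AND x = y
```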
Indicators like “here”/”tomorrow”/”the object I’m pointing to” don’t get stored directly in beliefs. They are pointers used for efficiently identifying some location/time/object from context, but what gets saved in the world model is the statement where those pointers were replaced by the referents they were pointing to.
As I argued above, “pointing” (referring) is a matter of logic, so I would say assuming the existence of separate “pointers” is a mistake.
You can say “{(the fact that) there’s an apple on the table} causes {(the fact that) I see an apple}”
But that’s not primitive in terms of predicate logic, because here “the” in “the table” means “this” which is not a primitive constant. You don’t mean any table in the world, but a specific one, which you can identify in the way I explained in my previous comment. I don’t know how it would work with fact causation rather than objects, though there might be an appropriate logical analysis.
I think object identification is important if we want to analyze beliefs instead of sentences. For beliefs we can’t take a third person perspective and say “it’s clear from context what is meant”. Only the agent knows what he means when he has a belief (or she). So the agent has to have a subjective ability to identify things. For “I” this is unproblematic, because the agent is presumably internal and accessible to himself and therefore can be subjectively referred to directly. But for “this” (and arguably also for terms like “tomorrow”) the referred object depends partly on facts external to the agent. Those external facts might be different even if the internal state of the agent is the same. For example, “this” might not exist, so it can’t be a primitive term (constant) in standard predicate logic.
One approach would be to analyze the belief that this apple is green as “There is an x such that x is an apple and x is green and x causes e.” Here “e” is a primitive term (similar to “I” in “I’m hungry”) that refers to the current visual experience of a green apple.
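In the notation from the earlier comment, that proposal would look something like the following, with e as the only individual constant:

```
exists x: apple(x) AND green(x) AND causes(x, e)
```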
So e is subjective experience and therefore internal to the agent. So it can be directly referred to, while this (the green apple he is seeing) is only indirectly referred to (as explained above), similar to “the biggest tree”, “the prime minister of Japan”, “the contents of this box”.
Note the important role of the term “causes” here. The belief is representing a hypothetical physical object (the green apple) causing an internal object (the experience of a green apple). Though maybe it would be better to use “because” (which relates propositions) instead of “causes”, which relates objects or at least noun phrases. But I’m not sure how this would be formalized.
Yeah. I proposed a while ago that all the AI content was becoming so dominant that it should be hived off to the Alignment Forum while LessWrong is for all the rest. This was rejected.
Maybe I missed it, but what about indexical terms like “I”, “this”, “now”?
There is still the possibility on the front page to filter out the AI tag completely.
That difference is rather extreme. It seems LLM companies have a strong winner-take-all market tendency. Similar to Google (web search) or Amazon (online retail) in the past. It seems now much more likely to me that ChatGPT has basically already won the LLM race, similar to how Google won the search engine race in the past. Gemini outperforming ChatGPT in a few benchmarks likely won’t make a difference.
[...] because it is embedded natively, deep in the architecture of our omnimodal GPT‑4o model, 4o image generation can use everything it knows to apply these capabilities in subtle and expressive ways [...] Unlike DALL·E, which operates as a diffusion model, 4o image generation is an autoregressive model natively embedded within ChatGPT.
To operationalise this: a decision theory usually assumes that you have some number of options, each with some defined payout. Assuming payouts are fixed, all decision theories simply advise you to pick the outcome with the highest utility.
The theories typically assume that each choice option has a number of known mutually exclusive (and jointly exhaustive) possible outcomes. And to each outcome the agent assigns a utility and a probability. So uncertainty is in fact modelled insofar as the agent can assign subjective probabilities to those outcomes occurring. The expected utility of an option is then something like the sum, over its outcomes, of probability times utility.
Other uncertainties are not covered in decision theory. E.g. 1) if you are uncertain what outcomes are possible in the first place, 2) if you are uncertain what utility to assign to a possible outcome, 3) if you are uncertain what probability to assign to a possible outcome.
I assume you are talking about some of the latter uncertainties?
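For reference, a minimal sketch of that standard setup (the option names, probabilities and utilities are made-up placeholders):

```python
# Each option maps to its mutually exclusive, jointly exhaustive outcomes,
# given as (subjective probability, utility) pairs. All numbers are invented.
options = {
    "take umbrella":  [(0.3,   5.0), (0.7,  8.0)],
    "leave umbrella": [(0.3, -10.0), (0.7, 10.0)],
}

def expected_utility(outcomes):
    # Probabilities per option are assumed to sum to 1.
    return sum(p * u for p, u in outcomes)

best = max(options, key=lambda name: expected_utility(options[name]))
print(best)  # "take umbrella" under these made-up numbers
```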
(This is off-topic but I’m not keen on calling LLMs “he” or “she”. Grok is not a man, nor a woman. We shouldn’t anthropomorphize language models. We already have an appropriate pronoun for those: “it”)
There is also Deliberation in Latent Space via Differentiable Cache Augmentation by Liu et al. and Efficient Reasoning with Hidden Thinking by Shen et al.
I think when people use the term “gradual disempowerment” predominantly in one sense, people will also tend to understand it in that sense. And I think that sense will be rather literal and not the one specifically of the original authors. Compare the term “infohazard” which is used differently (see comments here) from how Yudkowsky was using it.