Researcher at the Center on Long-Term Risk. I (occasionally) write about altruism-relevant topics on my Substack. All opinions my own.
Anthony DiGiovanni
Without a clear definition of “winning,”
This is part of the problem we’re pointing out in the post. We’ve encountered claims of this “winning” flavor that haven’t been made precise, so we survey more precise candidates for what “winning” could mean, and argue that they’re inadequate for figuring out which norms of rationality to adopt.
The key claim is: You can’t evaluate which beliefs and decision theory to endorse just by asking “which ones perform the best?”, because the whole question is what it means to systematically perform better under uncertainty. Every operationalization of “systematically performing better” we’re aware of is either:
Incomplete — like “avoiding dominated strategies”, which leaves a lot unconstrained;
A poorly motivated proxy for the performance we actually care about — like “doing what’s worked in the past”; or
Secretly smuggling in nontrivial non-pragmatic assumptions — like “doing what’s worked in the past, not because that’s what we actually care about, but because past performance predicts future performance”.
This is what we meant to convey with this sentence: “On any way of making sense of those words, we end up either calling a very wide range of beliefs and decisions “rational”, or reifying an objective that has nothing to do with our terminal goals without some substantive assumptions.”
(I can’t tell from your comment whether you agree with all of that. If this was all obvious to you, great! But we’ve often had discussions where someone appealed to “which ones perform the best?” in a way that misses these points.)
Sorry this was confusing! From our definition here:
We’ll use “pragmatic principles” to refer to principles according to which belief-forming or decision-making procedures should “perform well” in some sense.
“Avoiding dominated strategies” is pragmatic because it directly evaluates a decision procedure or set of beliefs based on its performance. (People do sometimes apply pragmatic principles like this one directly to beliefs; see, e.g., this work on anthropics.)
Deference isn’t pragmatic, because the appropriateness of your beliefs is evaluated by how they relate to the beliefs of the person you’re deferring to. Someone could say, “You should defer because this tends to lead to good consequences,” but then they’re not applying deference directly as a principle — the underlying principle is “doing what’s worked in the past.”
at time 1 you’re in a strictly better epistemic position
Right, but 1-me has different incentives by virtue of this epistemic position. Conditional on being at the ATM, 1-me would be better off not paying the driver. (Yet 0-me is better off if the driver predicts that 1-me will pay, hence the incentive to commit.)
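To make the incentive structure concrete (illustrative numbers only, not from the original scenario): say being rescued is worth 100 and paying the driver costs 10. Then

$$u_1(\text{pay} \mid \text{at ATM}) = 90 < 100 = u_1(\text{not pay} \mid \text{at ATM}), \qquad u_0(\text{commit to pay}) = 90 > 0 = u_0(\text{don't commit}).$$

Same terminal values either way; the divergence comes entirely from what each time-slice knows and can still influence.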
I’m not sure if this is an instance of what you call “having different values” — if so I’d call that a confusing use of the phrase, and it doesn’t seem counterintuitive to me at all.
(I might not reply further because of how historically I’ve found people seem to simply have different bedrock intuitions about this, but who knows!)
I intrinsically only care about the real world (I find the Tegmark IV arguments against this pretty unconvincing). As far as I can tell, the standard justification for acting as if one cares about nonexistent worlds is diachronic norms of rationality. But I don’t see an independent motivation for diachronic norms, as I explain here. Given this, I think it would be a mistake to pretend my preferences are something other than what they actually are.
Thanks for clarifying!
covered under #1 in my list of open questions
To be clear, by “indexical values” in that context I assume you mean indexing on whether a given world is “real” vs “counterfactual,” not just indexical in the sense of being egoistic? (Because I think there are compelling reasons to reject UDT without being egoistic.)
I strongly agree with this, but I’m confused that this is your view given that you endorse UDT. Why do you think your future self will honor the commitment of following UDT, even in situations where your future self wouldn’t want to honor it (because following UDT is not ex interim optimal from his perspective)?
I’m afraid I don’t understand your point — could you please rephrase?
Linkpost: “Against dynamic consistency: Why not time-slice rationality?”
This got too long for a “quick take,” but also isn’t polished enough for a top-level post. So onto my blog it goes.
I’ve been skeptical for a while of updateless decision theory, diachronic Dutch books, and dynamic consistency as a rational requirement. I think Hedden’s (2015) notion of time-slice rationality nicely grounds the cluster of intuitions behind this skepticism.
“I’ll {take/lay} $100 at those odds, what’s our resolution mechanism?” is an excellent clarification mechanism
I think one reason this has fallen out of favor is that it seems to me to be a type error. Taking $100 at some odds is a (hypothetical) decision, not a belief. And the reason you’d be willing to take $100 at some odds is that your credence in the statement is such that taking the bet would be net-positive in expectation.
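To spell that out with a worked formula (generic stakes, purely for illustration): a bet that pays $b$ if the statement is true and loses your stake $s$ otherwise is net-positive in expectation exactly when your credence $p$ clears the odds-implied threshold:

$$p \cdot b - (1 - p) \cdot s > 0 \iff p > \frac{s}{b + s}.$$

The odds fix the threshold; whether you take (or lay) depends on which side of it your credence falls. The bet is downstream of the belief, not identical to it.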
I still feel like I don’t know what having a strict preference or permissibility means — is there some way to translate these things to actions?
As an aspiring rational agent, I’m faced with lots of options. What do I do? Ideally I’d like to just be able to say which option is “best” and do that. If I have a complete ordering over the expected utilities of the options, then clearly the best option is the expected utility-maximizing one. If I don’t have such a complete ordering, things are messier. I start by ruling out dominated options (as Maximality does). The options in the remaining set are all “permissible” in the sense that I haven’t yet found a reason to rule them out.
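Here’s a minimal sketch of that ruling-out step, with made-up numbers (rows are options, columns are expected utilities under the distributions I’m torn between); an option gets ruled out only if some other option does at least as well under every distribution and strictly better under at least one. The helper name and payoffs are just for illustration:

```python
import numpy as np

# Hypothetical expected utilities: EV[i, j] = EU of option i under distribution j.
EV = np.array([
    [5.0, 2.0],  # option 0
    [4.0, 3.0],  # option 1
    [3.0, 1.0],  # option 2 (dominated by option 0)
])

def maximality_permissible(EV):
    """Indices of options not dominated by any other option."""
    permissible = []
    for i in range(len(EV)):
        dominated = any(
            np.all(EV[j] >= EV[i]) and np.any(EV[j] > EV[i])
            for j in range(len(EV)) if j != i
        )
        if not dominated:
            permissible.append(i)
    return permissible

print(maximality_permissible(EV))  # [0, 1]: both undominated options stay permissible
```

Neither of options 0 and 1 beats the other under both distributions, so both stay in the permissible set; that’s exactly the point where further principles (or more deliberation) have to take over.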
I do of course need to choose an action eventually. But I have some decision-theoretic uncertainty. So, given the time to do so, I want to deliberate about which ways of narrowing down this set of options further seem most reasonable (i.e., satisfy principles of rational choice I find compelling).
(Basically I think EU maximization is a special case of “narrow down the permissible set as much as you can via principles of rational choice,[1] then just pick something from whatever remains.” It’s so straightforward in this case that we don’t even recognize we’re identifying a (singleton) “permissible set.”)
Now, maybe you’d just want to model this situation like: “For embedded agents, ‘deliberation’ is just an option like any other. Your revealed strict preference is to deliberate about rational choice.” I might be fine with this model.[2] But:
For the purposes of discussing how {the VOI of deliberation about rational choice} compares to {the value of going with our current “best guess” in some sense}, I find it conceptually helpful to think of “choosing to deliberate about rational choice” as qualitatively different from other choices.
The procedure I use to decide to deliberate about rational choice principles is not “I maximize EV w.r.t. some beliefs,” it’s “I see that my permissible set is not a singleton, I want more action-guidance, so I look for more action-guidance.”
It seems to me like you were like: “why not regiment one’s thinking xyz-ly?” (in your original question), to which I was like “if one regiments one thinking xyz-ly, then it’s an utter disaster” (in that bullet point), and now you’re like “even if it’s an utter disaster, I don’t care
My claim is that your notion of “utter disaster” presumes that a consequentialist under deep uncertainty has some sense of what to do, such that they don’t consider ~everything permissible. This begs the question against severe imprecision. I don’t really see why we should expect our pretheoretic intuitions about the verdicts of a value system as weird as impartial longtermist consequentialism, under uncertainty as severe as ours, to be a guide to our epistemics.
I agree that intuitively it’s a very strange and disturbing verdict that ~everything is permissible! But that seems to be the fault of impartial longtermist consequentialism, not imprecise beliefs.
The branch that’s about sequential decision-making, you mean? I’m unconvinced by this too, see e.g. here — I’d appreciate more explicit arguments for this being “nonsense.”
my response: “The arrow does not point toward most Sun prayer decision rules. In fact, it only points toward the ones that are secretly bayesian expected utility maximization. Anyway, I feel like this does very little to address my original point that there is this big red arrow pointing toward bayesian expected utility maximization and no big red arrow pointing toward Sun prayer decision rules.”
I don’t really understand your point, sorry. “Big red arrows towards X” are only a problem for doing Y if (1) they tell me that doing Y is inconsistent with doing [the form of X that’s necessary to avoid leaving value on the table]. And these arrows aren’t action-guiding for me unless (2) they tell me which particular variant of X to do. I’ve argued that there is no sense in which either (1) or (2) is true. Further, I think there are various big green arrows towards Y, as sketched in the SEP article and Mogensen paper I linked in the OP, though I understand if these aren’t fully satisfying positive arguments. (I tentatively plan to write such positive arguments up elsewhere.)
I’m just not swayed by vibes-level “arrows” if there isn’t an argument that my approach is leaving value on the table by my lights, or that you have a particular approach that doesn’t do so.
Addendum: The approach I take in “Ex ante sure losses are irrelevant if you never actually occupy the ex ante perspective” has precedent in Hedden (2015)’s defense of “time-slice rationality,” which I highly recommend. Relevant quote:
I am unmoved by the Diachronic Dutch Book Argument, whether for Conditionalization or for Reflection. This is because from the perspective of Time-Slice Rationality, it is question-begging. It is uncontroversial that collections of distinct agents can act in a way that predictably produces a mutually disadvantageous outcome without there being any irrationality. The defender of the Diachronic Dutch Book Argument must assume that this cannot happen with collections of time-slices of the same agent; if a collection of time-slices of the same agent predictably produces a disadvantageous outcome, there is ipso facto something irrational going on. Needless to say, this assumption will not be granted by the defender of Time-Slice Rationality, who thinks that the relationship between time-slices of the same agent is not importantly different, for purposes of rational evaluation, from the relationship between time-slices of distinct agents.
I reject the premise that my beliefs are equivalent to my betting odds. My betting odds are a decision, which I derive from my beliefs.
It’s not that I “find it unlikely on priors” — I’m literally asking what your prior on the proposition I mentioned is, and why you endorse that prior. If you answered that, I could answer why I’m skeptical that that prior really is the unique representation of your state of knowledge. (It might well be the unique representation of the most-salient-to-you intuitions about the proposition, but that’s not your state of knowledge.) I don’t know what further positive argument you’re looking for.
really ridiculously strong claim
What’s your prior that in 1000 years, an Earth-originating superintelligence will be aligned to object-level values close to those of humans alive today [for whatever operationalization of “object-level” or “close” you like]? And why do you think that prior is the unique accurate representation of your state of knowledge? Seems to me like the view that a single prior does accurately represent your state of knowledge is the strong claim. I don’t see how the rest of your comment answers this.
(Maybe you have in mind a very different conception of “represent” or “state of knowledge” than I do.)
And indeed, it is easy to come up with a case where the action that gets chosen is not best according to any distribution in your set of distributions: let there be one action which is uniformly fine and also for each distribution in the set, let there be an action which is great according to that distribution and disastrous according to every other distribution; the uniformly fine action gets selected, but this isn’t EV max for any distribution in your representor.
Oops sorry, my claim had the implicit assumptions that (1) your representor includes all the convex combinations, and (2) you can use mixed strategies. ((2) is standard in decision theory, and I think (1) is a reasonable assumption — if I feel clueless as to how much I endorse distribution p vs distribution q, it seems weird for me to still be confident that I don’t endorse a mixture of the two.)
If those assumptions hold, I think you can show that the max-regret-minimizing action maximizes EV w.r.t. some distribution in your representor. I don’t have a proof on hand but would welcome counterexamples. In your example, you can check that either the uniformly fine action does best on a mixture distribution, or a mix of the other actions does best (lmk if spelling this out would be helpful).
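Here’s a minimal numerical check on the example above (payoff numbers made up for illustration; rows are actions, columns are the distributions in the representor):

```python
import numpy as np

# EV[i, j] = expected utility of action i under distribution j in the representor.
EV = np.array([
    [  1.0,   1.0,   1.0],  # a0: uniformly fine
    [ 10.0, -10.0, -10.0],  # a1: great under p1, disastrous under the others
    [-10.0,  10.0, -10.0],  # a2
    [-10.0, -10.0,  10.0],  # a3
])

# Max regret of each pure action across the distributions.
regret = EV.max(axis=0) - EV
print(regret.max(axis=1))  # [ 9. 20. 20. 20.]: a0 minimizes max regret
# (Mixing a1-a3 equally doesn't help: EV is -10/3 under every p_i, so max regret is 10 + 10/3 > 9.)

# And a0 is the EV-maximizer under a convex combination of the p_i's,
# namely the uniform mixture q = (p1 + p2 + p3)/3:
q = np.full(3, 1/3)
print(EV @ q)  # [ 1.   -3.33 -3.33 -3.33]: a0 comes out on top
```

So in this example the regret-minimizing choice is also EV-maximizing for the uniform mixture, consistent with the claim (given assumptions (1) and (2)).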
Adding to Jesse’s comment, the “We’ve often heard things along the lines of...” line refers both to personal communications and to various comments we’ve seen, e.g.:
[link]: “Since this intuition leads to the (surely false) conclusion that a rational beneficent agent might just as well support the For Malaria Foundation as the Against Malaria Foundation, it seems to me that we have very good reason to reject that theoretical intuition”
[link]: “including a few mildly stubborn credence functions in some judiciously chosen representors can entail effective altruism from the longtermist perspective is a fool’s errand. Yet this seems false”
[link]: “I think that if you try to get any meaningful mileage out of the maximality rule … basically everything becomes permissible, which seems highly undesirable”
(Also, as we point out in the post, this is only true insofar as you use maximality alone, applied to total consequences. You can still regard obviously evil things as unacceptable on non-consequentialist grounds, for example.)