tailcalled
The assumption of virtue ethics isn’t that virtue is unknown and must be discovered—it’s that it’s known and must be pursued.
If it is known, then why do you never answer my queries about providing an explicit algorithm for converting intelligence into virtuous agency, instead of running in circles about how There Must Be A Utility Function?!
If the virtuous action, as you posit, is to consume ice cream, intelligence would allow an agent to acquire more ice cream, eat more over time by not making themselves sick, etc.
I’m not disagreeing with this. I’m saying that if you apply the arguments which show that you can fit a utility function to any policy to the policies that turn down some ice cream, then increasing intelligence (which increases the pursuit of ice cream) makes the resulting policies score lower on the inferred utility function that values turning down ice cream.
But any such decision algorithm, for a virtue ethicist, is routing through continued re-evaluation of whether the acts are virtuous in the current context, not embracing some farcical LDT version of needing to pursue ice cream at all costs. Your assumption, which is evidently that the entire thing turns into a compressed and decontextualized utility function (“algorithm”), is ignoring the entire hypothetical.
You’re the one who said that virtue ethics implies a utility function! I didn’t say anything about it being compressed and decontextualized, except as a hypothetical example of what virtue ethics is, because you refused to provide an implementation of virtue ethics and instead required abstracting over it.
I’m not interested in continuing this conversation until you stop strawmanning me.
No, that’s not my argument.
Let’s imagine that True Virtue is seeking and eating ice cream, but that you don’t know what true virtue is for some reason.
Now let’s imagine that we have some algorithm for turning intelligence into virtuous agency. (This is not an assumption that I’m willing to grant (since you haven’t given something like argmax for virtue), and really that’s the biggest issue with my proposal, but let’s entertain it to see my point.)
If the algorithm is run on the basis of some implementation of intelligence that is not good enough, then the resulting agent might turn down some opportunities to get ice cream, by mistake, and instead do something else, such as pursue money (but less money than you could get the ice cream for). As a result of this, you would conclude that pursuing ice cream is not virtuous, or at least, not as virtuous as pursuing money.
If you then turn up the level of intelligence, the resulting agent would pursue ice cream in this situation where it previously pursued money. However, this would make it score worse on your inferred utility function where pursuing money is more virtuous than pursuing ice cream.
Now of course you could say that your conclusion that pursuing ice cream is less virtuous than pursuing money is wrong. But then you can only say that if you grant that you cannot infer a virtue-ethical utility function from a virtue-ethical policy, as this utility function was inferred from the policy.
I didn’t say you need to understand what an argument is, I said you need to understand your own argument.
It is true that if the utility functions cover a sufficiently broad set of possibilities, any “reasonable” policy (for a controversial definition of “reasonable”) maximizes a utility function, and if the utility functions cover an even broader set of possibilities, literally any policy maximizes a utility function.
But, if you want to reference these facts, you should know why they are true. For instance, here’s a rough sketch of a method for finding a utility function for the first statement:
If you ask a reasonable policy to pick between two options, it shouldn’t have circular preferences, so you should be able to offer it different options and follow the preferred one until you find the absolute best scenario according to the policy. Similarly, you should be able to follow the dispreferred one until you find the absolute worst scenario according to the policy. Then you can define the utility of any outcome as the point in the probability mixtures of the best and worst scenario where the policy switches between preferring the outcome and preferring the mixture.
Now let’s say there’s an option where e.g. you’re not smart enough to realize that option gives you ice cream. Then you won’t be counting the ice cream when you decide at what threshold you prefer that option to the mixture. But then that means the induced utility function won’t include the preference for ice cream.
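As a rough illustration, here is a minimal sketch of that construction, assuming a hypothetical `prefers(a, b)` oracle that reports the policy’s pairwise choices over outcomes and best/worst mixtures (the names and representation are mine, purely illustrative):

```python
# Sketch of the lottery-based construction above, under the stated assumptions.
# Mixtures are represented as ("mix", p, best, worst): best with probability p,
# otherwise worst.

def utility_of(outcome, best, worst, prefers, tol=1e-3):
    """Binary-search for the probability p at which the policy switches between
    preferring `outcome` and preferring the p*best + (1-p)*worst mixture.
    That switching point is the outcome's utility on a 0-to-1 scale."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        p = (lo + hi) / 2
        mixture = ("mix", p, best, worst)
        if prefers(outcome, mixture):
            lo = p  # outcome still preferred, so its utility is above p
        else:
            hi = p  # mixture preferred, so the utility is below p
    return (lo + hi) / 2

# Note: if `prefers` systematically fails to notice that some option yields ice
# cream, the returned utilities simply encode that blind spot, which is the
# point made above.
```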
I’m showing that the assumptions necessary for your argument don’t hold, so you need to better understand your own argument.
The methods for converting policies to utility functions assume no systematic errors, which doesn’t seem compatible with varying the intelligence levels.
This.
In particular imagine if the state space of the MDP factors into three variables x, y and z, and the agent has a bunch of actions with complicated influence on x, y and z but also just some actions that override y directly with a given value.
In some such MDPs, you might want a policy that does nothing other than copy a specific function of x to y. This policy could easily be seen as a virtue, e.g. if x is some type of event and y is some logging or broadcasting input, then it would be a sort of information-sharing virtue.
While there are certain circumstances where consequentialism can specify this virtue, it’s quite difficult to do in general. (E.g. you can’t just minimize the difference between f(x) and y because then it might manipulate x instead of y.)
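A toy sketch of what I mean (the state representation and names are just illustrative, not a standard formalism):

```python
# Toy version of the factored MDP described above: the "information-sharing
# virtue" is simply to override y with f(x), whatever else is going on.

def virtue_policy(state, f):
    """Ignore everything else; just copy f(x) into y."""
    return {"set_y": f(state["x"])}

# A naive consequentialist objective for the same behavior, e.g. minimizing the
# mismatch between f(x) and y, is underspecified: an optimizer can also reduce
# this loss by steering x toward easy-to-report values instead of updating y.
def naive_mismatch_loss(state, f):
    return 0 if state["y"] == f(state["x"]) else 1
```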
I didn’t claim virtue ethics says not to predict consequences of actions. I said that a virtue is more like a procedure than it is like a utility function. A procedure can include a subroutine predicting the consequences of actions and it doesn’t become any more of a utility function by that.
The notion that “intelligence is channeled differently” under virtue ethics requires some sort of rule, like the consequentialist argmax or Bayes, for converting intelligence into ways of choosing.
Consequentialism is an approach for converting intelligence (the ability to make use of symmetries to e.g. generalize information from one context into predictions in another context or to e.g. search through highly structured search spaces) into agency, as one can use the intelligence to predict the consequences of actions and find a policy which achieves some criterion unusually well.
While it seems intuitively appealing that non-consequentialist approaches could be used to convert intelligence into agency, I have tried a lot and not been able to come up with anything convincing. For virtues in particular, I would intuitively think that a virtue is not a motivator per se, but rather the policy generated by the motivator. So I think virtue-driven AI agency just reduces to ordinary programming/GOFAI, and that there’s no general virtue-ethical algorithm to convert intelligence into agency.
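To make the contrast concrete, here is a minimal sketch of the consequentialist conversion rule I have in mind; `predict` and `criterion` stand in for the intelligence and the goal, and are placeholders rather than a specific proposal:

```python
# Consequentialism as a rule for turning a predictive model into a way of
# choosing: score each action by its predicted consequences and take the best.
def consequentialist_choice(state, actions, predict, criterion):
    return max(actions, key=lambda a: criterion(predict(state, a)))
```

The claim is that there is no analogous generic slot for virtues: on this view a virtue already is (or generates) the policy, so better prediction has nothing general to plug into.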
The most straightforward approach to programming a loyal friend would be to let the structure of the program mirror the structure[1] of the loyal friendship. That is, you would think of some situation that a loyal friend might encounter, and write some code that detects and handles this situation. Having a program whose internal structure mirrors its external behavior avoids instrumental convergence (or any kind of convergence) because each behavior is specified separately and one can make arbitrary exceptions as one sees fit. However, it also means that the development and maintenance burden scales directly with how many situations the program generalizes to.
- ^
This is the “standard” way to write programs—e.g. if you make a SaaS app, you often have template files with a fairly 1:1 correspondence to the user interface, database columns with a 1:many correspondence to the user interface fields, etc. By contrast, a chess bot that does a tree search does not have a 1:1 correspondence between the code and the plays; for instance the piece value table does not clearly affect its behavior in any one situation, but obviously kinda affects its behavior in almost all situations. (I don’t think consequentialism is the only way for the structure of a program to not mirror the structure of its behavior, but it’s the most obvious way.)
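Returning to the main point: a sketch of what “the structure of the program mirrors the structure of the behavior” looks like for the loyal-friend example (the situations and responses are invented examples):

```python
# Each situation a loyal friend might face gets its own hand-written handler,
# and exceptions are just more branches. Nothing converges on anything, but the
# code grows with every new situation you want it to handle.
def loyal_friend_response(situation):
    if situation == "friend is sick":
        return "bring soup and check in tomorrow"
    elif situation == "friend is being badmouthed":
        return "defend them, then tell them privately"
    elif situation == "friend asks to borrow money":
        return "lend a modest amount without interest"
    else:
        # Unhandled situations require writing new code by hand; this is the
        # maintenance burden described above.
        return "do nothing special"
```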
Not sure what you mean. Are you doing a definitional dispute about what counts as the “standard” definition of Bayesian networks?
Your linked paper is kind of long—is there a single part of it that summarizes the scoring so I don’t have to read all of it?
Either way, yes, it does seem plausible that one could create a market structure that supports latent variables without rewarding people in the way I described it.
I’m not convinced Scott Alexander’s mistakes page accurately tracks his mistakes. E.g. the mistake on it I know the most about is this one:
56: (5/27/23) In Raise Your Threshold For Accusing People Of Faking Bisexuality, I cited a study finding that most men’s genital arousal tracked their stated sexual orientation (ie straight men were aroused by women, gay men were aroused by men, bi men were aroused by either), but women’s genital arousal seemed to follow a bisexual pattern regardless of what orientation they thought they were—and concluded that although men’s orientation seemed hard-coded, women’s orientation must be more psychological. But Ozy cites a followup study showing that women (though not men) also show genital arousal in response to chimps having sex, suggesting women’s genital arousal doesn’t track actual attraction and is just some sort of mechanical process triggered by sexual stimuli. I should not have interpreted the results of genital arousal studies as necessarily implying attraction.
But that’s basically wrong. The study found women’s arousal to chimps having sex to be very close to their arousal to nonsexual stimuli, and far below their arousal to sexual stimuli.
I mean I don’t really believe the premises of the question. But I took “Even if you’re not a fan of automating alignment, if we do make it to that point we might as well give it a shot!” to imply that even in such a circumstance, you still want me to come up with some sort of answer.
Life on earth started 3.5 billion years ago. log₂(3.5 billion years / 1 hour) ≈ 45 doublings. With one doubling every 7 months, that makes about 26 years, i.e. 2051.
(Obviously this model underestimates the difficulty of getting superalignment to work. But also, extrapolating the METR trend for 45 doublings is dubious, in an unknown direction. So whatever.)
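For what it’s worth, the arithmetic checks out (constants as stated above):

```python
import math

hours = 3.5e9 * 365.25 * 24        # 3.5 billion years expressed in hours
doublings = math.log2(hours / 1)   # doublings from a 1-hour horizon: ~44.8
years = doublings * 7 / 12         # at 7 months per doubling
print(round(doublings), round(years))  # -> 45 26
```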
I talk to geneticists (mostly on Twitter, or rather now BlueSky) and they don’t really know about this stuff.
(Presumably there exists some standard text about this that one can just link to lol.)
I don’t think so.
I’m still curious whether this actually happens… I guess you can have the “propensity” be near its ceiling… (I thought that didn’t make sense, but I guess you sometimes have the probability of disease for a near-ceiling propensity be some number like 20% rather than 100%?) I guess intuitively it seems a bit weird for a disease to have disjunctive causes like this, but then be able to max out the risk at 20% with just one of the disjunctive causes? IDK. Likewise personality...
For something like divorce, you could imagine the following causes:
Most common cause is you married someone who just sucks
… but maybe you married a closeted gay person
… or maybe your partner was good but then got cancer and you decided to abandon them rather than support them through the treatment
The genetic propensities for these three things are probably pretty different. If you’ve married someone who just sucks, then a counterfactually higher genetic propensity to marry people who suck might counterfactually lead to having married someone who sucks more. But a counterfactually higher genetic propensity to marry a closeted gay person probably wouldn’t lead to counterfactually having married someone who sucks more, nor have much counterfactual effect on them being gay (because it’s probably a nonlinear thing), so only the genetic propensity to marry someone who sucks matters.
In fact, probably the genetic propensity to marry someone who sucks is inversely related to the genetic propensity to divorce someone who encounters hardship, so the final cause of divorce is probably even more distinct from the first one.
Ok, more specifically, the decrease in the narrowsense heritability gets “double-counted”: once you’ve computed the reduced coefficients and start making predictions, those coefficients get applied to the people who are low in the first chunk and not just to the people who are high in it. The decrease in the broadsense heritability, by contrast, is only single-counted. Since the single-counting represents a genuine reduction while the double-counting represents a bias, it only really makes sense to think of the double-counting as pathological.
It would decrease the narrowsense (or additive) heritability, which you can basically think of as the squared length of your coefficient vector, but it wouldn’t decrease the broadsense heritability, which is basically the phenotypic variance in expected trait levels you’d get by shuffling around the genotypes. The missing heritability problem is that when we measure these two heritabilities, the former heritability is lower than the latter.
If some amount of heritability is from the second chunk, then to that extent, there’s a bunch of pairs of people whose trait differences are explained by second chunk differences. If you made a PGS, you’d see these pairs of people and then you’d find out how specifically the second chunk affects the trait.
This only applies if the people are low in the first chunk and differ in the second chunk. Among the people who are high in the first chunk but differ in the second chunk, the logarithm of their trait level will be basically the same regardless of the second chunk (because the logarithm suppresses things by the total), so these people will reduce the PGS coefficients rather than increase them. When you create the PGS, you include both groups, so the PGS coefficients will be biased downwards.
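A toy simulation of this story (my own construction, purely to illustrate the mechanism: each chunk contributes a risk term, the observed trait is the log of the summed risks, and an additive PGS then explains less variance than the full genetic value does):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200_000, 50
g1 = rng.binomial(2, 0.5, (n, m)).astype(float)  # chunk-1 genotypes (0/1/2)
g2 = rng.binomial(2, 0.5, (n, m)).astype(float)  # chunk-2 genotypes (0/1/2)
c1 = (g1 - 1.0).sum(axis=1) / np.sqrt(m / 2)     # standardized chunk-1 score
c2 = (g2 - 1.0).sum(axis=1) / np.sqrt(m / 2)     # standardized chunk-2 score

genetic_value = np.logaddexp(2 * c1, 2 * c2)     # log(exp(2*c1) + exp(2*c2))
trait = genetic_value + rng.normal(0, 0.3, n)    # plus some environmental noise

# Broadsense heritability: variance of the expected trait given the full genotype.
broad = genetic_value.var() / trait.var()

# Narrowsense heritability: variance explained by the best additive fit (a PGS).
X = np.column_stack([g1, g2, np.ones(n)])
beta, *_ = np.linalg.lstsq(X, trait, rcond=None)
narrow = (X @ beta).var() / trait.var()

print(round(broad, 2), round(narrow, 2))  # narrowsense comes out below broadsense
```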
My current best guess is that:
Like for most other concepts, we don’t have rigorous statistics and measurements showing that there is a natural clustering of autism symptoms (there are some non-rigorous ones, though),
When various schools of psychotherapy, psychiatry and pediatrics sorted children with behavioral issues together, they often ended up with an autistic group,
Each school has its own diagnosis of what exactly is wrong in the case of autism, and presumably they aren’t all correct about all autistic people, so to know the True Reason autism is “a thing”, you’d first have to figure out which school is correct in its analysis of autism,
“Autism” as a concept exists because the different schools mostly agreed that the kids in question had a similar pathology, even if they disagreed on what the pathology is.