Is that a critique of expected utility maximization in general, or are you saying that concave functions of wealth aren’t risk-averse enough?
Why would maximizing expectation on a concave utility function lead to losing your shirt? It seems like any course of action that predictably leads to losing your shirt is self-evidently not maximizing expected concave-utility-function, unless it’s a Pascal mugging type scenario. I don’t think there are credible Pascal muggings in the world of personal finance, and if there are I’d be willing to accept an ad hoc axiom that we limit our theory to more conventional investments.
Now, I’ll admit it’s possible we should have a loss averse utility function, but we can do that without abandoning the mathematical approach—just add a time derivative of wealth, or something.
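To make that concrete, one made-up functional form (purely illustrative; I'm not attached to the details) would be

$$u(t) = \log c(t) + \lambda \min\left(0, \frac{dW}{dt}\right)$$

where $c(t)$ is spending per unit time, $W$ is wealth, and $\lambda > 0$ weights how much we dislike seeing wealth fall.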
Has anyone developed a quantitative theory of personal finance in the following sense?
Most money advice falls back on rules of thumb; I’m looking for an approach that’s made-up numbers all the way down.
The main idea would be to express utility as a function of financial quantities; an obvious candidate would be utility per unit time equals the log of money spent per unit time, making sure to count things like imputed rent on owned property as spending. Once you have that, there’s an exact answer to the optimal risk/reward tradeoff in investments, how much to save/borrow, etc.
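As a toy illustration of the "made-up numbers all the way down" approach (every number and distribution here is invented for the example), here's a sketch that picks the fraction of savings to hold in a risky asset by maximizing expected log utility of spending:

```python
import numpy as np

# Toy sketch: choose the fraction of savings held in a risky asset by
# maximizing expected log utility of spending. All numbers are made up.

def expected_log_utility(risky_fraction, wealth=100_000, n_samples=100_000, seed=0):
    rng = np.random.default_rng(seed)
    risky_returns = rng.normal(loc=0.05, scale=0.25, size=n_samples)  # assumed risky asset
    safe_return = 0.01                                                # assumed safe asset
    end_wealth = wealth * (1 + risky_fraction * risky_returns
                           + (1 - risky_fraction) * safe_return)
    end_wealth = np.maximum(end_wealth, 1e-9)  # guard against log of non-positive wealth
    # Spend a fixed 4% of end-of-year wealth; utility = log of that spending.
    return np.log(0.04 * end_wealth).mean()

fractions = np.linspace(0, 1, 21)
best = max(fractions, key=expected_log_utility)
print(f"Utility-maximizing risky fraction under these assumptions: {best:.2f}")
```

The point isn't the particular answer, it's that once the utility function and return assumptions are written down, the risk/reward tradeoff stops being a rule of thumb and becomes a calculation.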
I don’t intend this as a demand, but you may wish to edit your top comment.
As it stands, the first line of the first comment on this post is “Avoid this program.” Based on the comments in this thread it sounds like you think the program might be a good fit for some people.
Glad to have this term. I do think there’s a non-fallacious, superficially similar argument that goes something like this:
“X leads to Y. This is obvious, and the only way you could doubt it would be some sort of motivated reasoning—motivated by something other than preventing Y. Therefore, if you don’t think X leads to Y, you aren’t very motivated to prevent Y.”
It’s philosophically valid, but requires some very strong claims. I also suspect it’s prone to causing circular reasoning, where you’ve ‘proven’ that no one who cares about Y thinks X doesn’t lead to Y and then use that belief to discredit new arguments that X doesn’t lead to Y.
As someone with no knowledge of NNTP, I’m in favor of this sequence. As far as I’m concerned, much of it looks like on-topic craft/community material.
I didn’t downvote, but I don’t like your statement. I mostly agree with the biological facts, but you state them as if they apply directly and straightforwardly to the post’s question about human affairs. If applied in the most obvious way, they lead to the unfortunate implications, but I don’t think that application really makes sense. And I can’t help suspecting these apparent implications are a result of motivated stopping.
I think we need to provide some kind of prior regarding unknown features of model and reward if we want the given model and reward to mean anything. Otherwise, for all the AI knows, the true reward has a +2-per-step term that reverses the reward-over-time feature. It can still infer the algorithm generating the sample trajectories, but the known reward is no help at all in doing so.
I think what we want is for the stated reward to function as a hint. One interpretation might be to expect that the stated reward should approximate the true reward well over the problem and solution domains humans have thought about. This works in, for instance, the case where you put an AI in charge of the paper clip factory with the stated reward ‘+1 per paper clip produced’.
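Here's a toy sketch of what I mean by "hint" (the state names and numbers are all invented for illustration): the stated reward constrains candidate true rewards only on the states humans have actually thought about, so candidates that diverge wildly off that domain can all remain consistent with it.

```python
# Toy illustration: the stated reward acts as a hint that constrains
# candidate true rewards only on the human-considered domain.
# All state names and numbers are made up for the example.

human_domain = ["normal_operation", "slow_day", "maintenance"]

STATED = {"normal_operation": 1.0, "slow_day": 0.2, "maintenance": 0.0,
          "convert_biosphere_to_clips": 1e9, "bribe_the_inspectors": 5.0}

def stated_reward(state):
    return STATED[state]

candidate_true_rewards = {
    "literal reading": stated_reward,
    "intended reading": lambda s: STATED[s] if s in human_domain else -1e6,
}

def consistent_with_hint(candidate, tolerance=0.01):
    # The hint only pins down values on states humans have considered.
    return all(abs(candidate(s) - stated_reward(s)) <= tolerance
               for s in human_domain)

for name, candidate in candidate_true_rewards.items():
    print(name, "consistent with the stated reward:", consistent_with_hint(candidate))
# Both come out consistent, even though they disagree wildly off-domain.
```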
I’m not sure I’ve encountered these more advanced versions. Is there a link?
I’ve taken a few shots at it, e.g. at http://lesswrong.com/r/discussion/lw/mfq/presidents_asteroids_natural_categories_and/cjkr and http://lesswrong.com/lw/m25/high_impact_from_low_impact/cah1. There’s no explicit role-playing, but I was very much in the mindset of trying to break the protection scheme.
I haven’t been keeping up with these posts as well lately.
For the Chesterton’s Fence objection to have properly applied to the NHS, it would have to have been the case that no one could explain the historical lack of an NHS. Yet I think it’s pretty easily explained by governments’ values over time: first kleptocratic, then libertarianish, and only becoming utilitarian roughly around the time of the NHS, to simplify heavily.
Exactly how established is the track record of taking down fences without an understanding of why they were put up? A great many of liberalism’s target fences over the years have been readily explained by being in the interests of the powerful (e.g. monarchy/aristocracy, slavery).
That’s essentially what these posts are to me, except instead of a video game it’s pen-and-paper with Stuart Armstrong as DM :).
It might be worth writing up a framing with evil AI designers applying the proposed controls, for the extra motivation. I’ll consider doing this on future posts.
The desire to look at calibration rather than prediction-score comes from the fact that calibration at least kind of seems like something you could fairly compare across different prediction sets. Comparing Scott’s 2015 vs. 2014 prediction scores might just reflect which year had more predictable events. In theory it’s also possible that one year’s uncertainties are objectively harder to calibrate, but this seems less likely.
The best procedure is probably to just make a good-faith effort to choose predictions based on interest and predict as though maximizing prediction score. If one wanted to properly align incentives, one might try the following procedure: 1) announce a set of things to predict, but not the predictions themselves; 2) have another party pledge to reward you (with cash or a charity donation, probably) in proportion to your prediction score*, with a multiplier based on how hard they think your prediction topics are; 3) make your predictions.
There’s a bit of a hurdle in that the score’s range is negative infinity to zero. One solution would be to set a maximum allowed confidence to make the range finite—for instance, if 99% is the maximum, the worst possible score per prediction would be ln(0.01) ≈ −4.6, so a reward of (4.6 + score) would produce the right incentives.
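A quick sketch of that arithmetic (assuming the log scoring rule, i.e. score = ln of the probability assigned to what actually happened):

```python
import math

MAX_CONFIDENCE = 0.99                        # cap stated probabilities at 99%
WORST_SCORE = math.log(1 - MAX_CONFIDENCE)   # ln(0.01), roughly -4.6

def log_score(prob_assigned, came_true):
    # Clamp to the allowed confidence range, then score the outcome.
    p = min(max(prob_assigned, 1 - MAX_CONFIDENCE), MAX_CONFIDENCE)
    return math.log(p if came_true else 1 - p)

def reward(predictions):
    # predictions: list of (probability assigned, whether it came true).
    # Shifting each score by -WORST_SCORE keeps every per-prediction
    # reward non-negative while preserving the incentive to report
    # honest probabilities (within the allowed confidence range).
    return sum(log_score(p, outcome) - WORST_SCORE for p, outcome in predictions)

example = [(0.9, True), (0.7, False), (0.99, True)]
print(round(reward(example), 2))
```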
The moderator may be reacting to a pattern that’s clearly ban-worthy, but nonetheless hard to verbalize exactly, and thus misreport their real reason. Verbal reporting is hard.
This. If I read the ban announcement legalistically, I disagree with it. But if I read the offending post, together with multiple users’ assurances that AA’s posts were basically all like that—I don’t want that in my garden.
Actually, I notice that BBn vs. BBm is isomorphic to CBn vs. CBm! Just interchange ‘swerve’ and ‘don’t swerve’ in the specification of one to convert it into the other. This implies that BBn swerves against BBm, and BBm does not swerve, if my proof about CBn vs. CBm is valid. I’m no longer so sure it is...
Per AlexMennen, I’d rename CarefulBot-as-formulated to something like BoldBot, and apply the name CarefulBot to “swerve unless it’s proven the other bot swerves.”
I think (I’m an amateur here, full disclosure; could be mistakes) we can capture some of the behavior you’re interested in by considering bots with different levels of proof system. So e.g. CBn swerves unless it can prove, using PA+n, that the opponent swerves.
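To spell out the definitions I'm working with (my notation, and the BoldBot half is my reading of it; S = swerve, D = don't swerve, X = the opposing bot):

$$\mathrm{CB}_n(X) = \begin{cases} D & \text{if } \mathrm{PA}{+}n \vdash \ulcorner X(\mathrm{CB}_n) = S \urcorner \\ S & \text{otherwise,} \end{cases} \qquad \mathrm{BB}_n(X) = \begin{cases} S & \text{if } \mathrm{PA}{+}n \vdash \ulcorner X(\mathrm{BB}_n) = D \urcorner \\ D & \text{otherwise.} \end{cases}$$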
Then we see that for n > m, CBn(CBm) = D. Proof: Suppose CBn doesn’t swerve. In PA+n, it is an axiom that provability in PA+m implies truth. Therefore PA+m cannot prove that CBn swerves. In a contest between CarefulBots, the stronger proof system wins.
Now consider BoldBot. We can see that BBn does not swerve against BBn, because by symmetry both agents prove the same thing; and if they both proved their opponent does not swerve, they would both swerve, and both have proven a falsehood.
Analyzing BBn vs. BBm is proving difficult, but I’m going to try a bit more.
The rock wins at chicken, for any model that accurately describes its behavior. One such model is as an agent with a game-appropriate utility function and zero intelligence. Therefore, an agent with a game-appropriate utility function and zero intelligence wins at chicken (in the case as constructed).
It proves that we can construct a game where the less intelligent player’s lack of intelligence is an advantage. OP shows the same, but I find the rock example simpler and clearer—I especially find it illuminates the difficulties with trying to exploit the result.
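Concretely, a toy version (the payoff numbers are mine, just for illustration):

```python
# Toy chicken game: a "rock" that always goes straight beats an opponent
# that best-responds to a correct prediction of the rock's move.
# Payoffs are made up, listed as (row player, column player).

SWERVE, STRAIGHT = "swerve", "straight"
PAYOFF = {
    (SWERVE, SWERVE): (0, 0),
    (SWERVE, STRAIGHT): (-1, 1),
    (STRAIGHT, SWERVE): (1, -1),
    (STRAIGHT, STRAIGHT): (-10, -10),  # crash
}

def rock():
    # Zero optimization power: ignores everything and goes straight.
    return STRAIGHT

def best_responder(predicted_opponent_move):
    # Maximizes its own payoff given a correct prediction of the opponent.
    return max((SWERVE, STRAIGHT),
               key=lambda move: PAYOFF[(move, predicted_opponent_move)][0])

rock_move = rock()
smart_move = best_responder(rock_move)
print("rock payoff:", PAYOFF[(rock_move, smart_move)][0])   # the rock wins
print("smart payoff:", PAYOFF[(smart_move, rock_move)][0])  # the smart agent swerves
```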
We can model a rock as having no preferences, but we can also model it as having arbitrary preferences—including the appropriate payoff matrix for a given game—and zero ability to optimize the world to achieve them. We observe the same thing either way.
I’m still not sure which line you’re taking on this: A) Disputing the VNM formulation of rational behavior, under which a rational agent should maximize expected utility (https://en.wikipedia.org/wiki/Von_Neumann%E2%80%93Morgenstern_utility_theorem), or B) Disputing that we can write down an approximate utility function accurate enough to capture our risk preferences.