To keep the limits of the log argument in mind: log 50k is 10.8, log (50k+70k) is 11.69, and log 1 billion is 20.7 (natural logs).
Comparing these numbers tells you pretty much nothing. First of all, taking log($50k) is not a valid operation; you should only ever take logs of a dimensionless quantity. The standard solution is to pick an arbitrary dollar value $X, and compare log($50k/$X), log($120k/$X), and log($10^9/$X). This is equivalent to comparing 10.8 + C, 11.69 + C, and 20.7 + C, where C is an arbitrary constant.
This shouldn’t be a surprise, because under the standard definition, utility functions are translation-invariant. They are only compared in cases such as “is U1 better than U2?” or “is U1 better than a 50⁄50 chance of U2 and U3?” The answer to this question doesn’t change if we add a constant to U1, U2, and U3.
In particular, it’s invalid to say “U1 is twice as good as U2”. For that matter, even if you don’t like utility functions, this is suspicious in general: what does it mean to say “I would be twice as happy if I had a million dollars”?
It would make sense to say, if your utility for money is logarithmic and you currently have $50k, that you’re indifferent between a 100% chance of an extra $70k and an 8.8% chance of an extra $10^9 -- that being the probability for which the expected utilities are the same. If you think logarithmic utilities are bad, this is the claim you should be refuting.
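A quick sanity check of that 8.8% figure (a sketch in Python; it assumes utility is the natural log of total wealth, starting from $50k, as in the comment):

```python
import math

# Utility gain of a sure extra $70k, starting from $50k of wealth:
w = 50_000
gain_sure = math.log(w + 70_000) - math.log(w)

# Utility gain of an extra $10^9:
gain_big = math.log(w + 1_000_000_000) - math.log(w)

# Probability p at which p * gain_big matches the sure gain:
p = gain_sure / gain_big
print(round(p, 3))  # 0.088
```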
Goddammit I have a degree in mathematics and no-one ever told me that and I never figured it out for myself.
I see the beginnings of an explanation here [http://physics.stackexchange.com/questions/7668/fundamental-question-about-dimensional-analysis]. Any pointer to better explanation?
Taking logs of a dimensionful quantity is possible, if you know what you’re doing. (In math, we make up our own rules: no one is allowed to tell us what we can and cannot do. Whether or not it’s useful is another question.) Here’s the real scoop:
In physics, we only really and truly care about dimensionless quantities. These are the quantities which do not change when we change the system of units, i.e. they are “invariant”. Anything which is not invariant is a purely arbitrary human convention, which doesn’t really tell me anything about the world. For example, if I want to know if I fit through a door, I’m only interested in the ratio between my height and the height of the door. I don’t really care about how the door compares to some standard meter somewhere, except as an intermediate step in some calculation.
Nevertheless, for practical purposes it is convenient to also consider quantities which transform in a particularly simple way under a change of unit systems. Borrowing some terminology from general relativity, we can say that a quantity X is “covariant” if it transforms as X --> (unit1/unit2)^p X when we change from unit1 to unit2. Here p is a real number which indicates the dimension of the unit. These things aren’t invariant under a change of units, so we don’t care about them in a fundamental way. But they’re extremely useful nevertheless, because you can construct invariant quantities out of covariant ones by multiplying or dividing them in such a way that the units cancel out. (In the concrete example above, this allows us to measure the door and me separately, and wait until later to combine the results.)
Once you’re willing to accept numbers which depend on arbitrary human convention, nothing prevents you from taking logs or sines or whatever of these quantities (in the naive way, by just punching the number sans units into your calculator). What you end up with is a number which depends in a particularly complicated way on your system of units. Conceptually, that’s not really any worse. But remember, we only care if we can find a way to construct invariant quantities out of them. Practically speaking, our experience as physicists is that quantities like this are rarely useful.
But there may be exceptions. And logs aren’t really that bad, since as Kindly points out, you can still extract invariant quantities by adding them together. As a working physicist I’ve done calculations where it was useful to think about logs of dimensionful quantities (keywords: “entanglement entropy”, “conformal field theory”). Sines are a lot worse since they aren’t even monotonic functions: I can’t imagine any application where taking the sine of a dimensionful quantity would be useful.
I think it’d be obvious how to take the log of a dimensional quantity.
e^(log apple) = apple
Right, but then log (2 apple) = log 2 + log apple and so forth. This is a perfectly sensible way to think about things as long as you (not you specifically, but the general you) remember that “log apple” transforms additively instead of multiplicatively under a change of units.
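A numeric illustration of that additive transformation (my own toy example, with “apple” swapped for meters so the conversion factor is concrete):

```python
import math

# The same length, as a bare number in two unit systems:
x_m = 2.0          # "2 meter"
x_cm = x_m * 100   # "200 centimeter"

# The naive logs differ by an additive constant, log(100) -- exactly what
# "log meter --> log centimeter + log 100" predicts under a change of units:
shift = math.log(x_cm) - math.log(x_m)
print(math.isclose(shift, math.log(100)))  # True for any length
```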
Isn’t the argument to a sine by default a quantity of angle, that is, radians in SI? (I know radians are epiphenomenal/w/e, but still)
Machine learning methods will go right ahead and apply whatever collection of functions they’re given in whatever way works to get empirically accurate predictions from the data. E.g. add the patient’s temperature to their pulse rate and divide by the cotangent of their age in decades, or whatever.
So it can certainly be useful. Whether it is meaningful is another matter, and touches on this conundrum again. What and whence is “understanding” in an AGI?
Eliezer wrote somewhere about hypothetically being able to deduce special relativity from seeing an apple fall. What sort of mechanism could do that? Where might it get the idea that adding temperature to pulse may be useful for making empirical predictions, but useless for “understanding what is happening”, and what does that quoted phrase mean, in terms that one could program into an AGI?
“units are a useful error-checking homomorphism”
I don’t think “homomorphism” is quite the right word here. Keeping track of units means keeping track of various scaling actions on the things you’re interested in; in other words, it means keeping track of certain symmetries. The reason you can use this for error-checking is that if two things are equal, then any relevant symmetries have to act on them in the same way. But the units themselves aren’t a homomorphism, they’re just a shorthand to indicate that you’re working with things that transform in some nontrivial way under some symmetry.
The map from dimensional quantities to units is structure-preserving, so yes, it is a homomorphism between something like rings. For example, all distances in SI are mapped into the element “meter”, and all time intervals into the element “second”. Addition and subtraction are trivial under the map (e.g. m + m = m), and so is multiplication by a dimensionless quantity, while multiplication and division by a dimensional quantity generates new elements (e.g. meter per second).
Converting between different measurement systems (e.g. SI and CGS) adds various scale factors, thus enlarging the codomain of the map.
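A minimal sketch of that map (my own toy encoding, not a standard library): represent a unit as a dict of dimension exponents, so multiplication adds exponents and addition is only defined for like units:

```python
def mul_units(u1, u2):
    """Unit of a product: dimension exponents add (e.g. meter / second)."""
    out = dict(u1)
    for dim, p in u2.items():
        out[dim] = out.get(dim, 0) + p
        if out[dim] == 0:
            del out[dim]  # dimensionless factors drop out
    return out

def add_units(u1, u2):
    """Unit of a sum: only like quantities may be added (m + m = m)."""
    if u1 != u2:
        raise ValueError("cannot add quantities with different units")
    return dict(u1)

meter = {"L": 1}
per_second = {"T": -1}
print(add_units(meter, meter))       # {'L': 1}  -- m + m = m
print(mul_units(meter, per_second))  # {'L': 1, 'T': -1}  -- meter per second
```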
I don’t know of any good explanations; this seems relevant but requires a subscription to access. Unfortunately, no-one’s ever explained this to me either, so I’ve had to figure it out by myself.
What I’d add to the discussion you linked to is that in actual practice, logarithms appear in equations with units in them when you solve differential equations, and ultimately when you take integrals. In the simplest case, when we’re integrating 1/x, x can have any units whatsoever. However, if you have bounds A and B, you’ll get log(B) - log(A), which can be rewritten as log(B/A). There’s no way A and B can have different units, so B/A will be dimensionless.
Of course, often people are sloppy and will just keep doing things with log(B) and log(A), even though these don’t make sense by themselves. This is perfectly all right because the logs will have to cancel eventually. In fact, at this point, it’s even okay to drop the units on A and B, because log(10 ft) - log(5 ft) and log(10 m) - log(5 m) represent the same quantity.
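A numeric check of that last point (assuming the usual 1 ft = 0.3048 m conversion): the difference of logs depends only on the ratio of the bounds, so the unit system drops out:

```python
import math

# log(10) - log(5) with the magnitudes measured in feet...
d_feet = math.log(10) - math.log(5)
# ...and the same two physical lengths re-expressed in meters:
d_meters = math.log(10 * 0.3048) - math.log(5 * 0.3048)

# The unit cancels: both differences are log(2).
print(math.isclose(d_feet, math.log(2)), math.isclose(d_meters, math.log(2)))
```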
Most of that paper is the authors rebutting what other people have said about the issue, but there are two bits that try to explain why one can’t take logs of dimensional things.
Page 68 notes that y = log_b(x) if x = b^y, which “precludes the association of any physical dimension to any of the three variables b, x, and y”.
And on pages 69-70:
The reason for the necessity of including only dimensionless real numbers in the arguments of transcendental functions is not due to the [alleged] dimensional nonhomogeneity of the Taylor expansion, but rather to the lack of physical meaning of including dimensions and units in the arguments of these functions. This distinction must be clearly made to students of physical sciences early in their undergraduate education.
That second snippet is too vague for me. But I’m still thinking about the first one.
The (say) real sine function is defined such that its domain and codomain are (subsets of) the reals. The reals are usually characterized as the complete ordered field. I have never come across units that—taken alone—satisfy the axioms of a complete ordered field, and having several units introduces problems such as how we would impose a meaningful order. So a sine function over unit-ed quantities is sufficiently non-obvious as to require a clarification of what would be meant by sin($1).
For example—switching over now to logarithms—if we treat $1 as the real multiplicative identity (i.e. the real number, unity) unit-multiplied by the unit $, and extrapolate one of the fundamental properties of logarithms—that log(ab) = log(a) + log(b)—we find that log($1) = log($) + log(1) = log($) (assuming we keep that log(1) = 0). How are we to interpret log($)? Moreover, log($^2) = 2 log($). So if I log the square of a dollar, I obtain twice the log of a dollar. How are we to interpret this in the above context of utility?
Or an example from trigonometric functions: one characterization of the cosine and sine stipulates that cos^2 + sin^2 = 1, so we would have that cos^2($1) + sin^2($1) = 1. If this is the real unity, does this mean that the cosine function on dollars outputs a real number? Or if the RHS is $1, does this mean that the cosine function on dollars outputs a dollar^(1/2) value? Then consider that double, triple, etc. angles in the standard cosine function can be written as polynomials in the single-angle cosine. How would this translate?
So this is a case where the ‘burden of meaningfulness’ lies with proposing a meaningful interpretation (which now seems rather difficult), even though at first it seems obvious that there is a single reasonable way forward. The context of the functions needs to be considered; the sine function originated with plane geometry and was extended to the reals and then the complex numbers. Each of these was motivated by an (analytic) continuation into a bigger ‘domain’ that fit perfectly with existing understanding of that bigger domain; this doesn’t seem to be the case here.
You pick an arbitrary constant A of dimension “amount of money”, and use log(x/A) as a utility function. Changing A amounts to adding a constant to the utility (and changing the base of the logarithm amounts to multiplying it by a constant), which doesn’t affect expected utility maximization. EDIT: And once it’s clear that the choice of A is immaterial, you can abuse notation and just write “log(x)”, as Kindly says.
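A sketch of why the choice of A is immaterial, using made-up lotteries built from the dollar figures in this thread:

```python
import math

def expected_utility(lottery, A):
    """lottery: list of (probability, total dollars) pairs; U(x) = log(x / A)."""
    return sum(p * math.log(x / A) for p, x in lottery)

sure_thing = [(1.0, 120_000)]                       # $50k plus a sure $70k
gamble = [(0.088, 1_000_050_000), (0.912, 50_000)]  # 8.8% shot at an extra $10^9

# The comparison between the two lotteries comes out the same for every A,
# because changing A shifts both expected utilities by the same constant
# (the probabilities in each lottery sum to 1).
verdicts = [expected_utility(sure_thing, A) > expected_utility(gamble, A)
            for A in (1, 100, 123_456)]
print(verdicts)  # all three entries agree
```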
You can only add, subtract and compare like quantities, but log(50000 * 1 dollar) = log(50000) + log(1 dollar), which is a meaningless expression. What’s the logarithm of a dollar?
An arbitrary additive constant. See the last paragraph of Kindly’s comment.
What do you need to “exponate” to get a dollar?
That, whatever that might be, is the logarithm of a dollar.
Well, we could choose to factorise it as log(50000 dollars) = log(50000 dollar^0.5 * 1 dollar^0.5) = log(50000 dollar^0.5) + log(1 dollar^0.5). That does keep the units of the addition operands the same. Now we only have to figure out what the log of a root-dollar is...
It’s really just the same question again—why can’t I write log(1 dollar) = 0 (or maybe 0 dollar^0.5), the same as I would write log(1) = 0.
$1 = 100¢. Now try logging both sides by stripping off the currency units first!
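Worked out as a small Python check: stripping the units first gives inconsistent answers, and the inconsistency is exactly the conversion factor:

```python
import math

# "$1 = 100 cents", with units stripped before taking logs:
lhs = math.log(1)    # from the dollar side
rhs = math.log(100)  # from the cents side

# lhs != rhs: the two sides disagree by log(100), the unit-conversion factor.
# That additive ambiguity is why a bare log-of-money number is only defined
# up to a constant.
print(rhs - lhs == math.log(100))  # True
```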
This is what I did, without the pedantry of the C.
I don’t follow at all. How can utilities not be comparable in terms of multiplication? This falls out pretty much exactly from your classic cardinal utility function! You seem to be assuming ordinal utilities, but I don’t see why you would talk about something I did not draw on nor would accept.
The point is that because the constant is there, saying that utility grows logarithmically in money underspecifies the actual function. By ignoring C, you are implicitly using $1 as a point of comparison.
A generous interpretation of your claim would be to say that to someone who currently only has $1, having a billion dollars is twice as good as having $50000 -- in the sense, for example, that a 50% chance of the former is just as good as a 100% chance of the latter. This doesn’t seem outright implausible (having $50000 means you jump from “starving in the street” to “being more financially secure than I currently am”, which solves a lot of the problems that the $1 person has). However, it’s also irrelevant to someone who is guaranteed $50000 in all outcomes under consideration.
Then how do you suggest the person under discussion evaluate their working patterns if log utilities are only useful for expected values?
By comparing changes in utility as opposed to absolute values.
To the person with $50000, a change to $70000 would be a change in log utility of 0.336, and a change to $1 billion would be a change in log utility of 9.903. A change to $1 would be a change in log utility of −10.819.
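Those three figures check out (natural logs, $50,000 starting wealth):

```python
import math

w = 50_000  # starting wealth
for target in (70_000, 1_000_000_000, 1):
    # change in log utility when total wealth moves from w to target
    print(round(math.log(target / w), 3))
# prints 0.336, then 9.903, then -10.82
```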
I see, thanks.
“The utility of A is twice the utility of B” is not a statement that remains true if we add the same constant to both utilities, so it’s not an obviously meaningful statement. We can make the ratio come out however we want by performing an overall shift of the utility function. The fact that we think of utilities as cardinal numbers doesn’t mean we assign any meaning to ratios of utilities. But it seemed that you were trying to say that a person with a logarithmic utility function assesses $10^9 as having twice the utility of $50k.
Kindly says the ratios do have relevance to considering bets or risks.
Yes, I think I see my error now, but I think the force of the numbers is clear: log utility in money may be more extreme than most people would intuitively expect.
This is what I immediately thought when I first read about the Repugnant Conclusion on Wikipedia, years ago before having ever heard of the VNM axioms or anything like that.