DavidHolmes

Karma: 124

http://pub.math.leidenuniv.nl/~holmesdst/

DavidHolmes Feb 21, 2022, 9:02 PM
4 points
in reply to: interstice’s comment on: Understanding “Deep Double Descent”
Thank you for the quick reply! I’m thinking about section 5.1 on reparametrising the model, where they write:

every minimum is observationally equivalent to an infinitely sharp minimum and to an infinitely flat minimum when considering nonzero eigenvalues of the Hessian;

If we stick to section 4 (and so don’t allow reparametrisation) I agree there seems to be something more tricky going on. I initially assumed that I could e.g. modify the proof of Theorem 4 to make a sharp minimum flat by taking alpha to be big, but it doesn’t work like that (basically we’re looking at alpha + 1/alpha, which can easily be made big, but not very small). So maybe you are right that we can only make flat minima sharp and not conversely. I’d like to understand this better!
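
To spell out the asymmetry in alpha + 1/alpha (a sketch, assuming only that $\alpha > 0$; how exactly this quantity enters the sharpness measure in the paper is not reproduced here):
$\alpha + \frac{1}{\alpha} = \left(\sqrt{\alpha} - \frac{1}{\sqrt{\alpha}}\right)^{2} + 2 \geq 2,$
with equality exactly when $\alpha = 1$ . The expression tends to $\infty$ as $\alpha \to 0$ or $\alpha \to \infty$ , so it can be made arbitrarily large but never smaller than 2.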

DavidHolmes Feb 15, 2022, 9:35 AM
3 points
in reply to: interstice’s comment on: Understanding “Deep Double Descent”
I’m not sure I agree with interstice’s reading of the ‘sharp minima’ paper. As I understand it, they show that a given function can be made into a sharp or flat minimum by finding a suitable point in the parameter space mapping to the function. So if one has a sharp minimum that does not generalise (which I think we will agree exists) then one can make the same function into a flat minimum, which will still not generalise as it is the same function! Sorry I’m 2 years late to the party...

DavidHolmes Sep 3, 2021, 12:09 PM
1 point
in reply to: frontier64’s comment on: Gravity Turn

if we gave research grants to smart and personable university graduates and gave them carte blanche to do with the money what they wished that would work just as well as the current system

This thought is not unique to you; see e.g. the French CNRS system. My impression is that it works kind of as you would expect; a lot of them go on to do solid work, some do great work, and a few stop working after a couple of years. Of course we cannot really know how things would have turned out if the same people had been given more conventional positions.

DavidHolmes May 14, 2021, 6:27 AM
2 points
in reply to: ChristianKl’s comment on: Academia as Company Hierarchy
The request for elaboration concerned how the experience described related to the LCS hierarchy described in the post, which was (and remains) very unclear to me.

DavidHolmes May 14, 2021, 6:24 AM
4 points
in reply to: eukaryote’s comment on: There’s no such thing as a tree (phylogenetically)
Definitely the antagonistic bits—I enjoyed the casual style! Really just the line ‘ Sit down. Sit down. Shut up. Listen. You don’t know nothing yet’ I found quite off-putting—even though in hindsight you were correct!

DavidHolmes May 13, 2021, 7:08 AM
1 point
in reply to: jaspax’s comment on: Academia as Company Hierarchy
Thanks! I thought it might be, but was unsure, and didn’t want to make an awkward situation for the OP in case it was something very different...

DavidHolmes May 13, 2021, 6:51 AM
10 points
on: There’s no such thing as a tree (phylogenetically)
I really liked the content, but I found some of the style (‘Sit down!’ etc.) really off-putting, which is why I only actually read the post on my 3rd attempt. Obviously you’re welcome to write in whatever style you want, and probably lots of other people really like it; I just thought it might be useful to mention that a non-empty set of people find it off-putting.

DavidHolmes May 13, 2021, 6:39 AM
1 point
in reply to: Mary Chernyshenko’s comment on: Academia as Company Hierarchy
Can you elaborate on this a bit? I’m sorry to hear that you had a bad experience during fieldwork, though I’m afraid I’m not certain what you refer to by ‘Active Personal Life’. Can you explain how the experience you relate connects to the LCS hierarchy?

DavidHolmes May 13, 2021, 6:33 AM
3 points
on: Academia as Company Hierarchy
I’m sceptical of your decision to treat tenured and non-tenured faculty alike. As tenured faculty, this has long seemed to me to be perhaps the most important distinction.

More generally, what you write here is not very consistent with my own experience of academia (which is in mathematics and in Europe, though I have friends and collaborators in other countries and fields, so I am not totally clueless about how things work there).

Some points I am not seeing in your post are:
1. For many academics, being able to do their own research and work with brilliant students is their primary motivation. Grants etc. are mainly valuable in how they facilitate that. This makes for a confusing situation where ‘losers’ in the original LCS model do the minimum work necessary for their paycheck, whereas ‘losers’ in the academic system (as you seem to be defining them?) do the maximum work that is compatible with their health and personal situation. Not only is this conceptually confusing to me, it also means that, all other things being equal, the more of a ‘loser’ one is in academia the more impressive one’s CV will tend to be. Which is, I think, the opposite of the situation in the conventional LCS hierarchy?
2. The fact that I ‘perform peer review for nothing at all’ apparently makes me clueless. But this is weird; it does not go on my CV, and I do it because I think it is important to the advancement of science. Surely this makes it a ‘loser’ activity?
3. Acceptance of papers and awarding of grants is decided by people external to your university. This makes a huge difference, and I think you miss it by writing ‘So we might analyze this system at the department level, at the university level, or at the all-academia level, but it doesn’t make much of a difference.’
Perhaps the above makes it sound as if I view academia as an organisational utopia; this is far from the case! But I do not think this post does a good job of identifying problems. I think a post analysing moral mazes in academia would be interesting, but I’m not convinced that the LCS hierarchy is an appropriate model, and this attempt to apply it does not seem to me to make useful category distinctions.

DavidHolmes Feb 16, 2021, 10:07 PM
4 points
on: Generalised models as a category

So the set of worlds, $W$ , is the set of functions from $F$ to …

I guess the $F$ should be a $\bar{F}$ ? Also, you don’t seem to define $E$ ; perhaps $E = W$ ?

DavidHolmes Feb 15, 2021, 7:56 AM
8 points
in reply to: weft’s comment on: Your Cheerful Price

I expect most people on LW to be okay being asked their Cheerful Price to have sex with someone.

I find this a surprising assertion. It does not apply to me, probably it does apply to you. Ordinarily I would ask if you had any other data points, but I don’t want to take the conversation in this direction...

DavidHolmes Aug 29, 2019, 7:23 AM
3 points
in reply to: Gordon Seidoh Worley’s comment on: Categorial preferences and utility functions
Sure, in the end we only really care about what comes top, as that’s the thing we choose. My feeling is that information on (relative) strengths of preferences is often available, and when it is available it seems to make sense to use it (e.g. allowing circumvention of Arrow’s theorem).

In particular, I worry that, when we only have ordinal preferences, the outcome of attempts to combine various preferences will depend heavily on how finely we divide up the world; by using information on strengths of preferences we can mitigate this.

DavidHolmes Aug 16, 2019, 8:11 AM
1 point
in reply to: Stuart_Armstrong’s comment on: Toy model piece #1: partial preferences revisited

(actually, my formula doubles the numbers you gave)

Are you sure? Suppose we take $W = W_{1} ⊔ W_{2}$ with $W_{1} = \{A, B, C, D\}$ , $W_{2} = \{X, Y, Z\}$ , then $n_{1} = 3$ , so the values for $W_{1}$ should be $- 3, - 1, 1, 3$ as I gave them. And similarly for $W_{2}$ , giving values $- 2, 0, 2$ . Or else I have misunderstood your definition?

I’d simply see that as two separate partial preferences

Just to be clear, by “separate partial preference” you mean a separate preorder, on a set of objects which may or may not have some overlap with the objects we considered so far? Then somehow the work is just postponed to the point where we try to combine partial preferences?

EDIT (in reply to your edit): I guess e.g. keeping conditions 1,2,3 the same and instead minimising
$g (G) = \sum_{w \leftarrow w^{'}} λ_{w \leftarrow w^{'}} (U (w^{'}) - U (w))^{2},$
where $λ_{w \leftarrow w^{'}} \in \mathbb{R}_{> 0}$ is proportional to the reciprocal of the strength of the preference? Of course there are lots of variants on this!

DavidHolmes Aug 11, 2019, 4:38 PM
1 point
on: Toy model piece #1: partial preferences revisited
This seems really neat, but it seems quite sensitive to how one defines the worlds under consideration, and whether one counts slightly different worlds as actually distinct. Let me try to illustrate this with an example.

Suppose we have a $W$ consisting of 7 worlds, $W = \{A, B, C, D, X, Y, Z\}$ , with preferences
$A < B < C < D, X < Y < Z$ and no other non-trivial preferences. Then (from the ‘sensible case’), I think we get the following utilities:
$A \mapsto - 3$
$X \mapsto - 2$
$B \mapsto - 1$
$Y \mapsto 0$
$C \mapsto 1$
$Z \mapsto 2$
$D \mapsto 3$ .

Suppose now that I create two new copies $X^{'}$ , $X^{''}$ of the world $X$ which each differ by the position of a single atom, so as to give me (extremely weak!) preferences $X^{''} < X^{'} < X$ , so all the non-trivial preferences in the new $W$ are now summarised as $A < B < C < D, X^{''} < X^{'} < X < Y < Z .$

Then the resulting utilities are (I think):
$X^{''} \mapsto - 4$
$A \mapsto - 3$
$X^{'} \mapsto - 2$
$B \mapsto - 1$
$X \mapsto 0$
$C \mapsto 1$
$Y \mapsto 2$
$D \mapsto 3$
$Z \mapsto 4$ .

In particular, before adding in these ‘trivial copies’ we had $U (Z) < U (D)$ , and now we get $U (D) < U (Z)$ . Is this a problem? It depends on the situation, but to me it suggests that, if using this approach, one needs to be careful in how the worlds are specified, and the ‘fine-grainedness’ needs to be roughly the same everywhere.
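
A small Python sketch of the computation above. The rule used (adjacent worlds in a chain differ by 2, and each chain is centred at 0) is an assumption read off from the numbers listed in this comment, not the formula from the original post; `chain_utilities` and `utilities` are hypothetical helper names.

```python
def chain_utilities(chain):
    """Utilities for one totally ordered chain, listed from worst to best.

    Assumed rule (inferred from the numbers in the comment): consecutive
    worlds differ by 2 and the chain is centred at 0.
    """
    n = len(chain)
    return {world: 2 * i - (n - 1) for i, world in enumerate(chain)}


def utilities(chains):
    """Combine disconnected chains, each centred at 0 independently."""
    result = {}
    for chain in chains:
        result.update(chain_utilities(chain))
    return result


# Original worlds: A < B < C < D and X < Y < Z.
before = utilities([["A", "B", "C", "D"], ["X", "Y", "Z"]])

# Same worlds plus two near-copies of X: X'' < X' < X < Y < Z.
after = utilities([["A", "B", "C", "D"], ["X''", "X'", "X", "Y", "Z"]])

print(before["Z"], before["D"])  # 2 3
print(after["Z"], after["D"])    # 4 3
```

Under this assumed rule, the ranking of D against Z flips purely because the second chain got longer, which is the sensitivity to fine-grainedness described above.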

DavidHolmes Aug 11, 2019, 4:31 PM
1 point
in reply to: Stuart_Armstrong’s comment on: Categorial preferences and utility functions
Thanks! I like the way your optimisation problem handles non-closed cycles.

I think I’m less comfortable with how it treats disconnected components: as I understand it, you just translate each separately to have ‘centre of mass’ at 0. If one wants to get a utility function out at the end one has to make some kind of choice in this situation, and the choice you make is probably the best one, so in that sense it seems very good.

But, for example, it seems vulnerable to creating ‘virtual copies’ of worlds in order to shift the centre of mass and push connected components one way or the other. That was what started me thinking about including strength of preference: if one adds to your setup a bunch of virtual copies of a world between which one is ‘almost indifferent’ then it seems it will shift the centre of mass, and thus the utility relative to some other chain. Of course, if one is actually indifferent then the ‘virtual copies’ will be collapsed to a single point in your $\overline{W}$ , but if they are just extremely close then it seems it will affect the utility relative to some other chain. I’ll try to explain this more clearly in a comment to your post.

DavidHolmes Aug 11, 2019, 3:59 PM
1 point
in reply to: Charlie Steiner’s comment on: Categorial preferences and utility functions
Thanks for the comment Charlie.

If I am indifferent to a gamble with a probability $1$ of ice cream, and a probability 0.8 of chocolate cake and 0.2 of going hungry

To check I understand correctly, you mean the agent is indifferent between the gambles (probability $1$ of ice cream) and (probability 0.8 of chocolate cake, probability 0.2 of going hungry)?

If I understand correctly, you’re describing a variant of Von Neumann–Morgenstern where instead of giving preferences among all lotteries, you’re specifying a certain collection of special types of pairs of lotteries between which the agent is indifferent $^{1}$ , together with a sign to say in which ‘direction’ things become preferred? It seems likely to me, then, that the data you give can be used to reconstruct preferences between all lotteries...

If one is given information in the form you propose but only for an ‘incomplete’ set of special triples (cf. ‘weak preferences’ above), then one can again ask whether and in how many ways it can be extended to a complete set of preferences. It feels to me as if there is an extra ambiguity coming in with your description: for example, if the set of possible outcomes has 6 elements and I am given the value of the Betterness function on two disjoint triples, then to generate a utility function I have to not only choose a ‘translation’ between the two triples, but also a scaling. But maybe this is better/more realistic!

$1$ . By `special types’, I mean indifference between pairs of gambles of the form
(probability $1$ of A) vs (probability $p$ of B and probability $(1 - p)$ of C)
for some $0 \leq p \leq 1$ , and possible outcomes A, B, C. Then the sign says that I prefer higher probability of B (say).
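
A sketch of where the extra scaling comes from, assuming the indifference is read in the expected-utility sense (that reading is an assumption made for illustration, not something fixed in the thread). Indifference between (probability $1$ of A) and (probability $p$ of B, probability $(1 - p)$ of C) then says
$U (A) = p \, U (B) + (1 - p) \, U (C) .$
This equation is unchanged if $U$ is replaced on the triple by $a U + b$ with $a > 0$ and $b \in \mathbb{R}$ . For two disjoint triples $T_{1}$ , $T_{2}$ one may therefore choose $(a_{1}, b_{1})$ and $(a_{2}, b_{2})$ independently; after discarding an overall affine change of $U$ , what remains to be chosen is the ratio $a_{2} / a_{1}$ (the scaling) and one relative shift (the translation).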

DavidHolmes Aug 10, 2019, 12:16 AM
3 points
on: Toy model piece #1: partial preferences revisited
Thanks for pointing me to this updated version :-). This seems a really neat trick for writing down a utility function that is compatible with the given preorder. I thought a bit more about when/to what extent such a utility function will be unique, in particular if you are given not only the data of a preorder, but also some information on the strengths of the preferences. This ended up a bit too long for a comment, so I wrote a few things in outline here:
https://www.lesswrong.com/posts/7ncFy84ReMFW7TDG6/categorial-preferences-and-utility-functions
It may be quite irrelevant to what you’re aiming for here, but I thought it was maybe worth writing down just in case.

Categorial preferences and utility functions

DavidHolmes Aug 9, 2019, 9:36 PM
10 points
6 comments, 5 min read

DavidHolmes Aug 9, 2019, 7:48 PM
1 point
in reply to: Stuart_Armstrong’s comment on: Partial preferences and models
Never mind—I had fun thinking about this :-).

DavidHolmes Aug 9, 2019, 3:48 AM
12 points
on: Partial preferences and models
Hi Stuart,
I’m working my way through your ‘Research Agenda v0.9’ post, and am therefore going through various older posts to understand things. I wonder if I could ask some questions about the definition you propose here?
First, that $X$ be contained in $\mathbb{R}^{N}$ for some $N$ seems not so relevant; can I just assume X, Y and Z are some manifolds ( $C^{k}$ for some $0 \leq k \leq \infty$ )? And we are given some partial order $≺$ on X, so that we can refer to ‘being a better world’?
Then, as I understand it, your definition says the following:
Fix X, $≺$ and Z. Let Y be a manifold and $y_{+}$ , $y_{-} \in Y$ . Given a local homeomorphism $+ : Y \times Z \to X$ , we say that $y_{+}$ is partially preferred to $y_{-}$ if for all $z \in Z$ , we have $y_{-} + z ≺ y_{+} + z$ .
I’m not sure which inequalities should be strict, but this seems non-essential for now. On the other hand, the dependence of this definition on the choice of Y seems somewhat subtle and interesting. I will try to illustrate this in what follows.
First, let us make a new definition. Fix X, $≺$ , and Z as before. Let $Y^{'} = \{y_{+}, y_{-}\}$ , a two-element set equipped with the discrete topology, and let $+^{'} : Y^{'} \times Z \to X$ be an immersion of $C^{k}$ -manifolds. We say that $y_{+}$ is weakly partially preferred to $y_{-}$ if for all $z \in Z$ , we have $y_{-} +^{'} z ≺ y_{+} +^{'} z$ .
First, it is clear that partial preference implies weak partial preference. More formally:
Claim 1: Fix X, $≺$ and Z. Suppose we have a manifold Y, points $y_{+}$ , $y_{-} \in Y$ , and a local homeomorphism $+ : Y \times Z \to X$ such that $y_{+}$ is partially preferred to $y_{-}$ . Setting $Y^{'} = \{y_{+}, y_{-}\}$ with the subspace topology from $Y$ (i.e. discrete), and taking $+^{'}$ to be the restriction of $+$ from $Y \times Z$ to $Y^{'} \times Z$ , we have that $y_{+}$ is weakly partially preferred to $y_{-}$ .
Proof: obvious. $\qed$
However, the converse can fail if Z is not contractible. First, let’s prove that the concepts are equivalent for Z contractible:
Claim 2: Fix X, $≺$ and Z, and assume that Z is contractible. Suppose we have a two-element set $Y^{'} = \{y_{+}, y_{-}\}$ and a map $+^{'} : Y^{'} \times Z \to X$ making $y_{+}$ weakly partially preferred to $y_{-}$ . Then there exist a manifold Y, an injection $Y^{'} \to Y$ , and a local homeomorphism $+ : Y \times Z \to X$ whose restriction to $Y^{'} \times Z$ is $+^{'}$ , making $y_{+}$ partially preferred to $y_{-}$ .
Proof: Let’s assume for simplicity of notation that X is equidimensional, say of dimension $d_{X}$ , and write $d_{Z}$ for the dimension of Z. Let Y be the disjoint union of two open balls of dimension $d_{X} - d_{Z}$ , with $Y^{'} \to Y$ the inclusion of the centres of the balls. Then take an $ϵ$ -neighbourhood of Z in X; it is diffeomorphic to $Y \times Z$ since the normal bundle to Z in X is trivialisable (c.f. https://math.stackexchange.com/questions/857784/product-neighborhood-theorem-with-boundary). $\qed$
If we want examples where weak partial preference and partial preference don’t coincide, we should look for an example where Z is not contractible, and its normal bundle in X is not trivialisable.
Example 3: Let X be the disjoint union of two Möbius bands, and let Z be a circle. Note that including Z along the centre of either band gives a submanifold whose tubular neighbourhood is not a product. Assume that $≺$ is such that one component of X is preferred to the other (and $≺$ is indifferent within each connected component). Then take $Y^{'} = \{y_{+}, y_{-}\}$ , and $+^{'} : Y^{'} \times Z \to X$ to be the inclusion of the two circles along the centres of the two Möbius bands, such that $\{y_{+}\} \times Z$ ends up in the preferred band. This yields a situation where $y_{+}$ is weakly partially preferred to $y_{-}$ , but the conclusion of Claim 2 fails, i.e. this cannot be extended to a partial preference for $y_{+}$ over $y_{-}$ .
What conclusion should we draw from this? To me, it suggests that the notion of partial preference is not yet quite as one would want. In the setting of Example 3, where X consists of two Möbius strips, one of which is preferred to the other, surely landing in the preferred strip should be preferred to landing in the un-preferred strip?! And yet the ‘local homeomorphism from a product’ condition gets in the way. This example is obviously quite artificial, and maybe analogous things cannot occur in reality. But I’m not so happy with this as an answer, since our approaches to AI safety should be (so far as possible) robust against the flaws in our understanding of physics.
Apologies for the overly-long comment, and for the imperfect LaTeX (I’ve not used this type of form much before).
