Fixed Point Discussion
Warning: This post contains some important spoilers for Topological Fixed Point Exercises, Diagonalization Fixed Point Exercises, and Iteration Fixed Point Exercises. If you plan to even try the exercises, reading this post will significantly reduce the value you can get from doing them.
Core Ideas
A fixed point of a function is an input such that . Fixed point theorems show that various types of functions must have fixed points, and sometimes give methods for finding those fixed points.
Fixed point theorems come in three flavors: Topological, Diagonal, and Iterative. (I sometimes refer to them by central examples as Brouwer, Lawvere, and Banach, respectively.)
Topological fixed points are non constructive. If is continuous, , and , then we know must have some fixed point between and , since must somewhere transition from being greater than to being less than . This does not tell us where it happens. This can be especially troublesome when there are multiple fixed points, and there is no principled way to choose between them.
Diagonal fixed points are constructed with a weird trick where you feed a code for a function into that function itself. Given a function , if you can construct a function , which on input , interprets as a function, runs on itself, and then runs on the result (i.e. .), then is a fixed point of because . This is not just an example; everything in the cluster looks like this. It is a weird trick, but it is actually very important.
Iterative fixed points can be found through iteration. For example, if , then starting with any value, iterating forever will converge to the unique fixed point .
(There is a fourth cluster in number theory discussed here, but I am leaving it out, since it does not seem relevant to AI, and because I am not sure whether to put it by itself or to tack it onto the topological cluster.)
Topological Fixed Points
Examples of topological fixed point theorems include Sperner’s lemma, the Brouwer fixed point theorem, the Kakutani fixed point theorem, the intermediate value theorem, and the Poincaré-Miranda theorem.
Brouwer is the central example of the cluster. Brouwer states that any continuous function from a compact convex set to itself has a fixed point.
Sperner’s Lemma is a discrete analogue which is used in one proof of Brouwer.
Kakutani is a strengthening of Brouwer to degenerate set valued functions, that look almost like continuous functions.
Poincaré-Miranda is an alternate formulation of Brouwer, which is about finding zeros rather than fixed points.
The Intermediate Value Theorem is a special case of Poincaré-Miranda. To a first approximation, you can think of all of these theorems as one big theorem.
Topological fixed point theorems also have some very large applications. The Kakutani fixed point theorem is used in game theory to show that Nash equilibria exist, and to show that markets have equilibrium prices! Sperner’s lemma is also used in some envy-free fair division results. Brouwer is also used is to show the existence of some differential equations.
In MIRI’s agent foundations work, Kakutani is used to construct probabilistic truth predicates and reflective oracles, and Brouwer is used to construct logical inductors.
These applications all use topological fixed points very directly, and so carry with them most of the philosophical baggage of topological fixed points. For example, while Nash equilibria exist, they are not unique, are computationally hard to find, and feel non-constructive and arbitrary.
Diagonal Fixed Points
Diagonal fixed point theorems are all centered around the basic structure of , where , as mentioned previously.
The pattern is used in many places.
In CS theory, it is used to construct quines and the Y-combinator in lambda calculus, and to prove Rice’s theorem and that the halting problem is undecidable.
In formal logic, it is used to prove the diagonal lemma and important corollaries, like Gödel’s incompleteness theorem, Löb’s theorem, and Tarski’s undefinability theorem. It is used to show the uncountability of the real numbers with Cantor’s Diagonal Argument.
Lawvere’s fixed point theorem is the most general version of the argument, and can be used to show all of the above as a corollary.
In MIRI’s agent foundations work, this shows up in the Löbian obstacle to self-trust, Löbian handshakes in Modal Combat and Bounded Open Source Prisoner’s Dilemma, as well as providing a basic foundation for why an agent reasoning about itself might make sense at all through quines.
Iterative Fixed Points
Iterative fixed point theorems are less of one cluster than the others; I will factor it into two sub-clusters, centered around the Banach fixed point theorem and Tarski fixed point theorem. (Each the same size as the original.)
The Tarski cluster is about fixed points of monotonic functions on (partially) ordered sets found by iteration. Tarski’s fixed point theorem states that any order preserving function on a complete lattice has a fixed point (and further the set of fixed points forms a complete lattice). The least fixed point can be found by starting with the least element and iterating the function transfinitely. This, for example, implies that every monotonic function on from to itself has a fixed point, even if it is not continuous. Kleene’s fixed point theorem strengthens the assumptions of Tarski by adding a form of continuity (and also removes some irrelevant assumptions), which gives us that the least fixed point can be found by iterating the function only ω times. The fixed point lemma for normal functions is similar to Kleene, but with ordinals rather than partial orders. It states that any strictly increasing continuous function on ordinals has arbitrarily large fixed points.
The Banach cluster is about fixed points of contractive functions on metric spaces found by iteration. A contractive function is a function that sends points closer together. A function is contractive if there exists an such that for all . Banach’s fixed point theorem state that any contractive function has a unique fixed point. This fixed point is for any starting point . An application of this to linear functions is that any ergodic stationary Markov chain has a stationary distribution (which is a fixed point of the transition map), which is converged to via iteration. This is also used in showing that correlated equilibria exist and can be found quickly. Banach can also be used to show that gradient descent converges exponentially quickly on a strongly convex function.
Interdisciplinary Nature
I think of Pure Mathematics as divided at the top into 5 subfields: Algebra, Analysis, Topology, Logic, and Combinatorics.
The mapping of the key fixed point theorems discussed in the exercises into these categories is surjective:
Lawvere’s fixed point theorem is Algebra
Banach’s fixed point theorem is Analysis
Brouwer fixed point theorem is Topology
Gödel’s first incompleteness theorem is Logic
Sperner’s lemma is Combinatorics.
On top of that, major applications of fixed point theorems show up in Differential Equations, CS theory, Machine Learning, Game Theory, and Economics.
Tomorrow’s AI Alignment Forum sequences post will be two short posts ‘Approval-directed bootstrapping’ and ‘Humans consulting HCH’ by Paul Christiano in the sequence Iterated Amplification.
The next posting in this sequence will be four posts of Agent Foundations research that use fixed point theorems, on Wednesday 28th November. These will be re-posts of content from the now-defunct Agent Foundations forum, all of whose content is now findable on the AI Alignment Forum (and all old links will soon be re-directed to the AI Alignment Forum).
- Alignment Newsletter #34 by 26 Nov 2018 23:10 UTC; 24 points) (
- 20 Jul 2020 14:28 UTC; 2 points) 's comment on What Would I Do? Self-prediction in Simple Algorithms by (
“(Each the same size as the original.)”
I was not expecting to laugh reading this. Well done—I just wish I hadn’t been in the middle of drinking my coffee.
Quote: “Topological fixed points are non constructive.”
Yeah. This fact is kind of crazy, I still haven’t made my peace with it. For example, there’s a computable continuous function from the unit square to itself without any computable fixed points.