I’m reviewing the books in the MIRI course list.

It’s been a while since I did a book review. The last book I reviewed was Computation and Logic, which I read in November. After that, I spent a few weeks brushing up on specific topics in preparation for my first MIRI math workshop. I read about half of The Logic of Provability and studied a little topology. I also worked my way through some relevant papers.

After the workshop, I took some time off around the holidays and wrote a bit about my experience. I’m finally back into Study Mode. This week I finished Linear Algebra Done Right, by Sheldon Axler.

I quite enjoyed the book. Linear algebra has far-reaching impact, and while I learned it in college, I was mostly just memorizing passwords. There are a few important concepts in linear algebra that seem prone to poor explanations. Linear Algebra Done Right derives these concepts intuitively. My understanding of Linear Algebra improved drastically as I read this book.

Below, I’ll review the contents of the text before giving a more detailed overview of the book as a whole.

My Background

When reading a review of a textbook, it’s important to know the reviewer’s background. I studied Linear Algebra briefly in college, not in a Linear Algebra course, but as a subsection of a Discrete Mathematics course. I knew about vector spaces, I understood what ‘linearity’ entailed, and I was acquainted with the standard tools of linear algebra. I had a vague idea that matrices encoded multiple linear equations at the same time. I could solve problems mechanically using the usual tools, but I didn’t fully understand them.

The book was an easy read (as I already knew the answers), but was still quite informative (as I didn’t yet understand them).

Vector Spaces
Finite Dimensional Vector Spaces
Linear Maps
Polynomials
Eigenvectors and Eigenvalues
Inner-Product Spaces
Operators on Inner-Product Spaces
Operators on Complex Vector Spaces
Operators on Real Vector Spaces
Trace and Determinant

Vector Spaces

This chapter briefly explains complex numbers, vectors (providing both geometric and non-geometric interpretations), and vector spaces. It’s short, and it’s a nice review. If you don’t know what vector spaces are, this is as good a way to learn as any. Even if you are already familiar with vector spaces, this chapter is worth a skim to learn the specific notation used in this particular book.

Finite Dimensional Vector Spaces

This chapter covers span, linear independence, bases, and dimension. There are some interesting results here if you’re new to the field (for example, any two vectors which are not scalar multiples of each other, no matter how close they are to each other, span an entire plane). Mostly, though, this section is about generalizing the basic geometric intuition for vector spaces into more-than-three dimensions.

Again, this chapter is probably a good introduction for people who have never seen Linear Algebra before, but it doesn’t cover much that is counter-intuitive.

Linear Maps

This is the first chapter that started explaining things in ways I hadn’t heard before. It covers linear maps from one vector space to another. It spends some time discussing null spaces (the vector spaces which a linear map maps to zero) and ranges (the subspace of the target space that the map maps the source space onto). These are fairly simple concepts that turn out to be far more useful than I anticipated when it comes to building up intuition about linear maps.

Next, matrices are explored. Given a linear map and a basis for each of the source and the target spaces, we can completely describe the linear map by seeing what it does to each basis vector. No matter how complicated the map is, no matter what gymnastics it is doing, linearity guarantees that we can always fully capture the behavior by pre-computing only what it does to the basis vectors. This pre-computation, ignoring the ‘actual function’ and seeing what it does to two specific bases, is precisely a matrix of the linear map (with respect to those bases).

That was a neat realization. The chapter then covers surjectivity, injectivity, and invertability, none of which are particularly surprising.

Polynomials

This chapter is spent exploring polynomials in both real and complex fields. This was a nice refresher, as polynomials become quite important (unexpectedly so) later on in the book.

Eigenvectors and Eigenvalues

The book now turns to operators (linear maps from a vector space onto itself), and starts analyzing their structure. It introduces invariant subspaces (spaces which, under the operator, map to some scalar multiple of themselves) and more specifically eigenvectors (one-dimensional invariant subspaces) and eigenvalues (the corresponding scalar multiples).

Interestingly, if T is an operator with eigenvalue λ then the null space of (T - λI) includes the corresponding eigenvector. This seemingly simple fact turns out to have far-reaching implications. This leads to our first taste of applying polynomials to operators, which is a portent of things to come.

The chapter then discusses some “well-behaved” operator/basis combinations and methods for finding bases under which operators are well behaved. This leads to upper-triangular and diagonal matrices.

The chapter concludes with a discussion of invariant subspaces on real vector spaces. This introduces some technical difficulties that arise in real spaces (namely stemming from the fact that not every polynomial has real roots). The methods for dealing with real spaces introduced here are repeated frequently in different contexts for the remainder of the book.

Inner-Product Spaces

This chapter introduces inner products, which essentially allows us to start measuring the ‘size’ of vectors (for varying definitions of ‘size’). This leads to a discussion of orthonormal bases (orthogonal bases where each basis vector has norm 1). Some of the very nice properties offered by orthogonal / orthonormal bases are then explored.

The chapter moves on to discuss linear functionals (linear maps onto a one-dimensional space) and adjoints. The adjoint of an operator is analogous to the complex conjugate of a complex number. Adjoints aren’t very well motivated in this chapter, but they allow us to discuss self-adjoint operators in the following chapter.

Operators on Inner-Product Spaces

This chapter opens with self-adjoint operators, which are essentially operators T such that the inner product of Tv with w is the same as the inner product of v with Tw. To continue with the analogy above, self-adjoint operators (which are “equal to their conjugate”) are analogous to real numbers.

Self-adjoint operators are generalized to normal operators, which merely commute with their adjoints. This sets us up for the spectral theorem, which allows us to prove some nice properties exclusive to normal operators on inner-product spaces. (Additional work is required for real spaces, as expected.)

The chapter moves on to positive operators, which should really be called non-negative operators, which essentially don’t turn any vectors around (with respect to the inner product) — specifically these are operators T such that ⟨Tv, v⟩ ≥ 0. This allows us to start talking about square roots of operators, which are always positive.

This is followed by isometries, which are operators that preserve norms (norm(Sv) = norm(v)).

With these two concepts in hand, the chapter shows that every operator on an inner product space can be decomposed into an isometry and a positive operator (which is the square of the square root of the original operator). Intuitively, this shows that every operator on an inner product space can be thought of as one positive operation (no turning vectors around) followed by one isometric operation (no changing lengths). This is called a polar decomposition, for obvious reasons.

The chapter concludes by showing that the polar decomposition leads to a singular value decomposition, which essentially states that every operator on an inner product space has a diagonal matrix with respect to two orthonormal bases. This is pretty powerful.

Operators on Complex Vector Spaces

The chapter begins by introducing generalized eigenvectors. Essentially, not every operator has as many eigenvectors as it has dimensions. However, in such operators, there will be some eigenvalues that are “repeated”: (T - λ)^2 maps additional subspaces to zero (that T - λ did not). More generally, we can assign a ‘multiplicity’ to each eigenvalue counting the maximum dimension of the null space of (T - λ)^j. Counting multiplicity, each operator on space V has as many eigenvalues as the dimension of V.

This allows us to characterize every operator via a unique polynomial p(T) = (T - λ_1)…(T - λ_n) such that p(T) = 0. In other words, we can think of T as a root of this polynomial, which is called the characteristic polynomial of T. This polynomial can tell us much about an operator (as it describes both the eigenvalues of the operator and their multiplicities).

We can also find a minimal polynomial for T, which is the monic polynomial q of minimal degree such that q(T) = 0. If the degree of q is equal to the dimension of the space that T operates on, then the minimal polynomial is the same as the characteristic polynomial. The minimal polynomial of an operator is also useful for analyzing the operator’s structure.

The chapter concludes by introducing Jordan form, a particularly ‘nice’ version of upper-triangular form available to ‘nilpotent’ operators (operators N such that N^p = 0 for some p).

Operators on Real Vector Spaces

This chapter derives characteristic polynomials for operators on real spaces. It’s essentially a repeat of the corresponding section of chapter 8, but with extra machinery to deal with real vector spaces. Similar (although predictably weaker) results are achieved.

Trace and Determinant

This chapter explains trace (the sum of all eigenvalues, counting multiplicity) and determinant (the product of all eigenvalues, counting multiplicity), which you may also recognize as the second and final coefficients of the characteristic polynomial, respectively. These values can tell you a fair bit about an operator (as they’re tightly related to the eigenvalues), and it turns out you can calculate them even when you can’t figure out the individual eigenvalues precisely.

The book then fleshes out methods for calculating trace and determinant from arbitrary matrices. It tries to motivate the standard method of calculating determinants from arbitrary matrices, but this involves a number of arbitrary leaps and feels very much like an accident of history. After this explanation, I understand better what a determinant is, and it is no longer surprising to me that my university courses had trouble motivating them.

The book concludes by exhibiting some situations where knowing only the determinant of an operator is actually useful.

Who should read this?

This book did a far better job of introducing the main concepts of linear algebra to me than did my Discrete Mathematics course. I came away with a vastly improved intuition for why the standard tools of linear algebra actually work.

I can personally attest that Linear Algebra Done Right is a great way to un-memorize passwords and build up that intuition. If you know how to compute a determinant but you have no idea what it means, then I recommend giving this book a shot.

I imagine that Linear Algebra Done Right would also be a good introduction for someone who hasn’t done any linear algebra at all.

What should I read?

Chapters 1, 2, and 4 are pretty simple and probably review. They are well-written, short, and at least worth a skim. I recommend reading them unless you feel you know the subjects very well already.

Chapters 3, 5, and 6 were very helpful, and really helped me build up my intuition for linear algebra. If you’re trying to build an intuition for linear algebra, read these three chapters closely and do the exercises (chapter 5 especially, it’s probably the most important chapter).

Chapters 7 and 8 also introduced some very helpful concepts, though things got fairly technical. They build up some good intuition, but they also require some grinding. I recommend reading these chapters and understanding all the concepts therein, but they’re pretty heavy on the exercises, which you can probably skip without much trouble. Chapter 8 is where the book deviates most from the “standard” way of teaching linear algebra, and might be worth a close read for that reason alone.

Chapter 9 is largely just a repeat of chapter 8 but on real spaces (which introduces some additional complexity — har har): none of it is surprising, most of it is mechanical, and I recommend skimming it and skipping the exercises unless you really want the practice.

Chapter 10 is dedicated to explaining some specific memorized passwords, and does a decent job of it. The sections about trace and determinants of an operator are very useful. The corresponding sections about traces and determinants of matrices are eye-opening from the perspective of someone coming from the “standard” classes, and are probably somewhat surprising from a newcomer’s perspective as well. (You can torture an impressive amount of information out of a matrix.) However, a good half of chapter 10 is an artifact of the “standard” way of teaching linear algebra: in my opinion, it’s sufficient to understand what trace and determinants are and know that they can be calculated from matrices in general, without worrying too much about the specific incantations. I’d skim most of this chapter and skip the exercises.

Closing Notes

This book is well-written. It has minimal typos and isn’t afraid to spend a few paragraphs building intuition, but it largely gets to the point and doesn’t waste your time. It’s a fairly quick read, it’s well-paced, and it never feels too difficult.

I feel this book deserves its place on the course list. Linear algebra is prevalent throughout mathematics, and Linear Algebra Done Right provides a solid overview.

Book Review: Linear Algebra Done Right (MIRI course list)