Following in the path of So8res and others, I’ve decided to work my way through the textbooks on the MIRI Research Guide. I’ve been working my way through the guide since last October, but this is my first review. I plan on following up this review with reviews of Enderton’s A Mathematical Introduction to Logicand Sipser’s Introduction to the Theory of Computation. Hopefully these reviews will be of some use to you.
Discrete Mathematics and Its Applications
Discrete Mathematics and Its Applications is wonderful, gentle introduction to the math needed to understand most of the other books on the MIRI course list. It successfully pulls off a colloquial tone of voice. It spends a lot of time motivating concepts; it also contains a lot of interesting trivia and short biographies of famous mathematicians and computer scientists (which the textbook calls “links”). Additionally, the book provides a lot of examples for each of its theorems and topics. It also fleshes out the key subjects (counting, proofs, graphs, etc.) while also providing a high level overview of their applications. These combine to make it an excellent first textbook for learning discrete mathematics.
However, for much the same reasons, I would not recommend it nearly as much if you’ve taken a discrete math course. People who’ve participated in math competitions at the high school level probably won’t get much out of the textbooks either. Even though I went in with only the discrete math I did in high school, I still got quite frustrated at times because of how long the book would take to get to the point. Discrete Mathematics is intended to be quite introductory and it succeeds in this goal, but it probably won’t be very suitable as anything other than review for readers beyond the introductory level. The sole exception is the last chapter (on models of computation), but I recommend picking up a more comprehensive overview from Sipser’s Theory of Computation instead.
I still highly recommend it for those not familiar with the topics covered in the book. I’ve summarized the contents of the textbook below:
This chapter introduces propositional (sentential) logic, predicate logic, and proof theory at a very introductory level. It starts by introducing the propositions of propositional logic (!), then goes on to introduce applications of propositional logic, such as logic puzzles and logic circuits. It then goes on to introduce the idea of logical equivalence between sentences of propositional logic, before introducing quantifiers and predicate logic and its rules of inference. It then ends by talking about the different kinds of proofs one is likely to encounter – direct proofs via repeated modus ponens, proofs by contradiction, proof by cases, and constructive and non-constructive existence proofs.
This chapter illustrates exactly why this book is excellent as an introductory text. It doesn’t just introduce the terms, theorems, and definitions; it motivates them by giving applications. For example, it explains the need for predicate logic by pointing out that there are inferences that can’t be drawn using only propositional logic. Additionally, it also explains the common pitfalls for the different proof methods that it introduces.
Basic Structures: Sets, Functions, Sequences, Sums, and Matrices
This chapter introduces the different objects one is likely to encounter in discrete mathematics. Most of it seemed pretty standard, with the following exceptions: functions are introduced without reference to relations; the “cardinality of sets” section provides a high level overview of a lot of set theory; and the matrices section introduces zero-one matrices, which are used in the chapters on relations and graphs.
Algorithms
This chapter presents … surprise, surprise… algorithms! It starts by introducing the notion of algorithms, and gives a few examples of simple algorithms. It then spends a page introducing the halting problem and showing its undecidability. (!) Afterwards, it introduces big-o, big-omega, and big-theta notation and then gives a (very informal) treatment of a portion of computation complexity theory. It’s quite unusual to see algorithms being dealt with so early into a discrete math course, but it’s quite important because the author starts providing examples of algorithms in almost every chapter after this one.
Number Theory and Cryptography
This section goes from simple modular arithmetic (3 divides 12!) to RSA, which I found extremely impressive. (Admittedly, I’ve only ever read one other discrete math textbook.) After introducing the notion of divisibility, the textbook takes the reader on a rapid tour through base-n notation, the fundamental theorem of arithmetic, the infinitude of primes, the Euclidean GCD algorithm, Bezout’s theorem, the Chinese remainder theorem, Fermat’s little theorem, and other key results of number theory. It then gives several applications of number theory: hash functions, pseudorandom numbers, check digits, and cryptography. The last of these gets its own section, and the book spends a large amount of it introducing RSA and its applications.
Induction and Recursion
This chapter introduces mathematical induction and recursion, two extremely important concepts in computer science. Proofs by mathematical induction, basically, are proofs that show that a property is true of the first natural number (positive integer in this book), and if it is true of an integer k it is true of k+1. With these two results, we can conclude that the property is true of all natural numbers (positive integers). The book then goes on to introduce strong induction and recursively defined functions and sets. From this, the book then goes on to introduce the concept of structural induction, which is a generalization of induction to work on recursively-defined sets. Then, the book introduces recursive algorithms, most notably the merge sort, before giving a high level overview of program verification techniques.
Counting
The book now changes subjects to talk about basic counting techniques, such as the product rule and the sum rule, before (interestingly) moving on to the pigeonhole principle. It then moves on to permutations and combinations, while introducing the notion of combinatorial proof, which is when we show that two sides of the identity count the same things but in different ways, or that there exists a bijection between the sets being counted on either side. The textbook then introduces binomial coefficients, Pascal’s triangle, and permutations/combinations with repetition. Finally, it gives algorithms that generate all the permutations and combinations of a set of n objects.
Compared to other sections, I feel that a higher proportion of readers would be familiar with the results of this chapter and the one on discrete probability that follows it. Other than the last section, which I found quite interesting but not particularly useful, I felt like I barely got anything from the chapter.
Discrete Probability
In this section the book covers probability, a topic that most of LessWrong should be quite familiar with. Like most introductory textbooks, it begins by introducing the notion of sample spaces and events as sets, before defining probability of an event E as the ratio of the cardinality of E to the cardinality of S. We are then introduced to other key concepts in probability theory: conditional probabilities, independence, and random variables, for example. The textbook takes care to flesh out this section with a discussion about the Birthday Problem and Monte Carlo algorithms. Afterwards, we are treated to a section on Bayes theorem, with the canonical example of disease testing for rare diseases and a less-canonical-but-still-used-quite-a-lot example of Naïve Bayes spam filters. The chapter concludes by introducing the expected value and variances of random variables, as well as a lot of key results (linearity of expectations and Chebyshev’s Inequality, to list two). Again, aside from the applications, most of this stuff is quite basic.
Advanced Counting Techniques
This chapter, though titled “advanced counting techniques”, is really just about recurrences and the principle of inclusion-exclusion. As you can tell by the length of this section, I found this chapter quite helpful nevertheless.
We begin by giving three applications of recurrences: Fibonacci’s “rabbit problem”, the Tower of Hanoi, and dynamic programming. We’re then shown how to solve linear homogenous relations, which are relations of the form
an = c1 an-1 + c2 an-2 + … + ck an-k+ F(n)
Where c1, c2, …, ck are constants, ck =/= 0, and F(n) is a function of n. The solutions are quite beautiful, and if you’re not familiar with them I recommend looking them up. Afterwards, we’re introduced to divide-and-conquer algorithms, which are recursive algorithms that solve smaller and smaller instances of the problem, as well as the master method for solving the recurrences associated with them, which tend to be of the form
f(n) = a f(n/b) + cnd
After these algorithms, we’re introduced to generating functions, which are yet another way of solving recurrences.
Finally, after a long trip through various recurrence-solving methods, the textbook introduces the principle of inclusion-exclusion, which lets us figure out how many elements are in the union of a finite number of finite sets.
Relations
Finally, 7 chapters after the textbook talks about functions, it finally gets to relations. Relations are defined as sets of n-tuples, but the book also gives alternative ways of representing relations, such as matrices and directed graphs for binary relations. We’re then introduced to transitive closures and Warshall’s algorithm for computing the transitive closure of a relation. We conclude with two special types of relations: equivalence relations, which are reflexive, symmetric, and transitive; and partial orderings, which are reflexive, anti-symmetric, and transitive.
Graphs
After being first introduced to directed graphs as a way of representing relations in the previous chapter, we’re given a much more fleshed out treatment in this chapter. A graph is defined as a set of vertices and a set of edges connecting them. Edges can be directed or undirected, and graphs can be simple graphs (with no two edges connecting the same pair of vertices) or multigraphs, which contain multiple edges connecting the same pair of vertices. We’re then given a ton of terminology related to graphs, and a lot of theorems related to these terms. The treatment of graphs is quite advanced for an introductory textbook – it covers Dijkstra’s algorithm for shortest paths, for example, and ends with four coloring. I found this chapter to be a useful review of a lot of graph theory.
Trees
After dealing with graphs, we move on to trees, or connected graphs that don’t have cycles. The textbook gives a lot of examples of applications of trees, such as binary search trees, decision trees, and Huffman coding. We’re then presented with the three ways of traversing a tree – in-order, pre-order, and post-order. Afterwards, we get to the topic of spanning trees of graphs, which are trees that contain every vertex in the graph. Two algorithms are presented for finding spanning trees – depth first search and breadth first search. The chapter ends with a section on minimum spanning trees, which are spanning trees with the least weight. Once again we’re presented with two algorithms for finding minimum spanning trees: Prim’s Algorithm and Kruskal’s algorithm. Having never seen either of these algorithms before, I found this section to be quite interesting, though they are given a more comprehensive treatment in most introductory algorithms textbooks.
Boolean Algebra
This section introduces Boolean algebra, which is basically a set of rules for manipulating elements of the set {0,1}. Why is this useful? Because, as it turns out, Boolean algebra is directly related to circuit design! The textbook first introduces the terminology and rules of Boolean algebra, and then moves on to circuits of logic gates and their relationship with Boolean functions. We conclude with two ways to minimize the complexity of Boolean functions (and thus circuits) – Karnaugh Maps and the Quine-McCluskey Method, which are both quite interesting.
Modeling Computation
This is the chapter of Rosen that I’m pretty sure isn’t covered by most introductory textbooks. In many ways, it’s an extremely condensed version of the first couple chapters of a theory of computation textbook. It covers phase structure grammars, finite state machines, and closes with Turing machines. However, I found this chapter a lot more poorly motivated than the rest of the book, and also that Sipser’s Introduction to the Theory of Computation offers a lot better introduction to these topics.
Who should read this?
If you’re not familiar with discrete mathematics, this is a great book that will get you up to speed on the key concepts, at least to the level where you’ll be able to understand the other textbooks on MIRI’s course list. Of the three textbooks I’m familiar with that cover discrete mathematics, I think that Rosen is hands down the best. I also think it’s quite a “fun” textbook to skim through, even if you’re familiar with some of the topics already.
However, I think that people familiar with the topics probably should look for other books, especially if they are looking for textbooks that are more concise. It might also not be suitable if you’re already really motivated to learn the subject, and just want to jump right in. There are a few topics not normally covered in other discrete math textbooks, but I feel that it’s better to pick up those topics in other textbooks.
What should I read?
In general, the rule for the textbook is: read the sections you’re not familiar with, and skim the sections you are familiar with, just to keep an eye out for cool examples or theorems.
In terms of chapter-by-chapter, chapters 1 and 2 seem like they’ll help if you’re new to mathematics or proofs, but probably can be skipped otherwise. Chapter 3 is pretty good to know in general, though I suspect most people here would find it too easy. Chapters 4 through 12 are what most courses on discrete mathematics seem to cover, and form the bulk of the book – I would recommend skimming them once just to make sure you know them, as they’re also quite important for understanding any serious CS textbook. Chapter 13, on the other hand, seems kind of tacked on, and probably should be picked up in other textbooks.
Final Notes
Of all the books on the MIRI research guide, this is probably the most accessible, but it is by no means a bad book. I’d highly recommend it to anyone who hasn’t had any exposure to discrete mathematics, and I think it’s an important prerequisite for the rest of the books on the MIRI research guide.
Book Review: Discrete Mathematics and Its Applications (MIRI Course List)
Following in the path of So8res and others, I’ve decided to work my way through the textbooks on the MIRI Research Guide. I’ve been working my way through the guide since last October, but this is my first review. I plan on following up this review with reviews of Enderton’s A Mathematical Introduction to Logic and Sipser’s Introduction to the Theory of Computation. Hopefully these reviews will be of some use to you.
Discrete Mathematics and Its Applications
Discrete Mathematics and Its Applications is wonderful, gentle introduction to the math needed to understand most of the other books on the MIRI course list. It successfully pulls off a colloquial tone of voice. It spends a lot of time motivating concepts; it also contains a lot of interesting trivia and short biographies of famous mathematicians and computer scientists (which the textbook calls “links”). Additionally, the book provides a lot of examples for each of its theorems and topics. It also fleshes out the key subjects (counting, proofs, graphs, etc.) while also providing a high level overview of their applications. These combine to make it an excellent first textbook for learning discrete mathematics.
However, for much the same reasons, I would not recommend it nearly as much if you’ve taken a discrete math course. People who’ve participated in math competitions at the high school level probably won’t get much out of the textbooks either. Even though I went in with only the discrete math I did in high school, I still got quite frustrated at times because of how long the book would take to get to the point. Discrete Mathematics is intended to be quite introductory and it succeeds in this goal, but it probably won’t be very suitable as anything other than review for readers beyond the introductory level. The sole exception is the last chapter (on models of computation), but I recommend picking up a more comprehensive overview from Sipser’s Theory of Computation instead.
I still highly recommend it for those not familiar with the topics covered in the book. I’ve summarized the contents of the textbook below:
Contents:
1. The Foundations: Logic and Proofs
2. Basic Structures: Sets, Functions, Sequences, Sums, and Matrices
3. Algorithms
4. Number Theory and Cryptography
5. Induction and Recursion
6. Counting
7. Discrete Probability
8. Advanced Counting Techniques
9. Relations
10. Graphs
11. Trees
12. Boolean Algebra
13. Modeling Computation
The Foundations: Logic and Proofs
This chapter introduces propositional (sentential) logic, predicate logic, and proof theory at a very introductory level. It starts by introducing the propositions of propositional logic (!), then goes on to introduce applications of propositional logic, such as logic puzzles and logic circuits. It then goes on to introduce the idea of logical equivalence between sentences of propositional logic, before introducing quantifiers and predicate logic and its rules of inference. It then ends by talking about the different kinds of proofs one is likely to encounter – direct proofs via repeated modus ponens, proofs by contradiction, proof by cases, and constructive and non-constructive existence proofs.
This chapter illustrates exactly why this book is excellent as an introductory text. It doesn’t just introduce the terms, theorems, and definitions; it motivates them by giving applications. For example, it explains the need for predicate logic by pointing out that there are inferences that can’t be drawn using only propositional logic. Additionally, it also explains the common pitfalls for the different proof methods that it introduces.
Basic Structures: Sets, Functions, Sequences, Sums, and Matrices
This chapter introduces the different objects one is likely to encounter in discrete mathematics. Most of it seemed pretty standard, with the following exceptions: functions are introduced without reference to relations; the “cardinality of sets” section provides a high level overview of a lot of set theory; and the matrices section introduces zero-one matrices, which are used in the chapters on relations and graphs.
Algorithms
This chapter presents … surprise, surprise… algorithms! It starts by introducing the notion of algorithms, and gives a few examples of simple algorithms. It then spends a page introducing the halting problem and showing its undecidability. (!) Afterwards, it introduces big-o, big-omega, and big-theta notation and then gives a (very informal) treatment of a portion of computation complexity theory. It’s quite unusual to see algorithms being dealt with so early into a discrete math course, but it’s quite important because the author starts providing examples of algorithms in almost every chapter after this one.
Number Theory and Cryptography
This section goes from simple modular arithmetic (3 divides 12!) to RSA, which I found extremely impressive. (Admittedly, I’ve only ever read one other discrete math textbook.) After introducing the notion of divisibility, the textbook takes the reader on a rapid tour through base-n notation, the fundamental theorem of arithmetic, the infinitude of primes, the Euclidean GCD algorithm, Bezout’s theorem, the Chinese remainder theorem, Fermat’s little theorem, and other key results of number theory. It then gives several applications of number theory: hash functions, pseudorandom numbers, check digits, and cryptography. The last of these gets its own section, and the book spends a large amount of it introducing RSA and its applications.
Induction and Recursion
This chapter introduces mathematical induction and recursion, two extremely important concepts in computer science. Proofs by mathematical induction, basically, are proofs that show that a property is true of the first natural number (positive integer in this book), and if it is true of an integer k it is true of k+1. With these two results, we can conclude that the property is true of all natural numbers (positive integers). The book then goes on to introduce strong induction and recursively defined functions and sets. From this, the book then goes on to introduce the concept of structural induction, which is a generalization of induction to work on recursively-defined sets. Then, the book introduces recursive algorithms, most notably the merge sort, before giving a high level overview of program verification techniques.
Counting
The book now changes subjects to talk about basic counting techniques, such as the product rule and the sum rule, before (interestingly) moving on to the pigeonhole principle. It then moves on to permutations and combinations, while introducing the notion of combinatorial proof, which is when we show that two sides of the identity count the same things but in different ways, or that there exists a bijection between the sets being counted on either side. The textbook then introduces binomial coefficients, Pascal’s triangle, and permutations/combinations with repetition. Finally, it gives algorithms that generate all the permutations and combinations of a set of n objects.
Compared to other sections, I feel that a higher proportion of readers would be familiar with the results of this chapter and the one on discrete probability that follows it. Other than the last section, which I found quite interesting but not particularly useful, I felt like I barely got anything from the chapter.
Discrete Probability
In this section the book covers probability, a topic that most of LessWrong should be quite familiar with. Like most introductory textbooks, it begins by introducing the notion of sample spaces and events as sets, before defining probability of an event E as the ratio of the cardinality of E to the cardinality of S. We are then introduced to other key concepts in probability theory: conditional probabilities, independence, and random variables, for example. The textbook takes care to flesh out this section with a discussion about the Birthday Problem and Monte Carlo algorithms. Afterwards, we are treated to a section on Bayes theorem, with the canonical example of disease testing for rare diseases and a less-canonical-but-still-used-quite-a-lot example of Naïve Bayes spam filters. The chapter concludes by introducing the expected value and variances of random variables, as well as a lot of key results (linearity of expectations and Chebyshev’s Inequality, to list two). Again, aside from the applications, most of this stuff is quite basic.
Advanced Counting Techniques
This chapter, though titled “advanced counting techniques”, is really just about recurrences and the principle of inclusion-exclusion. As you can tell by the length of this section, I found this chapter quite helpful nevertheless.
We begin by giving three applications of recurrences: Fibonacci’s “rabbit problem”, the Tower of Hanoi, and dynamic programming. We’re then shown how to solve linear homogenous relations, which are relations of the form
an = c1 an-1 + c2 an-2 + … + ck an-k+ F(n)
Where c1, c2, …, ck are constants, ck =/= 0, and F(n) is a function of n. The solutions are quite beautiful, and if you’re not familiar with them I recommend looking them up. Afterwards, we’re introduced to divide-and-conquer algorithms, which are recursive algorithms that solve smaller and smaller instances of the problem, as well as the master method for solving the recurrences associated with them, which tend to be of the form
f(n) = a f(n/b) + cnd
After these algorithms, we’re introduced to generating functions, which are yet another way of solving recurrences.
Finally, after a long trip through various recurrence-solving methods, the textbook introduces the principle of inclusion-exclusion, which lets us figure out how many elements are in the union of a finite number of finite sets.
Relations
Finally, 7 chapters after the textbook talks about functions, it finally gets to relations. Relations are defined as sets of n-tuples, but the book also gives alternative ways of representing relations, such as matrices and directed graphs for binary relations. We’re then introduced to transitive closures and Warshall’s algorithm for computing the transitive closure of a relation. We conclude with two special types of relations: equivalence relations, which are reflexive, symmetric, and transitive; and partial orderings, which are reflexive, anti-symmetric, and transitive.
Graphs
After being first introduced to directed graphs as a way of representing relations in the previous chapter, we’re given a much more fleshed out treatment in this chapter. A graph is defined as a set of vertices and a set of edges connecting them. Edges can be directed or undirected, and graphs can be simple graphs (with no two edges connecting the same pair of vertices) or multigraphs, which contain multiple edges connecting the same pair of vertices. We’re then given a ton of terminology related to graphs, and a lot of theorems related to these terms. The treatment of graphs is quite advanced for an introductory textbook – it covers Dijkstra’s algorithm for shortest paths, for example, and ends with four coloring. I found this chapter to be a useful review of a lot of graph theory.
Trees
After dealing with graphs, we move on to trees, or connected graphs that don’t have cycles. The textbook gives a lot of examples of applications of trees, such as binary search trees, decision trees, and Huffman coding. We’re then presented with the three ways of traversing a tree – in-order, pre-order, and post-order. Afterwards, we get to the topic of spanning trees of graphs, which are trees that contain every vertex in the graph. Two algorithms are presented for finding spanning trees – depth first search and breadth first search. The chapter ends with a section on minimum spanning trees, which are spanning trees with the least weight. Once again we’re presented with two algorithms for finding minimum spanning trees: Prim’s Algorithm and Kruskal’s algorithm. Having never seen either of these algorithms before, I found this section to be quite interesting, though they are given a more comprehensive treatment in most introductory algorithms textbooks.
Boolean Algebra
This section introduces Boolean algebra, which is basically a set of rules for manipulating elements of the set {0,1}. Why is this useful? Because, as it turns out, Boolean algebra is directly related to circuit design! The textbook first introduces the terminology and rules of Boolean algebra, and then moves on to circuits of logic gates and their relationship with Boolean functions. We conclude with two ways to minimize the complexity of Boolean functions (and thus circuits) – Karnaugh Maps and the Quine-McCluskey Method, which are both quite interesting.
Modeling Computation
This is the chapter of Rosen that I’m pretty sure isn’t covered by most introductory textbooks. In many ways, it’s an extremely condensed version of the first couple chapters of a theory of computation textbook. It covers phase structure grammars, finite state machines, and closes with Turing machines. However, I found this chapter a lot more poorly motivated than the rest of the book, and also that Sipser’s Introduction to the Theory of Computation offers a lot better introduction to these topics.
Who should read this?
If you’re not familiar with discrete mathematics, this is a great book that will get you up to speed on the key concepts, at least to the level where you’ll be able to understand the other textbooks on MIRI’s course list. Of the three textbooks I’m familiar with that cover discrete mathematics, I think that Rosen is hands down the best. I also think it’s quite a “fun” textbook to skim through, even if you’re familiar with some of the topics already.
However, I think that people familiar with the topics probably should look for other books, especially if they are looking for textbooks that are more concise. It might also not be suitable if you’re already really motivated to learn the subject, and just want to jump right in. There are a few topics not normally covered in other discrete math textbooks, but I feel that it’s better to pick up those topics in other textbooks.
What should I read?
In general, the rule for the textbook is: read the sections you’re not familiar with, and skim the sections you are familiar with, just to keep an eye out for cool examples or theorems.
In terms of chapter-by-chapter, chapters 1 and 2 seem like they’ll help if you’re new to mathematics or proofs, but probably can be skipped otherwise. Chapter 3 is pretty good to know in general, though I suspect most people here would find it too easy. Chapters 4 through 12 are what most courses on discrete mathematics seem to cover, and form the bulk of the book – I would recommend skimming them once just to make sure you know them, as they’re also quite important for understanding any serious CS textbook. Chapter 13, on the other hand, seems kind of tacked on, and probably should be picked up in other textbooks.
Final Notes
Of all the books on the MIRI research guide, this is probably the most accessible, but it is by no means a bad book. I’d highly recommend it to anyone who hasn’t had any exposure to discrete mathematics, and I think it’s an important prerequisite for the rest of the books on the MIRI research guide.