I was having a bit of trouble holding the point of quadratic residues in my mind. I could effortfully recite the definition, give an example, and walk through the broad-strokes steps of proving quadratic reciprocity. But it felt fake and stale and memorized.
Alex Mennen suggested a great way of thinking about it. For an odd prime p, consider the multiplicative group (Z/pZ)^×. This group is abelian and has even order p − 1. Now, consider a primitive root / generator g. By definition, every element of the group can be expressed as g^e. The quadratic residues are exactly those with e even (this is why, for an odd prime, half of the group is square mod p). This also lets us easily see that the subgroup of residues is closed under multiplication by g^2 (which generates it), that two non-residues multiply to make a residue, and that a residue and a non-residue make a non-residue. The Legendre symbol then just tells us, for a = g^e, whether e is even.
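To make this concrete, here is a small sanity check (p = 11 and the primitive root g = 2 are chosen for illustration): the squares mod p are exactly the even powers of g.

```python
# Quadratic residues of an odd prime p as even powers of a primitive root g.
# Example values p = 11, g = 2 are assumptions for illustration.
p = 11
g = 2  # 2 is a primitive root mod 11

# Powers of g enumerate the whole group (Z/pZ)^x.
powers = {e: pow(g, e, p) for e in range(p - 1)}
assert sorted(powers.values()) == list(range(1, p))

# Residues computed directly by squaring...
squares = {pow(a, 2, p) for a in range(1, p)}
# ...match the even powers of g.
even_powers = {powers[e] for e in range(0, p - 1, 2)}
assert squares == even_powers
print(sorted(squares))  # → [1, 3, 4, 5, 9], i.e. (p-1)/2 = 5 elements
```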
Now, consider composite numbers n whose prime factorization only contains exponents 0 or 1 (i.e. squarefree n). By the fundamental theorem of finite abelian groups and the Chinese remainder theorem, we see that a number is square mod n iff it is square mod every prime factor of n.
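A brute-force check of this claim (the example n = 15 = 3 · 5, squarefree, is an assumption for illustration):

```python
# Claim: for squarefree n, a number is a square mod n iff it is a square
# mod each prime factor of n. Example n = 15 = 3 * 5.
n, factors = 15, [3, 5]

def is_square_mod(a, m):
    # brute force: does some x satisfy x^2 ≡ a (mod m)?
    return any(pow(x, 2, m) == a % m for x in range(m))

squares_direct = {a for a in range(n) if is_square_mod(a, n)}
squares_via_crt = {a for a in range(n)
                   if all(is_square_mod(a, q) for q in factors)}
assert squares_direct == squares_via_crt
print(sorted(squares_direct))  # → [0, 1, 4, 6, 9, 10]
```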
I’m still a little confused about how to think of squares mod p^e.
The theorem: for k relatively prime to an odd prime p and n < e, k·p^n is a square mod p^e iff k is a square mod p and n is even.
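Before proving anything, the statement can be checked exhaustively for a small case (p = 3, e = 4 are assumed for illustration):

```python
# Exhaustive check: for k coprime to p and n < e, k * p^n is a square
# mod p^e iff k is a square mod p and n is even. Small example p = 3, e = 4.
p, e = 3, 4
mod = p ** e
squares = {pow(x, 2, mod) for x in range(mod)}

for n in range(e):
    for k in range(1, mod):
        if k % p == 0:
            continue  # k must be coprime to p
        k_square_mod_p = any(pow(x, 2, p) == k % p for x in range(p))
        claim = (n % 2 == 0) and k_square_mod_p
        assert ((k * p ** n) % mod in squares) == claim
```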
The real meat of the theorem is the n = 0 case (i.e. a square mod p that isn’t a multiple of p is also a square mod p^e). Deriving the general case from there should be fairly straightforward, so let’s focus on this special case.
Why is it true? This question has a surprising answer: Newton’s method for finding roots of functions. Specifically, we want to find a root of f(x) := x^2 − k, except in Z/p^eZ instead of R.
To adapt Newton’s method to work in this situation, we’ll need the p-adic absolute value on Z: |k·p^n|_p := p^(−n) for k relatively prime to p (and |0|_p := 0). This has lots of properties that you should expect of an “absolute value”: it’s positive (|x|_p ≥ 0, with equality only when x = 0), multiplicative (|xy|_p = |x|_p·|y|_p), symmetric (|−x|_p = |x|_p), and satisfies a triangle inequality (|x+y|_p ≤ |x|_p + |y|_p; in fact, we get more in this case: |x+y|_p ≤ max(|x|_p, |y|_p)). Because of positivity, symmetry, and the triangle inequality, the p-adic absolute value induces a metric (in fact an ultrametric, because of the strong version of the triangle inequality): d(x, y) := |x − y|_p. To visualize this distance function, draw p giant circles, and sort the integers into circles based on their value mod p. Then draw p smaller circles inside each of those giant circles, and sort the integers in each big circle into the smaller circles based on their value mod p^2. Then draw p even smaller circles inside each of those, and sort based on value mod p^3, and so on. The distance between two numbers corresponds to the size of the smallest circle encompassing both of them. Note that, in this metric, the sequence 1, p, p^2, p^3, ... converges to 0.
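The definition translates directly into code; a minimal sketch (the function name `p_adic_abs` is my own):

```python
from fractions import Fraction

# The p-adic absolute value on Z: |k * p^n|_p = p^(-n) for k coprime to p,
# and |0|_p = 0. Exact arithmetic via Fraction avoids float rounding.
def p_adic_abs(x, p):
    if x == 0:
        return Fraction(0)
    n = 0
    while x % p == 0:
        x //= p
        n += 1
    return Fraction(1, p ** n)

p = 5
# The sequence 1, p, p^2, p^3, ... shrinks toward 0 in this metric:
print([p_adic_abs(p ** i, p) for i in range(4)])  # → [1, 1/5, 1/25, 1/125]
# Strong triangle inequality spot-check: |x + y|_p <= max(|x|_p, |y|_p)
x, y = 25, 10
assert p_adic_abs(x + y, p) <= max(p_adic_abs(x, p), p_adic_abs(y, p))
```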
Now on to Newton’s method: if k is a square mod p, let a be one of its square roots mod p. Then |f(a)|_p ≤ p^(−1); that is, a is somewhat close to being a root of f with respect to the p-adic absolute value. Also, f'(x) = 2x, so |f'(a)|_p = |2a|_p = |2|_p·|a|_p = 1·1 = 1; that is, f is steep near a. This is good, because starting close enough to a root, where the slope is steep enough, is what helps Newton’s method converge; in general, it might bounce around chaotically instead. It turns out that |f(a)|_p < |f'(a)|_p^2 is exactly the right sense of “close enough to a root, with steep enough slope” for Newton’s method to work here.
Now, Newton’s method says that, from a, you should go to a_1 := a − f(a)/f'(a) = a − (a^2 − k)/(2a). 2a is invertible mod p^e, so we can do this. Now here’s the kicker: f(a_1) = (a − (a^2 − k)/(2a))^2 − k = a^2 − 2a·(a^2 − k)/(2a) + ((a^2 − k)/(2a))^2 − k = (a^2 − k)^2/(2a)^2, so |f(a_1)|_p = |a^2 − k|_p^2 / |2a|_p^2 = |a^2 − k|_p^2 < |a^2 − k|_p. That is, a_1 is closer to being a root of f than a is. Now we can just iterate this process until we reach a_i with |f(a_i)|_p ≤ p^(−e), and we’ve found our square root of k mod p^e.
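This iteration is short enough to run directly; a toy sketch (p = 7, e = 5, k = 2 are assumptions for illustration, and `pow(x, -1, m)` is Python's built-in modular inverse):

```python
# Lifting a square root of k mod p to a square root mod p^e via the
# Newton step a -> a - (a^2 - k) / (2a). Toy example: p = 7, e = 5, k = 2.
p, e, k = 7, 5, 2
mod = p ** e

# a square root of k mod p, found by brute force
a = next(x for x in range(1, p) if (x * x - k) % p == 0)

# 2a is invertible mod p^e since p is odd and a is a unit, so the step
# is well-defined; the p-adic valuation of a^2 - k doubles each round.
while (a * a - k) % mod != 0:
    a = (a - (a * a - k) * pow(2 * a, -1, mod)) % mod

assert (a * a) % mod == k % mod
print(a)  # a square root of 2 mod 7^5
```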
Exercise: Do the same thing with cube roots. Then with roots of arbitrary polynomials.
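For the arbitrary-polynomial half of the exercise, one possible sketch (the helper `hensel_lift`, its coefficient convention, and the cube-root example are all my own choices; it assumes f'(a) is a unit mod p):

```python
# The same Newton step a -> a - f(a) * f'(a)^(-1) lifts a root of any
# integer polynomial from mod p to mod p^e, provided f'(a) is a unit mod p.
def hensel_lift(coeffs, a, p, e):
    """coeffs: polynomial coefficients, lowest degree first.
    a: a root of the polynomial mod p. Returns a root mod p^e."""
    mod = p ** e
    f = lambda x: sum(c * x ** i for i, c in enumerate(coeffs)) % mod
    df = lambda x: sum(i * c * x ** (i - 1)
                       for i, c in enumerate(coeffs) if i) % mod
    while f(a) != 0:
        a = (a - f(a) * pow(df(a), -1, mod)) % mod
    return a

# cube root of 6 mod 5^4: f(x) = x^3 - 6, starting from x = 1
# (since 1^3 = 1 ≡ 6 mod 5, and f'(1) = 3 is a unit mod 5)
r = hensel_lift([-6, 0, 0, 1], 1, 5, 4)
assert pow(r, 3, 5 ** 4) == 6
```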
The part about derivatives might have seemed a little odd. After all, you might think, Z is a discrete set, so what does it mean to take derivatives of functions on it? One answer is to just differentiate symbolically using polynomial differentiation rules. But I think a better answer is to remember that we’re using a different metric than usual, and Z isn’t discrete at all! Indeed, for any number k, lim_{n→∞} (k + p^n) = k, so no points are isolated, and we can define differentiation of functions on Z in exactly the usual way, with limits.