The theorem proved here is that if d : square matrices → numbers does what the determinant does to permutation matrices and is “linear on rows”, then d is precisely the determinant. (Or, equivalently: if d is alternating, linear on rows, and maps the identity to 1, then it is precisely the determinant.)
Since the motivation for the first condition is that you want d(AB) = d(A) d(B), it may be worth pointing out that it’s also true that if d(AB) = d(A) d(B) and d is “linear on rows” then d is either identically 0 or precisely the determinant.
You can’t get this just by saying that if d(AB) = d(A) d(B) holds for permutation matrices then d has to agree with the determinant on those, because that isn’t true: you could have d(A) = 1 whenever A is a permutation matrix. Also, you’d need a proof of something stated but not proved in the OP, namely that the only ways to map permutations to nonzero numbers multiplicatively are the constant 1 and the sign, i.e. what the determinant does. (Though that’s pretty easy.)
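(If you want to see that easy fact checked concretely, here is a tiny brute-force search for n = 3; the code and its names are mine, not anything from the OP. Since every transposition squares to the identity, a multiplicative map with d(identity) = 1 sends each transposition to a square root of 1, so it is enough to search maps into {+1, -1}.)

```python
# Brute-force check for S_3: enumerate all 2^6 maps into {+1, -1}
# and keep the multiplicative ones.  Exactly two survive: the
# constant-1 map and the sign of the permutation.
from itertools import permutations, product

perms = list(permutations(range(3)))      # the 6 elements of S_3

def compose(p, q):
    # (p o q)(i) = p(q(i)), with permutations as tuples
    return tuple(p[q[i]] for i in range(3))

multiplicative = []
for values in product([1, -1], repeat=len(perms)):
    f = dict(zip(perms, values))
    if all(f[compose(p, q)] == f[p] * f[q] for p in perms for q in perms):
        multiplicative.append(f)

print(len(multiplicative))                # prints 2
```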
Anyway, here’s a sketch of how to prove that multiplicativity + row-linearity ⇒ being the determinant.
First, consider the matrix C that you get by starting with the identity and then moving one of the 1s on its diagonal to a different place in its row. Premultiplication by this matrix is the operation of copying one row on top of another. We must have d(C) = 0: for any matrix A we have CA = CB, where B is what you get by replacing the “copied-onto” row of A with all zeros, and d(B) = 0 by row-linearity (that row of B is 0 times anything, so d(B) = 0 · d(something) = 0). Hence d(C) d(A) = d(CA) = d(CB) = d(C) d(B) = 0 whatever A is, so either d maps everything to 0 or d(C) = 0; in either case d(C) = 0.
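(A concrete 3×3 instance, using numpy with the ordinary determinant standing in for d; the specific matrices are just my examples:)

```python
# C is the identity with the 1 in row 0 moved from column 0 to
# column 1, so C @ A copies row 1 of A on top of row 0.
import numpy as np

C = np.eye(3)
C[0, 0], C[0, 1] = 0.0, 1.0        # move the diagonal 1 in row 0

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
B = A.copy()
B[0, :] = 0.0                      # zero out the "copied-onto" row

print(C @ A)                       # row 0 is now a copy of row 1
print(np.allclose(C @ A, C @ B))   # True: CA = CB as claimed
print(np.linalg.det(C))            # 0.0, consistent with d(C) = 0
```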
Any matrix with two equal rows is CA for some A (indeed, if rows i and j of M are equal then M = CM for the C that copies row j onto row i), so any matrix with two equal rows maps to 0.
So now take any matrix A, pick two rows, and write f(u,v) for what you get when you overwrite those two rows of A with u and v respectively and apply d to the result. Row-linearity makes f linear in each argument, so f(u+v, u+v) = f(u,u) + f(u,v) + f(v,u) + f(v,v). But f(w,w) = 0 for any w, because the resulting matrix has two equal rows; so this says f(u,v) + f(v,u) = 0: swapping two rows changes the sign of d.
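(A quick numerical sanity check of the sign flip, again with numpy’s determinant playing the role of d:)

```python
# Swapping two rows of a matrix negates its determinant.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
A_swapped = A[[1, 0, 2, 3], :]     # swap rows 0 and 1

print(np.isclose(np.linalg.det(A_swapped), -np.linalg.det(A)))  # True
```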
Now, d(identity) is its own square (since identity × identity = identity), so it is either 0 or 1. If it’s 0 then d(A) = d(A × identity) = d(A) d(identity) = 0 for every A, so d is identically zero. Otherwise d(identity) = 1; then d(T) = -1 where T is the permutation matrix of any transposition, because T is the identity with two rows swapped. Any permutation is a product of transpositions, so if P is any permutation matrix that’s a product of k transpositions, d(P) = (-1)^k. In other words, d must do to permutation matrices the same thing that the determinant does. And now we can apply the argument in the OP.
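(One last illustration; the transposition_count helper below is an ad-hoc sketch of mine, not anything canonical. It factors a permutation into transpositions by cycle-chasing and then compares the determinant of the corresponding permutation matrix with (-1)^k:)

```python
# det(P) = (-1)^k, where k is the number of transpositions used
# to build the permutation behind P.
import numpy as np

def transposition_count(perm):
    # Cycle-sort: repeatedly swap p[i] into its home position;
    # each swap is one transposition, and k = n - (number of cycles).
    p, k = list(perm), 0
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            k += 1
    return k

perm = (2, 0, 3, 1)                # a sample 4-cycle on {0, 1, 2, 3}
P = np.eye(4)[list(perm)]          # its permutation matrix
k = transposition_count(perm)      # here k = 3

print(np.isclose(np.linalg.det(P), (-1) ** k))   # True
```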