The absolute basics of representation theory of finite groups

This will be an “ML-oriented” introduction to representation theory of finite groups. It is an introductory sketch that orients towards both a language and a result (the “real semisimplicity” result) that is useful for thinking about the subject in an interpretability context.

This is a somewhat low-effort math post (subbing in for a planned post I’m running late on). It isn’t a prerequisite for anything, but will be useful background for a future discussion of how NN’s learn associative group multiplication (a joint project with Louis Jaburi). I’m hoping for this to be readable by people with a very small amount of background in abstract algebra: it is meant to fill in some very basic knowledge gaps. I’d be very interested in comments about what is confusing or difficult about this post, to help focus on the right exposition directions when writing about this in the future.

I’m not going to assume much abstract algebra, other than the notions of a group and an algebra (always assumed to be a unital, associative algebra).

Algebras of note

Recall that all the set of (real-valued) matrices forms an algebra (it’s a vector space with an associative product that distributes across addition). For any field F, there is a notion of a matrix algebra over F and is naturally an algebra over . The set of complex-valued matrices forms an algebra over , but it can also be viewed as an algebra over , in the same way as can be viewed as a two-dimensional vector space.

We will be interested in understanding the “algebra” associated to a finite group G. It turns out that working with algebras over is easier than working with algebras over from a theoretical viewpoint (this is of course because is algebraically closed, i.e., any polynomial equation has a solution. We’ll start out looking at such representations. However unfortunately, neural nets don’t care about theoretical ease: instead they care about compression and efficiency, and this will necessitate also understanding a bit of representation theory over .

For starters, we work over . We’ll need one more useful algebra that will be one of the main characters in our discussion: namely, the group algebra of G, written . This is the algebra whose elements are formal complex linear combinations of elements of , and where product is induced from the product on distributively. We analogously have the real-valued group algebra . When we don’t want to commit to either real or complex coefficients and want to treat them simultaneously, we’ll write for F one of our fields or (a bit of the analysis here carries over to more general fields, though usually one should require the characteristic of the field to be relatively prime to the order of the group for the theory to have comparable niceness).

So for example if is the cyclic group on n elements ( is better notation for the group that is sometimes written or ), then an element of is a sum Here the “bolded” numbers are group elements. Note of course that since we’re viewing it as a group, multiplication in corresponds to addition of numbers: i.e., (This is another reason why is better notation—since could also mean the ring, multiplicative and additive notions can be more confusing.) To see how group multiplication in this algebra works, let’s work out an example with : (Here remember that we’re writing the group multiplicatively, so for example since .)

Representations

Now for the purposes of this note we define an n-dimensional representation of to be a map of algebras from the group algebra to a matrix algebra: Here F can either be the field (“real-valued representations”) or the field (“complex-valued representations”).

Of course by linearity, it’s enough to specify for elements , which are the standard basis for the group algebra: i.e., can be recovered from its restriction . Conversely, any map leads to a map of algebras if it’s

  • Unital, i.e. and

  • A multiplicative homomorphism, i.e., (product of matrices on the RHS).

Note that setting and using the homomrphism property tells us that must always be invertible, i.e., an element of the group of invertible matrices ; this shows that our notion of representation is equivalent to the usual notion of group homomorphism . We say that two representations are equivalent if

  • they have the same dimension (i.e., both n-dimensional for the same n) and

  • they are conjugate: so there exists some other invertible matrix such that, for all , we have the identity (if you haven’t seen expressions like this, convince yourself that the formula on the right defines a new representation of for any fixed invertible .) In this case we think of and to act “via the same symmetries”, just expressed in different bases (the standard basis in the case of and the basis given by in the case of ).

If you’ve ever heard of group representations, you’ve heard of the notion of irreducibility. A representation is irreducible if it cannot be decomposed into smaller components: i.e., if it isn’t conjugate to a block-diagonal representation of the form

for nonzero representations.

The “reducible” representation above (check this is a representation if you don’t believe me!) is denoted .

Now the big secret point of representation theory is that the set of irreducible representations (up to isomorphism) “contain all the algebraic information” about G. This is encompassed by a series of results; an appealing consequence of these results (and a way of condensing much of the information contained in them) is the semisimplicity theorem below.

Let be a collection of representations that contains each irreducible representation exactly once up to conjugacy. (Usually, since conjugacy, i.e., “isomorphism”, simply corresponds to choosing a different basis, people will use “sum of all irreducible representations” to mean this object). Here are the dimensions of the irreducible representations. The classification of (complex) representations of G and the eigenvalues of the matrices is called character theory (something that gets covered in a first course on representation theory, but that we won’t get into here). Character theory in particular implies that the number (k) of distinct irreducible complex representations is equal to the number of conjugacy classes of G, i.e., the number of elements on the set obtained from G by identifying and for any G. If G is commutative, then each conjugacy class just has a single element (each element g above only gets identified with itself), so in the special case of commutative groups (also known as “abelian” groups), the number of conjugacy classes is equal to the number of elements, i.e., .

Theorem (semisimplicity of the group algebra):

be the direct sum of all irreducible representations of G. Then the map

is injective, and identifies with the set of all block diagonal matrices

(Here the are complex-valued matrices). In other words if we only care about the algebra , then all of its structure is fully determined by the (unordered) collection of irreducible representation dimensions . Since there are a lot of groups, this implies that there are some pairs of distinct groups G, H that have isomorphic matrix algebras! I.e., while the groups cannot be identified with each other, every element of G can be identified with a linear combination of elements of H in a way that respects multiplication and admits an inverse (once again linear combination-valued): indeed, if the representation dimensions of G and H coincide, then as algebras, we can identify both group algebras with the same algebra of block-diagonal matrices. A simple example of a pair of distinct groups with isomorphic group algebras are the groups and [1].

Real reprsentation theory, and “representation type”.

But once again, if we want to understand how NN’s process groups, we need to understand the “more efficient” representation theory over . Here we can start our analysis by looking at the same picture as before: namely, while we have no guarantee that the matrix coefficients of a matrix in a complex representation has real coefficients, we do have a real subring , namely real-valued formal linear combinations of group elements viewed as a subring of complex-valued ones. Since each complex linear combination can be uniquely identified with a unique block-diagonal matrix we want to understand what kinds of subrings we can obtain. This must be a “decomplexification” (more standardly called “real form”) of , a subring which has the following properties:

  • is closed under multiplication by real scalars (but not necessarily complex scalars),

  • has half the dimension of the real vector space .

  • each block-diagonal matrix can be expressed as for in .

Of course, one “real form” is the set of block-diagonal matrices as above with real coefficients, and sometimes this does correctly capture the image of the real group algebra (this is the case, for example, for the symmetric group ). However, there exist other possibilities for the real form. The property of being “real” (i.e., having real coefficients as a formal linear combination of group elements) as a group algebra vector may not be compatible (even after allowing a fixed conjugation) with the property of being “real” (i.e., having real coefficients as a matrix) as a block-diagonal matrix. In other words, the group G (and its real group algebra) “chooses” a specific real form of the algebra of block diagonal matrices.

Such real forms can be fully classified, and there are exactly three things that can happen to some block

  1. It gets replaced by (a conjugate of) Irreducible representations for which this happens for the corresponding block are said to be of “real type” (note that being of real type is a property of a complex-valued representation!).

  2. It gets combined with another block and (again up to conjugating by some invertible matrix ), a block-diagonal matrix spanning these two blocks is “real” if and only if A and B are (coordinatewise) complex conjugates: i.e., coordinatewise, . Pairs of representations for which this happens are said to be of “complex type”, and they always satisfy (after possibly replacing one of by a conjugate).

  3. The dimension n of the corresponding block is even and, up to conjugation, the matrices in the real form associated to this block can be split into blocks:

    where each above is a complex-valued quaternion matrix, i.e., a matrix of the form

    = a1 + bi + ci + dj, where are real numbers and 1 = is the usual unit matrix, and we have

    i = ,

    j = and

    k =

    (These matrices are known in physics as the Pauli matrices.)

    The resulting 4-dimensional real subalgebra of is called the quaternion algebra, denoted . It has the wonderful property of being the only (finite-dimensional) division algebra over (i.e. algebra all of whose nonzero elements are invertible) except for and itself, and the only non-commutative one. Representations whose associated block exhibits this behavior are said to be of “quaternionic type”.

Together, these results imply that whereas the complex group algebra is always (isomorphic to) a direct sum of matrix algebras (this expression, with the symbol denoting “direct sum”, is the standard mathematical notation for the block-diagonal algebra), the real group algebra is isomorphic to a direct sum of matrix algebras of three forms:

  1. Real matrix algebras (each is associated to a single complex representation of dimension n).

  2. Complex matrix algebras (but viewed as an algebra over ). These components of the real-valued group algebra “combine” two complex representations, so there are half as many such components as there are irreducible (complex-valued) representations of complex type.

  3. Quaternionic matrix algebras (each is associated to a single complex representation of dimension 2n).

Examples

We conclude by giving some explicit examples of the comparison between the group algebra the block-diagonal matrix algebra associated to irreducible representations.

Complex representations of cyclic groups

Let be the cyclic group on n elements. In this case it turns out each irreducible representation of G is one-dimensional and there are n of these. By our semisimplicity result above, this is equivalent to the fact that the group algebra is isomorphic to the algebra of diagonal complex matrices. To see this it is enough to construct an isomorphism.

Let be the elementary nth root of unity. For any residue \[a\in C_n\], we write

Here in order to make sense of where a is a residue (rather than an integer), we’re implicitly using the fact that (and so exponents of only depend on their residue mod n). The mapping above is obviously a homomorphism (i.e., compatible with multiplicaiton). It’s a standard fact that the above diagonal matrices are linearly independent as a ranges over the group elements. Since both the algebra of diagonal matrices is n-dimensional, these n linearly independent vectors must form a basis; since we have identified a basis of the group algebra with a basis of the block-diagonal matrix algebra, this is an isomorphism. If we look at the diagonal matrix associated to the linear combination its kth element is Looking at this formula, we see that we have rediscovered the finite Fourier transform: indeed, if we view the coefficients as a complex-valued function on \the group elements , then the (k,k)th matrix coefficient of the associated diagonal matrix is the Fourier transform . The homomorphism property in this case is equivalent to observing that the convolution product (which is equivalent to the group algebra product—checking this is a nice exercise) maps to the elementwise product after Fourier transform (and of course multiplying diagonal matrices multiplies their coefficients elementwise).

Real representations of cyclic groups

Here there is a difference between even and odd n. The results of the previous section tell us that there are n different complex one-dimensional representations (with the kth representation corresponding to taking an element of the group algebra to the kth diagonal element). A one-dimensional representation just takes each element of to an (invertible) element and is given by the formula For all n, the representation is of real type: indeed, it takes every element of G to 1, which is a real-valued matrix. If n is even, then the representation is also real-valued, as it takes each element a to . And whether n is even or odd, all other are of complex type: we can check this simply by seeing that is not isomorphic to the complex conjugate representation . (Note since two matrix representations are isomorphic if and only if they’re conjugate by some fixed matrix , any 1-dimensional representation will be conjugate only to itself. Thus it’s enough to observe that for , to see that is of complex type. Thus semisimplicity in the real case tells us that the group algebra is isomorphic

  • if n is odd and

  • if n is even.

(Here an “exponent” d is the direct sum expression is shorthand for taking the direct sum of d copies of the same component).

The group of elementary quaternions

Unspurprisingly (since the quaternions are a noncommutative division ring), no commutative group has representations of quaternionic type. The smallest group with a representation of quaternionic type is the group of elementary quaternions, with the 8 elements 1, −1, i, -i, j, -j, k, -k. We can identify each of these elements with the corresponding “Pauli” matrix above, and then they form a group with identity 1 and the following multiplication table:

  • (-1) for one of the symbols 1, i, j, k;

  • i^2 = j^2 = k^2 = −1;

  • ij = k, ji = -k, jk = i, kj = -i, ki = j, ik = -j.

This group has one 2-dimensional complex representation of quaternionic type, thus giving a map (remember that the quaternionic dimension is half the complex dimension). This is the obvious map, taking each element of the group to the corresponding quaternion; it also has 4 different 1-dimensional real representations: one is (as for any group) the trivial representation, taking every to 1, and the other three factoring through one of three surjective homomorphisms which ignore sign and take 2 of the three letters i, j, k to −1. There are no other representations, so we see that the real group algebra is isomorphic to the direct sum In particular, has no representations of complex type.

  1. ^

    In fact, these two algebras have the same group algebras over , which is a stronger result. Similarly, the Klein 4-group has the same complex group algebra as the cyclic group , but these have nonisomorphic group algebras over .