In high-school mathematics you've probably come across the term combinatorics. In its most basic form, combinatorics is about counting the number of ways you can arrange some objects in a particular fashion. For example, to arrange n (distinct) objects in a line, there are n!=n(n−1)…2×1 possible permutations. If there are repeated objects, we divide by the permutations of the identical objects. For example arranging the letters of BALLOON in a line, the letters L and O both repeat twice, so there are 2!2!7! permutations. On the other hand, if we wanted to arrange n (distinct) objects in a circle, each arrangement can be rotated and it is considered "identical". So we arrange the items in a line and divide by the number of different rotations of these objects (to remove the overcounted arrangements) n!/n=(n−1)!.
What happens when we combine these ideas together? Say we take take n objects, where some are repeating, and arrange them in a circle. An easy example: arrange the letters AABB in a circle. How many ways can this be done? It's easy to write down every possibility, and we find that there are only two (considering rotation as equivalent):
However if we try to combine the two equations we used above, the number of permutations comes out to be
4×2!2!4!=23=1.5.
Not only is this not 2, but isn't even a whole number! We could try many different examples and attempt formulate a nice equation, but it's not exactly that simple! It turns out there is a nice theorem that gives a general procedure to solve this, but it requires some background in group theory.
tl;dr
In mathematics, Group theory is the study of symmetries, how they interact with each other, and how they interact with other objects. There is a theorem of group that counts the number of "different" elements in a set, given that elements are "not different" when are related by a symmetry (more precisely, a group action). We can apply this to concrete counting problems, by looking at e.g. rotations acting on objects arranged in a circle. This theorem can be extended to extract more information about arrangements, where the calculations are reduced to looking at coefficients of a polynomial.
Background
The following is a quick and brief proof-less introduction to the concepts we will need. The intention is to be a refresher for those who have seen groups before, however, if you are new to these concepts, there should be barely enough to understand (with time) how the counting theorem works.
Groups
Definition. A group is a set G with a binary operation ∗, such that the following axioms hold.
(Identity) There is an element e∈G such that for any a∈G, a∗e=e∗a=a.
(Associativity) For any three elements a,b,c∈G, (a∗b)∗c=a∗(b∗c).
(Inverse) For any element a∈G, there exists an element b∈G for which a∗b=b∗a=e, we write a−1 for this element.
We sometimes drop the ∗, and just write gh=g∗h, if the operation is clear from the context.
Example 1. An easy example of a group is Z/nZ={0,1,...,n−1} where the operation is addition modulo n. This is the cyclic groupCn. We can think of this as generated by 1, for example Z/4Z={1,1+1,1+1+1,1+1+1+1}, where 1+1+1+1=0(mod4).
Example 2. The symmetric group on n elements Sn is a group where elements are the permutations of 1,2,...,n. For two elements a,b∈Sn, ab∈Sn is the permutation obtained by applying b then applying a. For example the following table represents a permutation on the first 4 integers, taking 1→2→3→1 and 4→4.
Input
1
2
3
4
Output
2
3
1
4
A useful way to write this is with cycle notation. The above table corresponds to the element (123)(4)∈S4, where (123) and (4) are cycles. In fact every permutation of n elements can be written as a product of disjoint cycles (i.e. the numbers inside the cycles don't overlap). We often do not write cycles of length 1 because they offer no new information. For convenience, we write the identity permutation (1)(2)...(n) as id.
The counting theorem uses subgroups of Sn, which you can think of as smaller groups living inside Sn. The group Cn, from Example 1, actually appears as a subset of Sn, for example whose elements are powers of (123...n) (generated by this cycle). From now on, this is how we will think about Cn.
Group Actions
Definition. Let S be a set and G a group. A group action of G on S is an operation g⋅s∈S for g∈G and s∈S, satisfying the following properties.
(Identity) If e∈G is the identity element, then e⋅s=s for all s∈S.
(Compatibility) For g,h∈G, g⋅(h⋅s)=(gh)⋅s for all s∈S.
Example. Let G=S4 and let S be the set tuples representing arrangements of the letters a,b,c,d in a line (without repetition). An easy group action could be for g∈S4 and s∈S, g⋅s is permuting entries in the tuple by g. For example if g=(123)(4) and s=(a,b,c,d) then g⋅s=(c,a,b,d), because g take the element in position 1 to position 2, the element in position 2 to position 3, and so on.
Definition. Let G be a group acting on a set S.
Let s∈S. The orbit of s is the set of every possible element in S obtained from G acting on x, that is
G⋅x={g⋅x:g∈G},
which is sometimes written Orb(x). The stabilizer of s is the set of elements of G that fix s, that is
Stab(x)={g∈G:g⋅s=s}.
Let g∈G. The fixed set of g is the set of every element of S that is fixed by g, that is
Sg={s∈S:g⋅s=s},
which is sometimes written Fix(x).
The set of orbits is typically written as S/G={Orb(s):s∈S}.
Example. Let G=C4⊆S4 be the group generated by one 4-cycle (cycle length 4) in S4. Think of C4 as generated by (1234), whose elements are
Element
Cycle Decomposition
(1234)
(1234)
(1234)2
(13)(24)
(1234)3
(1432)
(1234)4
(1)(2)(3)(4)
Let S be the set 4-tuples representing arrangements of the letters a,b,c,d in a line (with repetition). Let C4 act on S as in the previous example. Let s=(a,b,c,c)∈S then the orbit of s is
This theorem by itself can already be used to solve our problem. Remember that we wanted to arrange the letters AABB in a circle. We can take G=C4 to as our group action, so that each arrangement and all its rotations can be summed up as one orbit. We wanted to know how many of these "different" arrangements (or orbits) there are, which Burnside's lemma tells us. If we let S be all the tuples representing arrangements of AABB in a line, then the number of orbits under C4 is the number of "different" arrangements of AABB in a circle.
Listing out all the arrangements of the letters of AABB, we get
S={AABB,ABAB,ABBA,BABA,BAAB,BBAA}.
Since C4 is generated by (1234), its elements are
C4={id,(1234),(13)(24),(1432)}.
For an arrangement to be fixed by a cycle, the letters at each position changed by the cycle must be the same. Then we can see that
Sid=S,
S(1234)=∅,
S(13)(24)={ABAB,BABA} (because an arrangment is fixed by (13)(24) only when the letter at positions 1,3 are equal, and the letters at positions 2,4 are equal),
This is exactly what we wanted! In fact Burnside's lemma is enough for any version of our problem, but we continue further to explore a theorem by Pólya that generalises this.
Before moving on, note that if we let the set S be the set of all 4-element arrangements of the letters A,B (with replacement), then Burnside's lemma tells us the number of ways to arrange A,B on a circle (with replacement). Particularly, orbits of the arrangements of AABB is a subset of the set of orbits S/C4. Pólya's theorem builds on this by describing the number of circle arrangements of every 4-letter combination of A's and B's inside S/C4.
Pólya's Enumeration Theorem
Preliminaries
On any permutation group (on finite elements), we can encapsulate its information in a polynomial.
Definition. Let g∈Sn be a permutation on n elements, and let ci(g) be the number of i-cycles in the disjoint cycle decomposition of g. The cycle index of a permutation group G⊆Sn is the polynomial
This polynomial contains information about the types of elements in the group. For example 2t32 says there are two elements which are the product of two cycles of length 3: g2=(135)(246) and g4=(153)(264). (The number of cycles in the decomposition isn't unrelated to the fact that gcd(6,2)=2=gcd(6,4) but is not too important.)
Definition. Let C and D be sets, then CD is the set of functions f:D→C.
Proposition. If G were a (finite) group acting on D, then there is a G-action on CD defined by
(g⋅f)(d)=f(g−1⋅d)
for any f∈CD and d∈D. We use the convention of naming the sets C and D because they correspond to colouring and domain. So each function f∈CD can be thought of assigning to each element of D a colour in C.
The main feature of Pólya's theorem is that it distinguishes between 'weights' of particular combinations of colours.
Definition. Let w:C→R be a function that assigns to each colour c∈C a weight w(c)∈R. For f∈CD, define the weight of f
W(f)=d∈D∏w(f(d)).
Proposition. Let f1,f2∈CD be in the same G-orbit, then W(f1)=W(f2).
Proof. By definition there exists some g∈G such that f1(x)=(g⋅f2)(x)=f2(g−1⋅x). Then
where the third equality follows from the invertibility of elements in G. Then G-actions are permutations on D, so summing over every element of D is the same as summing over every element of g−1⋅D={g−1⋅d:d∈D}.
Due to this property, we can assign a weight to entire orbits F∈CD/G, where W(F)=W(f) for any f∈F.
Example. Consider the example of arranging 6 objects in a circle, where each object can be one of two letters A or B. (This will also motivate how we will use the weights for the problem of arranging elements in a circle.) Let D={1,2,3,4,5,6} be the positions on the circle and C={A,B} be the letters they could be assigned, and let C6 act on CD as above. Define w so that w(A)=a and w(B)=b, where a,b are numbers in R. Consider the function f1 mapping 1,2,3↦A and 4,5,6↦B, and f2 mapping 1,3,5↦A and 2,4,6↦B.
We have
Notice that, even though they have the same weight, these functions don't belong to the same orbit, since AAABBB and ABABAB are not rotations of each other. So these weights lose information about the functions, but tells us how many times each letter appears, but nothing about their order. This is perfect for what we want!
Theorem
In general context, we think of C as "colours", and f∈CD as colour assignments to elements in D, or "colourings of D". The problem we care about is: How many of colourings have a certain weight?
Theorem. (Pólya's Enumeration Theorem) Let C,D be finite sets, G a finite group acting on D, and w:C∈R be a weight function. Then the pattern inventory is
F∈CD/G∑W(F)=ZG(c∈C∑w(c),c∈C∑w(c)2,...).
The sum ∑F∈CD/GW(F) is sometimes called the (multivariate) generating function for the different colourings of D (the coefficient of each monomial, as a polynomial in the elements of C, is the number of colourings corresponding to the monomial). Note that we can use any group action.
Proof.
Starting with ∑FW(F), consider the possible weights
Ω={ω:W(f)=ω for some f∈CD}.
Recall that the G-action on D induces a G-action on CD, so we can reindex the sum in terms of Ω to get
F∈CD/G∑W(F)=ω∈Ω∑ω∣Sω/G∣,
where Sω={f:D→C∣W(f)=ω, so Sω/G={F∈CD/G:W(F)=ω} is the set of orbits which have weight ω. (Note this uses the induced G-action on CD, which is well defined because G maps orbits to themselves, and function weights are preserved within the orbits.) Using Burnside's Lemma,
Write g as a product of disjoint cycles. If we think of the terms in the expansion of the right hand side as a series of choices: c1(g) choices of c∈C for factors of w(c), c2(g) choices of c∈C for factors w(c)2 etc. Then each term is some product of ∣D∣ weights such that the corresponding colour combination is fixed by g, and its coefficient is how many ways this can be chosen. (This thought process is similar to counting monomials in an expanded binomial to obtain the binomial coefficients.) We can check that these choices are the only ways to get a colouring f fixed by g. Notice that for such a colouring f∈(CD)g, its weight W(f) is defined exactly as its corresponding monomial above. Therefore the right hand product is exactly
This retrieves Burnside's lemma. (This is exactly Burnside's lemma if ∣D∣=1.)
Intuition
Following the proof gives us some intuition into how the right hand polynomial can be interpreted. Let us consider the example of colouring D with colours in C, and suppose the weights are indeterminates C1,...,Cn which uniquely correspond to each colour c1,...,cn respectively.
It is easy to understand the left hand side: each G-orbit of CD corresponds to a particular colouring we want to think of as "identical" when acted on by elements of G. Given an orbit F∈CD/G, W(F) is a product of weights w(c) for some c∈C. So in the sum of W(F) over all orbits F, the coefficient of a monomial C1d1…Cndn (where d1+⋯+dn=∣D∣) will be the number of different ways ∣D∣ can be coloured with d1 of c1, d2 of c2 etc. This tells us the looking at the coefficients of this polynomial gives us the "counting" information we want.
In the proof, we took this left polynomial and grouped it into orbits of the same weight. Doing this, we found that the coefficient of each possible function weight ω is the number of G-orbits of Sω (functions in S with weight ω). Burnside's lemma tells us that this number is just the average size of fixed sets of Sω, so the task of finding the coefficient of ω is reduced to finding the size of fixed sets of Sω.
This information is captured by the coefficients of the product,
(c∈C∑w(c))c1(g)(c∈C∑w(c)2)c2(g)…
where each term is a series of choices to colour parts of D fixed by a cycle in g, altogether the choices correspond to one colouring fixed by g. Collecting the terms of equal weight, we obtain the number of colourings of that weight that is fixed by g.
Note: we also see that Pólya's theorem is essentially just Burnside's lemma applied to each weight.
Example 1
Let's return to our original problem, this time with a slightly harder example: How many different circular arrangements of the letters of XXYYZZ are there?
In Pólya's Enumeration Theorem, let D={1,2,3,4,5,6}, C={X,Y,Z}, and w assign weights X↦x,Y↦y and Z↦z. Let G=C6 be generated by (123456) (we listed its elements in a previous example), and act on D by permuting the elements, then G also acts on CD as described earlier. To get intuition on what this action looks like, consider g=(123456)∈C6 and f∈CD mapping 1,2,3,4,5,6 to X,X,Y,Z,Y,Z respectively. Then
(g⋅f)(1)=f(g−1⋅1)=f(6)=Z
(g⋅f)(2)=f(g−1⋅1)=f(1)=X
(g⋅f)(3)=f(g−1⋅1)=f(2)=X
(g⋅f)(4)=f(g−1⋅1)=f(3)=Y
(g⋅f)(5)=f(g−1⋅1)=f(4)=Z
(g⋅f)(6)=f(g−1⋅1)=f(5)=Y
so g⋅f maps 1,2,3,4,5,6 to Z,X,X,Y,Z,Y. Here (123456) permuted the image of f like it permutes the elements of D: rotating X,X,Y,Z,Y,Z→Z,X,X,Y,Z,Y. This is just how we expect C6 to act on colourings, and is actually true in general. The theorem says we can calculate the right hand polynomial and look at the coefficients. In our case, the polynomial is
(using the values for C6 generated by (123456) we calculated in a previous example). We could endure the suffering of expanding this out, or just look at the coefficients of the terms we want. The following is a useful generalisation of the binomial theorem.
Theorem. (Multinomial Theorem) The coefficient of x1k1…xmkm (where k1+⋯+km=n) in the expansion of (x1+⋯+xm)n is
(k1,…,kmn)=k1!…km!n!.
Proof. Similar to that of binomial theorem.
For our purposes, we consider the coefficient of x2y2z2.
It does not appear in (x6+y6+z6) and (x3+y3+z3)2
This term appears in (x2+y2+z2)3 as (x2)1(y2)1(z2)1 which has coefficient (1,1,13)=3!=6.
It also appears in (x+y+z)6 as x2y2z2 which has coefficients (2,2,26)=2!2!2!6!=90
Therefore the coefficient of x2y2z2 in the pattern inventory is
Remarks. You may have noticed that multinomial coefficients are exactly how we were taught to count the permutations of letters with possibly some repeating. So it makes sense that it comes up when counting arrangements of the same letters in a circle.
Example 2
Here is another example illustrating an easier case. Let D={1,2,3,4,5,6,7}, C={X,Y,Z}, and w assign weights X↦x,Y↦y and Z↦z. Let G=C7 be generated by g=(1234567) so the elements are
Element
Cycle Decomposition
Value of ci(⋅)
g
(1234567)
c7(g)=1
g2
(1357246)
c7(g2)=1
g3
(1473625)
c7(g3)=1
g4
(1526374)
c7(g4)=1
g5
(1642753)
c7(g5)=1
g6
(1765432)
c7(g6)=1
g7
(1)(2)(3)(4)(5)(6)(7)
c1(g7)=7
where all the other values of ci(gj) are 0. As before, this group acts on D by permuting the elements and also acts on CD. Then the pattern inventory is
circular arrangements for Xk1Yk2Zk3 (where k1+k2+k3=7).
Notice that we have this kind of polynomial because all the non-identity cycle decompositions are 7-cycles. In fact this happens exactly when the number of elements is prime.
Proposition. Let p be prime. Then for D where ∣D∣=p, C={A1,...,An}, G=Cp, the pattern inventory is
∣G∣1((p−1)c∈C∑w(c)p+(c∈C∑w(c))p).
That is, there are
p1(k1,...,knp)=k1!...kn!(p−1)!
circular arrangements for c1k1...cnkn, where k1+...+kn=p and 0≤k1,...,kn<p.
This shows that the naïve formula we started with works for non-trivial colourings of a prime number of elements.
Formula
Is there a nice general formula we can extract from this theorem? Not really, but we can pull something together.
Briefly, what we need to know is the following. (The first three points find the pattern inventory, and the last point find the desired coefficient.)
How large is Cm? It has m elements
How do elements of Cm decompose into disjoint cycles? How large are these cycles and how many cycles are there?
If Cm is generated by g=(12...m), then letting d=gcd(m,k) we get that gk is a product of d disjoint m/d-cycles.
How many elements have same cycle size and cycle lengths?
In the context of the previous point, there are Nd=#{k:1≤k≤m s.t. gcd(m,k)=d} cycles 'similar' to gk. This tells us how many times (c1m/d+c2m/d+...)d appears in the pattern inventory.
There is no 'nice' equation for this number (it could be expressed in terms of the Möbius function).
Which of these multinomials has our pattern x1k1x2k2...?
This can be done by checking ∑iki=m and m/d∣ki for all i, for each multinomial (c1m/d+c2m/d+...)d.
So given a valid list of 'colours' x1k1x2k2..., the coefficient in the pattern inventory (i.e. the number of arrangements of these colours) looks something like
m1d∣m(m/d)∣ki,∀i∑Nd(m/dk1,m/dk2,...d).
Additional Comments and Material
We only looked at cyclic groups Cn, but we could use any subgroup of Sn. For example if we think of reflections of our circular arrangements as "identical", then we can use the dihedral group Dn. The orbits under the dihedral group is exactly the distinct "circular arrangements" we want.
The theorem can also be applied for various different counting problems, by changing the weights and subgroup of Sn. For example, Pólya applied this theorem to count the number of isomers of hydrocarbons, which popularised its usage.
This post was based on the following. These are good reading if you want to explore this topic more.