Lecture 2: The Category Theory of Finite Sets

Julia from last lecture
abstract type 𝔽 end

struct Vec𝔽 <: 𝔽
  elems::Vector{Any}
end

Base.:(∈)(a, A::Vec𝔽) = a ∈ A.elems

Base.iterate(A::Vec𝔽) = iterate(A.elems)
Base.iterate(A::Vec𝔽, k) = iterate(A.elems, k)
Base.length(A::Vec𝔽) = length(A.elems)

struct Int𝔽 <: 𝔽
  n::Int
end

Base.:(∈)(a, A::Int𝔽) = 1 <= a <= A.n

Base.iterate(A::Int𝔽) = iterate(1:A.n)
Base.iterate(A::Int𝔽, k) = iterate(1:A.n, k)

struct 𝔽Mor
  dom::𝔽
  codom::𝔽
  vals::Dict{Any,Any}
end

(f::𝔽Mor)(x) = f.vals[x]

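struct Int𝔽Mor
  # Field layout inferred from how f.vals[i] is used below
  dom::Int𝔽
  codom::Int𝔽
  vals::Vector{Int}
end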

(f::Int𝔽Mor)(i) = f.vals[i]

Composition and Isomorphisms

We now introduce the operation at the core of all category theory: composition!

Given finite sets A,B,C and morphisms f \colon A \to B and g \colon B \to C, their composite g \circ f is a morphism from A to C defined by (g \circ f)(a) = g(f(a)) for all a in A. We call \circ the operation of composition, and we say that we compose f and g to form g \circ f.

Julia version
function compose(f::𝔽Mor, g::𝔽Mor)
  # compose(f, g) is g ∘ f: apply f first, then g
  @assert f.codom == g.dom
  𝔽Mor(f.dom, g.codom, Dict(a => g(f(a)) for a in f.dom))
end
A,B,C = Vec𝔽([1,2,3]), Vec𝔽([:a,:b]), Vec𝔽(["s", "t"])
f = 𝔽Mor(A,B,Dict(1=>:a,2=>:a,3=>:b))
g = 𝔽Mor(B,C,Dict(:a=>"t",:b=>"s"))
compose(f,g).vals
Dict{Any, Any} with 3 entries:
  2 => "t"
  3 => "s"
  1 => "t"

We now come to an issue that is everywhere in category theory: equality. If you have seen any set theory before, you might think that \{1,2,3,3\} and \{3,2,1\} are “the same” set. Note however that when I defined finite sets, I just said that a finite set was a list of things surrounded by curly braces.

Julia version

The equality between two different Vec𝔽 is sensitive to ordering and duplicates.

A, A′ = Vec𝔽([1,2,3,3]), Vec𝔽([3,2,1])
A == A′

So it seems like something is wrong with our definition.

Possible fix in Julia

One way one might try to fix this would be by using Set{Any} instead of Vector{Any}. This certainly makes more of our Julia values equal.

struct Set𝔽
  elems::Set{Any}
end

Base.:(∈)(a, A::Set𝔽) = a ∈ A.elems

Base.iterate(A::Set𝔽) = iterate(A.elems)
Base.iterate(A::Set𝔽, k) = iterate(A.elems, k)
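
As a quick check of that claim, here is a minimal sketch that compares the two listings using Julia's built-in Vector and Set directly. Using the plain containers rather than the Vec𝔽 and Set𝔽 wrappers is a simplification made for this example.

```julia
# Plain Julia containers, used here instead of the Vec𝔽 / Set𝔽 wrappers
as_vectors = [1, 2, 3, 3] == [3, 2, 1]            # order and duplicates matter
as_sets    = Set([1, 2, 3, 3]) == Set([3, 2, 1])  # neither matters

println((as_vectors, as_sets))
```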

But let’s think about why we typically choose to make \{1,2,3,3\} and \{3,2,1\} “the same” set. One good reason is that any morphism out of \{1,2,3,3\} can be seen as a morphism out of \{3,2,1\}, and the same goes for incoming morphisms.

Julia implementation

With this code, we can change the domain of a morphism from A to A′, and vice versa, and the morphism remains valid.

convertAA′(f::𝔽Mor) = 𝔽Mor(A′,f.codom,f.vals)
convertA′A(f::𝔽Mor) = 𝔽Mor(A,f.codom,f.vals)
We could also change codomains in the same way.

Therefore, from a “morphism’s-eye” perspective, \{1,2,3,3\} and \{3,2,1\} behave the exact same way; defining a morphism out of one is the same as defining a morphism out of the other.

But if we take this to its logical conclusion, we find that this is true not only of \{1,2,3,3\} and of \{3,2,1\}, but also of \{1,2,3\} and \{\mathbf{a},\mathbf{b},\mathbf{c}\}!

To see this, let B = \{1,2,3\}, B^\prime = \{\mathbf{a}, \mathbf{b}, \mathbf{c}\}, and define f \colon B \to B^\prime and g \colon B^\prime \to B by

f(x) = \begin{cases} \mathbf{a} & \text{if $x = 1$} \\ \mathbf{b} & \text{if $x = 2$} \\ \mathbf{c} & \text{if $x = 3$} \end{cases}

g(y) = \begin{cases} 1 & \text{if $y = \mathbf{a}$} \\ 2 & \text{if $y = \mathbf{b}$} \\ 3 & \text{if $y = \mathbf{c}$} \end{cases}

Then given h \colon B \to C, we can produce h^\prime \colon B^\prime \to C via h^\prime = h \circ g, and vice versa. Moreover, when we start out with a function out of B, compose it with g to make a function out of B^\prime and then compose it with f to make a function out of B, we get the same function back.

Julia version

Specifically, we can do the following.

B, B′ = Vec𝔽([1,2,3]), Vec𝔽([:a,:b,:c])
f = 𝔽Mor(B, B′, Dict(1 => :a, 2 => :b, 3 => :c))
g = 𝔽Mor(B′, B, Dict(:a => 1, :b => 2, :c => 3))
convertBB′(h::𝔽Mor) = compose(g,h)
convertB′B(h::𝔽Mor) = compose(f,h)
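
The round trip described above can be checked directly. The sketch below repeats the definitions it needs so that it runs on its own; the struct field layouts are assumed from how the fields are used in these notes, and the morphism h and the set C are hypothetical test data.

```julia
abstract type 𝔽 end

struct Vec𝔽 <: 𝔽
  elems::Vector{Any}
end
Base.iterate(A::Vec𝔽) = iterate(A.elems)
Base.iterate(A::Vec𝔽, k) = iterate(A.elems, k)
Base.length(A::Vec𝔽) = length(A.elems)

struct 𝔽Mor
  dom::𝔽
  codom::𝔽
  vals::Dict{Any,Any}
end
(f::𝔽Mor)(x) = f.vals[x]

# compose(f, g) is g ∘ f: apply f first, then g
function compose(f::𝔽Mor, g::𝔽Mor)
  @assert f.codom == g.dom
  𝔽Mor(f.dom, g.codom, Dict(a => g(f(a)) for a in f.dom))
end

B, B′, C = Vec𝔽([1,2,3]), Vec𝔽([:a,:b,:c]), Vec𝔽(["s","t"])
f = 𝔽Mor(B, B′, Dict(1 => :a, 2 => :b, 3 => :c))
g = 𝔽Mor(B′, B, Dict(:a => 1, :b => 2, :c => 3))
convertBB′(h::𝔽Mor) = compose(g, h)
convertB′B(h::𝔽Mor) = compose(f, h)

# Hypothetical test morphism h : B → C
h = 𝔽Mor(B, C, Dict(1 => "s", 2 => "t", 3 => "s"))

h′ = convertBB′(h)           # a morphism B′ → C
roundtrip = convertB′B(h′)   # back to a morphism B → C

# The round trip should reproduce the value table of h
println(roundtrip.vals == h.vals)
```

Comparing the vals dictionaries, rather than the 𝔽Mor values themselves, sidesteps the question of when two Vec𝔽 values count as equal.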

All of this motivates the next few definitions.

The identity function 1_A on a finite set A has domain and codomain A and is defined by 1_A(a) = a.

Julia implementation
identity(A::𝔽) = 𝔽Mor(A, A, Dict(a => a for a in A))

We say that two finite sets A and A^\prime are isomorphic if there exist morphisms f \colon A \to A^\prime and g \colon A^\prime \to A such that g \circ f = 1_{A} and f \circ g = 1_{A^\prime}. Moreover, if this is the case we call f an isomorphism from A to A^\prime, and g an isomorphism from A^\prime to A, and we say that g is the inverse of f.
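
The two equations in this definition can be tested mechanically. The following is a minimal sketch, not a definitive implementation: is_inverse_pair and id_mor are hypothetical names introduced here (id_mor plays the role of the identity function, renamed to avoid clashing with Julia's built-in identity), and the struct fields are assumed from their usage in these notes.

```julia
abstract type 𝔽 end

struct Vec𝔽 <: 𝔽
  elems::Vector{Any}
end
Base.iterate(A::Vec𝔽) = iterate(A.elems)
Base.iterate(A::Vec𝔽, k) = iterate(A.elems, k)
Base.length(A::Vec𝔽) = length(A.elems)

struct 𝔽Mor
  dom::𝔽
  codom::𝔽
  vals::Dict{Any,Any}
end
(f::𝔽Mor)(x) = f.vals[x]

# compose(f, g) is g ∘ f (domain check omitted to keep the sketch short)
compose(f::𝔽Mor, g::𝔽Mor) = 𝔽Mor(f.dom, g.codom, Dict(a => g(f(a)) for a in f.dom))

# Identity morphism 1_A
id_mor(A::𝔽) = 𝔽Mor(A, A, Dict(a => a for a in A))

# Checks g ∘ f = 1_A and f ∘ g = 1_A′ by comparing value tables
function is_inverse_pair(f::𝔽Mor, g::𝔽Mor)
  compose(f, g).vals == id_mor(f.dom).vals &&
    compose(g, f).vals == id_mor(g.dom).vals
end

A, A′ = Vec𝔽([1,2,3]), Vec𝔽([:a,:b,:c])
f = 𝔽Mor(A, A′, Dict(1 => :a, 2 => :b, 3 => :c))
g = 𝔽Mor(A′, A, Dict(:a => 1, :b => 2, :c => 3))
k = 𝔽Mor(A′, A, Dict(:a => 1, :b => 1, :c => 3))  # sends :a and :b to the same place

println(is_inverse_pair(f, g))
println(is_inverse_pair(f, k))
```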

A way to remember the name “isomorphic” is to think “isomorphic = same morphisms”. That is, if we have an isomorphism between A and A^\prime, then there are “the same morphisms” out of A and out of A^\prime.

However, as noted before, there might be several distinct isomorphisms between A and A^\prime. Thus, you must be careful to specify which isomorphism you mean when you are talking about isomorphic finite sets.
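
To see why the choice matters, here is a small sketch in which two different isomorphisms from \{\mathbf{a},\mathbf{b},\mathbf{c}\} to \{1,2,3\} turn the same morphism h into two different morphisms. For brevity it represents morphisms as plain Dicts instead of 𝔽Mor; the helper after and the data g1, g2, h are hypothetical names chosen for this example.

```julia
# Morphisms as plain Dicts (a simplification of 𝔽Mor)
# after(g, f) is g ∘ f: apply f first, then g
after(g, f) = Dict(a => g[f[a]] for a in keys(f))

g1 = Dict(:a => 1, :b => 2, :c => 3)      # one isomorphism B′ → B
g2 = Dict(:a => 3, :b => 1, :c => 2)      # a different isomorphism B′ → B
h  = Dict(1 => "s", 2 => "t", 3 => "t")   # some morphism h : B → C

h1 = after(h, g1)   # h transported along g1
h2 = after(h, g2)   # h transported along g2

# Same h, different isomorphism, different morphism out of B′
println(h1 == h2)
```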

Anyways, this is why it’s not a terrible problem to use a vector to represent a finite set. There are only rare cases where you can find a representation of your mathematical objects such that two representations are equal if and only if the mathematical objects are isomorphic. If you are lucky enough to find this, it’s called a “canonical form” and it’s a big deal. As a practical matter, we might use Set instead of Vector because it gets a bit closer to a canonical form, but I wanted to start with Vector to make the point that the representation of your mathematical object on a computer in general will not be canonical.


  • You can compose morphisms between finite sets
  • Isomorphisms tell you which finite sets are equivalent from the point of view of morphisms

Injectivity, Surjectivity, and Cardinality

We now discuss some more properties of finite sets and their maps.

A function f \colon A \to B is surjective if for every element b \in B, there exists an a \in A such that f(a) = b.

The function f \colon \{1,2,3\} \to \{1,2\} sending 1 to 1, 2 to 2, and 3 to 2 is surjective.

The function f \colon \{1,2\} \to \{1,2,3\} sending 1 to 1 and 2 to 2 is not surjective.

This is a good opportunity to discuss something very critical in math: ordering of quantifiers. Quantifiers are phrases like “for every …” or “there exists …”. In the previous definition, if we had instead said “there exists an a \in A such that for every b \in B, f(a) = b”, this could only be true if B had only a single element! The fact that the “there exists” comes after the “for every” allows us to choose a different a for each b.
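
The difference between the two quantifier orders can be seen by evaluating both on the surjective example above. This is a sketch that uses plain vectors and a Dict in place of the 𝔽 types; the names forall_exists and exists_forall are made up for the illustration.

```julia
A = [1, 2, 3]
B = [1, 2]
f = Dict(1 => 1, 2 => 2, 3 => 2)   # the surjective example from above

# "for every b there exists an a": the chosen a may depend on b
forall_exists = all(b -> any(a -> f[a] == b, A), B)

# "there exists an a such that for every b": one a must work for all b
exists_forall = any(a -> all(b -> f[a] == b, B), A)

println((forall_exists, exists_forall))
```

Swapping the nesting of all and any is exactly swapping the order of the quantifiers.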

Julia implementation
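function is_surjective(f::𝔽Mor)
  # Collect every value that f actually takes
  seen = Set([f(x) for x in f.dom])
  # f is surjective when every element of the codomain is hit
  all(y ∈ seen for y in f.codom)
end

is_surjective(𝔽Mor(Vec𝔽([1,2,3]), Vec𝔽([1,2]), Dict(1 => 1, 2 => 2, 3 => 2)))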

A function f \colon A \to B is injective if whenever a,a^{\prime} \in A are such that a \neq a^{\prime}, then f(a) \neq f(a^{\prime}).

The function f \colon \{1,2,3\} \to \{1,2\} sending 1 to 1, 2 to 2, and 3 to 2 is not injective.

The function f \colon \{1,2\} \to \{1,2,3\} sending 1 to 1 and 2 to 2 is injective.

Julia implementation
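function is_injective(f::𝔽Mor)
  # Injective iff f takes as many distinct values as the domain has distinct elements
  length(unique!(collect(f.dom))) ==
    length(unique!([f(x) for x in f.dom]))
end

is_injective(𝔽Mor(Vec𝔽([1,2,3]), Vec𝔽([1,2]), Dict(1 => 1, 2 => 2, 3 => 2)))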

A function f \colon A \to B has an inverse if and only if it is surjective and injective.

Proof. First suppose that f is surjective and injective. Then for each b \in B, there is precisely one element a \in A such that f(a) = b. This is because surjectivity guarantees there is at least one, and injectivity guarantees that there is no more than one. Define g \colon B \to A by letting g(b) be this unique a. Then it is clear that g \circ f = 1_{A} and f \circ g = 1_B.

Conversely, suppose that f has an inverse, g. Then f is surjective, because each b \in B has an element g(b) \in A such that f(g(b)) = b. And f is injective, because if a \neq a^\prime, then g(f(a)) = a \neq a^\prime = g(f(a^{\prime})), so it must be the case that f(a) \neq f(a^{\prime}).
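
The first half of the proof is constructive, so it can be turned into code. Below is a minimal sketch of that construction; invert is a hypothetical name, and the struct fields are assumed from their usage in these notes.

```julia
abstract type 𝔽 end

struct Vec𝔽 <: 𝔽
  elems::Vector{Any}
end
Base.iterate(A::Vec𝔽) = iterate(A.elems)
Base.iterate(A::Vec𝔽, k) = iterate(A.elems, k)
Base.length(A::Vec𝔽) = length(A.elems)

struct 𝔽Mor
  dom::𝔽
  codom::𝔽
  vals::Dict{Any,Any}
end
(f::𝔽Mor)(x) = f.vals[x]

# Build g : B → A by sending each b to the unique a with f(a) = b
function invert(f::𝔽Mor)
  inv_vals = Dict{Any,Any}()
  for a in f.dom
    b = f(a)
    # Injectivity: no b may be hit by two different elements
    @assert !haskey(inv_vals, b) || inv_vals[b] == a "f is not injective"
    inv_vals[b] = a
  end
  # Surjectivity: every b must be hit by some element
  @assert all(b -> haskey(inv_vals, b), f.codom) "f is not surjective"
  𝔽Mor(f.codom, f.dom, inv_vals)
end

B, B′ = Vec𝔽([1,2,3]), Vec𝔽([:a,:b,:c])
f = 𝔽Mor(B, B′, Dict(1 => :a, 2 => :b, 3 => :c))
g = invert(f)

# g ∘ f and f ∘ g should both act as identities
println(all(g(f(a)) == a for a in B))
println(all(f(g(b)) == b for b in B′))
```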

We wrap up this section with a way of telling when there exists an isomorphism between two finite sets.

The cardinality of a finite set is the number of unique elements listed in that finite set.
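
In code, this amounts to discarding duplicates before counting. A one-line sketch, assuming the finite set is given as any iterable collection (cardinality is a name introduced here for illustration):

```julia
# Number of distinct elements: a Set drops repeated entries
cardinality(A) = length(Set(A))

println(cardinality([1, 2, 3, 3]))
println(cardinality([3, 2, 1]))
```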

If two sets have the same cardinality, then they are isomorphic.

Proof. We prove this by induction on the cardinality of the sets.

If A and B both have 0 elements, then they are the same set! So the identity function is the isomorphism.

Now, suppose that there exist isomorphisms between all sets of cardinality n, and let A and B be sets of cardinality n+1. Then we can write A = \{a\} \cup A^\prime and B = \{b\} \cup B^\prime for A^\prime and B^\prime sets of cardinality n with a \notin A^\prime and b \notin B^\prime. By the induction hypothesis, there exists an isomorphism f^{\prime} \colon A^\prime \to B^\prime. Define f \colon A \to B by

f(x) = \begin{cases} b & \text{if $x = a$} \\ f^\prime(x) & \text{if $x \in A^\prime$} \end{cases}

This is surjective, because it hits b by construction, and everything in B^\prime by the induction hypothesis. It is injective because for x, x^\prime \in A, with x \neq x^\prime, if x, x^\prime \in A^\prime then f(x) \neq f(x^\prime), and if one of x, x^\prime is a, then f(x) \neq f(x^\prime) because f(a) \notin B^\prime.

We are done.

Proof in Julia
function find_isomorphism(A::𝔽, B::𝔽)
  # Deduplicate the finite sets, and fix an ordering
  A_vec, B_vec = unique!.([Any[A...], Any[B...]])

  @assert length(A_vec) == length(B_vec)
  n = length(A_vec)

  𝔽Mor(A, B, Dict(A_vec[i] => B_vec[i] for i in 1:n))
end
find_isomorphism(Vec𝔽([:c, :b, :a, :b]), Int𝔽(3)).vals
Dict{Any, Any} with 3 entries:
  :a => 3
  :b => 2
  :c => 1


  • There are two special classes of functions: injections and surjections
  • If something is an injection and a surjection, it is an isomorphism
  • You can tell which finite sets are isomorphic by looking at their cardinalities

The Pigeonhole Principle

The point of these first lectures is to introduce you to finite sets and pure math, not category theory. Therefore, we end with a discussion of a theorem from combinatorics.

If X and Y are finite sets, and X has a larger cardinality than Y, then for any morphism f \colon X \to Y, there exist distinct elements x, x' \in X such that f(x) = f(x'). In other words, f is not injective.

Proof. Suppose that no such x,x' existed. Then if X = \{x_1,\ldots,x_n\} with x_i \neq x_j for all i \neq j, we have f(x_1),\ldots,f(x_n) all distinct. Therefore, there are at least n distinct elements of Y, which contradicts the statement that Y has a smaller cardinality than X. We are done.

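Suppose that x_1,\ldots,x_5 are points on a sphere. I.e., each x_i is a 3-dimensional vector with distance from the origin 1. Then there is a closed hemisphere that contains at least 4 of the points.

Proof. Pick any two of the points. These two points lie on a great circle of the sphere (if they coincide or are antipodal, pick any great circle through them). This great circle splits the sphere into two closed hemispheres, both of which contain the two chosen points. By the pigeonhole principle, at least two of the other three points must fall in the same one of these two hemispheres, so that hemisphere contains at least 4 of the points.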
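
The argument can be followed numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it assumes the first two points are neither equal nor antipodal, describes a closed hemisphere by a unit normal n (the hemisphere is the set of x with dot(n, x) >= 0), and uses the hypothetical name hemisphere_normal.

```julia
using LinearAlgebra

# Returns a unit normal n whose closed hemisphere {x : dot(n, x) >= 0}
# contains at least 4 of the 5 given unit vectors.
function hemisphere_normal(points)
  @assert length(points) == 5
  x1, x2 = points[1], points[2]
  # The great circle through x1 and x2 is the set of unit vectors orthogonal to n
  n = cross(x1, x2)
  @assert norm(n) > 1e-12 "sketch assumes the first two points are not parallel"
  n = n / norm(n)
  # Pigeonhole: of the other three points, at least two lie on the same side
  above = count(x -> dot(n, x) >= 0, points[3:5])
  above >= 2 ? n : -n
end

# Hypothetical sample points on the unit sphere
points = [
  [1.0, 0.0, 0.0],
  [0.0, 1.0, 0.0],
  [0.0, 0.0, 1.0],
  [0.0, 0.0, -1.0],
  normalize([1.0, 1.0, 1.0]),
]

n = hemisphere_normal(points)
# Small tolerance because x1 and x2 lie on the boundary circle itself
inside = count(x -> dot(n, x) >= -1e-9, points)
println(inside)
```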

Julia Implementation
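function pigeonhole(f::𝔽Mor)
  @assert length(unique!([f.dom...])) > length(unique!([f.codom...]))
  # One (initially empty) list of pigeons per hole
  holes = Dict(y => Any[] for y in f.codom)
  for pigeon in unique!([f.dom...])
    push!(holes[f(pigeon)], pigeon)
  end
  # Iterating a Dict yields key => value pairs, so look at the values
  for pigeons in values(holes)
    if length(pigeons) > 1
      return pigeons
    end
  end
end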


  • You can do basic combinatorics in the framework we have developed in this lecture