Lecture 1: Formal Math

Introduction

These lectures are meant to go from 0 to category theory as efficiently as possible. This means that we are going to optimize for being precise, for being clear, and for opening up new possibilities for you. We are not going to optimize for being entertaining or engaging; this is not a “monads as burritos” blog post or a popsci article about category theory. The answer to “why do we care about this” is often going to be “because it’s important later on”, and you are just going to have to trust me on that.

As we are going from 0, in this first lecture I plan to get you all aquainted with what it even means to “do math” at the level that category theory lives. This is a different kind of math than what you might have learned about in lower level math courses, and so sorting out from the beginning the mindset that you should have for the rest of the lectures is the most efficient use of this time. Generally, nobody tells you about this distinction and you have to work it out painfully over years of getting bad grades on university math homeworks; we don’t have time for that.

However, we will have code examples for you to play with, because most of you are programmers and thus making a connection between math and code should speed the learning process.

Finally, everyone at some point in this lecture will be frustrated by how pedantic I’m being. Sorry. I’m erring on the side of pedantry because there’s a “price is right” situation here: if I go too slow, we waste a bit of time, but if I go too fast we waste all of the time.

Definitions, Propositions, and Examples

Pure math consists of a series of definitions, propositions, and examples. In this document, we typeset these like

A definition introduces a new word, and I will always put that new word in bold. In normal speech, words have meanings given by context, by association, and only sometimes by explicit definition. In math, it is not like that. Every technical word has a single definition. That definition may not be written down explicitly; it may be agreed implicitly between mathematicians based on shared experience. However, in theory there is always a precise definition for every mathematical concept. It is expected that all participants in a mathematical conversation could be locked up in a cell, given paper, a pencil and a great deal of time, and then write down a fully rigorous formulation of each of the words they are using. Moreover, each of these formulations for each of the mathematicians might not be exactly the same, but they should be able to be proven equivalent.

Until you learn mathematical logic, which we will not cover here, this expectation cannot be realized because you don’t know a precise definition of “fully rigorous”. The level of rigor that will suffice for now is that you should be able to expand every definition to a level where it can be explained to a smart, patient human who knows no category theory, by going backwards and defining each of the terms used in that definition until you get to very basic concepts, like sets, functions, and equations.

The extremely important corollary to all of this is that if you feel like your understanding of a definition does not reach this level, YOU ARE CONFUSED. This is OK. It is good to be confused. It is far better to be confused and not yet wrong than it is to be unconfused and wrong.

What should you do when you are confused? First of all, GO BACKWARDS. Read the previous section of a textbook. If you are still confused, keep going backwards until you hit something that makes sense, and then work your way back up. Secondly, GO SIDEWAYS. Read another textbook that treats the same material in a different way. Then finally, if neither of those work, ASK AN EXPERT, and keep asking until you feel unconfused. The MOST IMPORTANT SKILL in math is to know when you are confused and don’t continue until you are unconfused! If you continue on, you will get hopelessly confused; if you turn back immediately there is still hope.

Definitions are the most important part of higher math. Understanding the definitions is often half the battle, and it is most of the battle for category theory.

A proposition is an assertion that one logical statement (the conclusion) follows from several logical statements (the premises). Each proposition comes along with a proof. Just like definitions, it is expected that the participants in a mathematical conversation could expand a proof out to a fully rigorous level, even if the given proof is very vague. What you write down as a proof should be seen as a “hint” for the construction of the actual, fully rigorous proof; mathematicians come to cultural agreements for how much of a hint is needed in different circumstances.

As a mathematical learner, proofs are your window into the thought processes of subject experts. Thus, they are very good to study and understand. However, they are not as critical to understand as definitions. It is absolutely essential to fully understand definitions, but proofs can be “blackboxed” sometimes, and you can just remember the proposition without understanding fully the proof.

Propositions are also known as theorems, lemmas, and corollaries. A theorem is an important proposition, a lemma is a small proposition mainly used to prove other propositions, and a corollary is a proposition whose proof is easy because of another proposition, for example a special case. Really, these are just vibes that mathematicians add to propositions.

Finally, an example is a definition or proposition that falls under one of three categories.

The purpose is mainly pedagogical; the example gives you intuition for another definition or proposition.
The purpose is mainly application-oriented; the example shows you how to apply something abstract in the real world.
It’s a normal proposition or definition, except it’s slightly less abstract than some previous proposition or definition. This is often the case in category theory; category theory is known as the subject where “the examples have examples.”

A pure math document consists of a sequence of definitions, propositions, and examples, punctuated with interleaving prose that attempts to give intuition for what the definitions, propositions, and examples are saying, and why one should care about those definitions, propositions, and examples. Intuition is a very important part of math; it is what allows mathematicians to elaborate definitions, discover proofs, and most importantly, to figure out what is important to study in the first place. However, intuition is no substitute for rigor. Intuition allows you to leap off cliffs; rigor is what allows you to build a bridge underneath you before you hit the ground.

In the foundations of math, we also have two more types of statement: axioms and undefined terms. Definitions and propositions should always “bottom out” at axioms and undefined terms. However, most mathematicians do not do this, instead leaving it to the reader to choose a suitable foundations of math in which to fully formalize their theories. Surprisingly, most interesting math can be fully formalized in many foundations, so this normally works out fine.

For us, our “reality” will be what happens on the computer. So we will try to “bottom out” on concepts in the computer.

Takeaways

Pure math consists of definitions, propositions, and examples.
These are specified in enough detail so that participants in a mathematical conversation could independently come up with equivalent elaborations.
If you feel you could not do this at any point, then you are in a perfectly normal situation and should not feel ashamed whatsoever. However, continuing on without first going back and understanding what you are confused about is a bad idea.

Finite Sets

We will now demonstrate the previous concepts by studying finite sets. We will not get to category theory today. Instead, we will revisit some things that should be familiar to you and treat them in the style that we will be using for the rest of the lectures.

Finite sets will be important for most of the applications of category theory that we will learn in the next lectures, and also most of the concepts of category theory are well-illustrated by finite sets. So a firm grasp of finite sets will be an immense aid in the coming weeks.

We start with a basic universe of discourse. It is traditional to be minimalistic with this universe of discourse, and say that everything is a set, or everything is a natural number. However, we choose to be untraditional.

A primitive thing is any possible value of a Julia variable.

1, :a, [1.0 0; 0 1] are all primitive things

A finite set is a list of primitive things. We typically write curly braces around this list.

\{1,2,3,4\}, \{\mathbf{a},\mathbf{b},4,2,2,6\} are both finite sets.

We might represent this in Julia with the following data structure.

abstract type 𝔽 end

struct Vec𝔽 <: 𝔽
  elems::Vector{Any}
end

Base.:(∈)(a, A::Vec𝔽) = a ∈ A.elems

Base.iterate(A::Vec𝔽) = iterate(A.elems)
Base.iterate(A::Vec𝔽, k) = iterate(A.elems, k)

A = Vec𝔽([:carrots, :peas])
B = Vec𝔽([3, 7, 8])

Note that this is not the only possible representation of a finite set. Definitions in math always can be translated into code in many ways; the choice of a particular way is a delicate balancing act between simplicity, performance, clarity, and completeness.

Another possible representation of finite sets is

struct Int𝔽 <: 𝔽
  n::Int
end

Base.:(∈)(a, A::Int𝔽) = 1 <= a <= A.n

Base.iterate(A::Int𝔽) = iterate(1:A.n)
Base.iterate(A::Int𝔽, k) = iterate(1:A.n, k)

This represents the finite set \{1,\ldots,n\}. Here we have traded performance over completeness; we can only represent some finite sets, but we represent those finite sets more efficiently.

A function f from a finite set A to a finite set B is something that associates a value f(a) in B to every thing a in A. A is called the domain of f, and B is called the codomain of f. We write f along with its domain and codomain as f \colon A \to B.

It is important to note that even if a is listed multiple times in A, f(a) only has one value.

We might represent a function between 𝔽s, also known as a morphism of finite sets, with the following data structure.

struct 𝔽Mor
  dom::𝔽
  codom::𝔽
  vals::Dict{Any,Any}
end

(f::𝔽Mor)(x) = f.vals[x]

f = 𝔽Mor(A, B, Dict(:carrots => 3, :peas => 3))

However, not all instances of this data structure represent functions. The following Julia function determines whether a morphism of finite sets is valid.

isvalid(f::𝔽Mor) =
  all(x ∈ keys(f.vals) && f(x) ∈ f.codom for x in f.dom)

[
  isvalid(f), 
  isvalid(𝔽Mor(A, B, Dict(:peas => 3))),
  isvalid(𝔽Mor(A, B, Dict(:peas => 5, :carrots => 3)))
]

3-element Vector{Bool}:
 1
 0
 0

Frequently we will write down Julia types representing mathematical concepts where not all values of that type are valid representations of that mathematical concept. This is unavoidable because Julia types do not have the specificity to narrow down the space of values enough. There are languages where the types can narrow down the space of values sufficiently, but none of those languages have well-maintained BLAS/LAPACK bindings.

For finite sets implemented by Int𝔽, we can give a more efficient encoding of morphism.

struct Int𝔽Mor
  dom::Int𝔽
  codom::Int𝔽
  vals::Vector{Int}
end

(f::Int𝔽Mor)(i) = f.vals[i]

with corresponding validation function

isvalid(f::Int𝔽Mor) = length(f.vals) == f.dom.n &&
  all(f(x) ∈ f.codom for x in f.dom)

isvalid(Int𝔽Mor(Int𝔽(3), Int𝔽(2), [1,2,2]))

true

From now on, we will work with only Vec𝔽s and 𝔽Mors, and leave the implementation of more efficient code to the reader.

Takeaways

A finite set is a list of things
A morphism of finite sets from A to B sends each unique element of A to an element of B
There are multiple ways of implementing representations of finite sets and morphisms between them on the computer