The Cartesian Cafe

The Cartesian Cafe is the podcast where an expert guest and Timothy Nguyen map out scientific and mathematical subjects in detail. This collaborative journey with other experts will have us writing down formulas, drawing pictures, and reasoning about them together on a whiteboard. If you’ve been longing for a deeper dive into the intricacies of scientific subjects, then this is the podcast for you. Topics covered include mathematics, physics, computer science, machine learning, and artificial intelligence. Content also viewable on YouTube: and Spotify. Timothy Nguyen is a mathematician and AI researcher working in industry. Homepage:, Twitter: @IAmTimNguyen Patreon:

Listen on:

  • Apple Podcasts
  • Podbean App


Wednesday Jan 04, 2023

Greg Yang is a mathematician and AI researcher at Microsoft Research who for the past several years has done incredibly original theoretical work in the understanding of large artificial neural networks. Greg received his bachelors in mathematics from Harvard University in 2018 and while there won the Hoopes prize for best undergraduate thesis. He also received an Honorable Mention for the Morgan Prize for Outstanding Research in Mathematics by an Undergraduate Student in 2018 and was an invited speaker at the International Congress of Chinese Mathematicians in 2019.
In this episode, we get a sample of Greg's work, which goes under the name "Tensor Programs" and currently spans five highly technical papers. The route chosen to compress Tensor Programs into the scope of a conversational video is to place its main concepts under the umbrella of one larger, central, and time-tested idea: that of taking a large N limit. This occurs most famously in the Law of Large Numbers and the Central Limit Theorem, which then play a fundamental role in the branch of mathematics known as Random Matrix Theory (RMT). We review this foundational material and then show how Tensor Programs (TP) generalizes this classical work, offering new proofs of RMT. We conclude with the applications of Tensor Programs to a (rare!) rigorous theory of neural networks.
Part I. Introduction
00:00:00 : Biography
00:02:45 : Harvard hiatus 1: Becoming a DJ
00:07:40 : I really want to make AGI happen (back in 2012)
00:09:09 : Impressions of Harvard math
00:17:33 : Harvard hiatus 2: Math autodidact
00:22:05 : Friendship with Shing-Tung Yau
00:24:06 : Landing a job at Microsoft Research: Two Fields Medalists are all you need
00:26:13 : Technical intro: The Big Picture
00:28:12 : Whiteboard outline
Part II. Classical Probability Theory
00:37:03 : Law of Large Numbers
00:45:23 : Tensor Programs Preview
00:47:26 : Central Limit Theorem
00:56:55 : Proof of CLT: Moment method
1:00:20 : Moment method explicit computations
Part III. Random Matrix Theory
1:12:46 : Setup
1:16:55 : Moment method for RMT
1:21:21 : Wigner semicircle law
Part IV. Tensor Programs
1:31:03 : Segue using RMT
1:44:22 : TP punchline for RMT
1:46:22 : The Master Theorem (the key result of TP)
1:55:04 : Corollary: Reproof of RMT results
1:56:52 : General definition of a tensor program
Part V. Neural Networks and Machine Learning
2:09:05 : Feed forward neural network (3 layers) example
2:19:16 : Neural network Gaussian Process
2:23:59 : Many distinct large N limits for neural networks
2:27:24 : abc parametrizations (Note: "a" is absorbed into "c" here): variance and learning rate scalings
2:36:54 : Geometry of space of abc parametrizations
2:39:41: Kernel regime
2:41:32 : Neural tangent kernel
2:43:35: (No) feature learning
2:48:42 : Maximal feature learning
2:52:33 : Current problems with deep learning
2:55:02 : Hyperparameter transfer (muP)
3:00:31 : Wrap up
Further Reading:
Tensor Programs I, II, III, IV, V by Greg Yang and coauthors.
Twitter: @iamtimnguyen

Tuesday Nov 22, 2022

Scott Aaronson is a professor of computer science at University of Texas at Austin and director of its Quantum Information Center. Previously he received his PhD at UC Berkeley and was a faculty member at MIT in Electrical Engineering and Computer Science from 2007-2016. Scott has won numerous prizes for his research on quantum computing and complexity theory, including the Alan T Waterman award in 2012 and the ACM Prize in Computing in 2020. In addition to being a world class scientist, Scott is famous for his highly informative and entertaining blog Schtetl Optimized, which has kept the scientific community up to date on quantum hype for nearly the past two decades.
In this episode, Scott Aaronson gives a crash course on quantum computing, diving deep into the details, offering insights, and clarifying misconceptions surrounding quantum hype.
Correction: 59:03: The matrix denoted as "Hadamard gate" is actually a 45 degree rotation matrix. The Hadamard gate differs from this matrix by a sign flip in the last column. See 1:11:00 for the Hadamard gate.
Part I. Introduction (Personal)
00:00: Biography
01:02: Shtetl Optimized and the ways of blogging
09:56: sabattical at OpenAI, AI safety, machine learning
10:54: "I study what we can't do with computers we don't have"
Part II. Introduction (Technical)
22:57: Overview
24:13: SMBC Cartoon: "The Talk". Summary of misconceptions of the field
33:09: How all quantum algorithms work: choreograph pattern of interference
34:38: Outline
Part III. Setup
36:10: Review of classical bits
40:46: Tensor product and computational basis
42:07: Entanglement
44:25: What is not spooky action at a distance
46:15: Definition of qubit
48:10: bra and ket notation
50:48: Superposition example
52:41: Measurement, Copenhagen interpretation
Part IV. Working with qubits
57:02: Unitary operators, quantum gates
1:03:34: Philosophical aside: How to "store" 2^1000 bits of information.
1:08:34: CNOT operation
1:09:45: quantum circuits
1:11:00: Hadamard gate
1:12:43: circuit notation, XOR notation
1:14:55: Subtlety on preparing quantum states
1:16:32: Building and decomposing general quantum circuits: Universality
1:21:30: Complexity of circuits vs algorithms
1:28:45: How quantum algorithms are physically implemented
1:31:55: Equivalence to quantum Turing Machine
Part V. Quantum Speedup
1:35:48: Query complexity (black box / oracle model)
1:39:03: Objection: how is quantum querying not cheating?
1:42:51: Defining a quantum black box
1:45:30: Efficient classical f yields efficient U_f
1:47:26: Toffoli gate
1:50:07: Garbage and quantum uncomputing
1:54:45: Implementing (-1)^f(x))
1:57:54: Deutsch-Jozsa algorithm: Where quantum beats classical
2:07:08: The point: constructive and destructive interference
Part VI. Complexity Classes
2:08:41: Recap. History of Simon's and Shor's Algorithm
2:14:42: BQP
2:18:18: EQP
2:20:50: P
2:22:28: NP
2:26:10: P vs NP and NP-completeness
2:33:48: P vs BQP
2:40:48: NP vs BQP
2:41:23: Where quantum computing explanations go off the rails
Part VII. Quantum Supremacy
2:43:46: Scalable quantum computing
2:47:43: Quantum supremacy
2:51:37: Boson sampling
2:52:03: What Google did and the difficulties with evaluating supremacy
3:04:22: Huge open question
Twitter: @IAmTimNguyen

Thursday Oct 13, 2022

Grant Sanderson is a mathematician who is the author of the YouTube channel “3Blue1Brown”, viewed by millions for its beautiful blend of visual animation and mathematical pedagogy. His channel covers a wide range of mathematical topics, which to name a few include calculus, quaternions, epidemic modeling, and artificial neural networks. Grant received his bachelor's degree in mathematics from Stanford University and has worked with a variety of mathematics educators and outlets, including Khan Academy, The Art of Problem Solving, MIT OpenCourseWare, Numberphile, and Quanta Magazine.
In this episode, we discuss the famous unsolvability of quintic polynomials: there exists no formula, consisting only of finitely many arithmetic operations and radicals, for expressing the roots of a general fifth degree polynomial in terms of the polynomial's coefficients. The standard proof that is taught in abstract algebra courses uses the machinery of Galois theory. Instead of following that route, Grant and I proceed in barebones style along (somewhat) historical lines by first solving quadratics, cubics, and quartics. Along the way, we present the insights obtained by Lagrange that motivate a very natural combinatorial question, which contains the germs of modern group theory and Galois theory and whose answer suggests that the quintic is unsolvable (later confirmed through the work of Abel and Galois). We end with some informal discussions about Abel's proof and the topological proof due to Vladimir Arnold.
Part I. Introduction
00:52: How did you get interested in math?
06:30: Future of math pedagogy and AI 
12:03: Overview. How Grant got interested in unsolvability of the quintic
15:26: Problem formulation
17:42: History of solving polynomial equations
19:50: Po-Shen Loh 
Part II. Working Up to the Quintic
28:06: Quadratics
34:38 : Cubics
37:20: Viete’s formulas
48:51: Math duels over solving cubics: del Ferro, Fiorre, Tartaglia, Cardano, Ferrari
53:24: Prose poetry of solving cubics
54:30: Cardano’s Formula derivation
1:03:22: Resolvent 
1:04:10: Why exactly 3 roots from Cardano’s formula?
Part III. Thinking More Systematically
1:12:25: Takeaways and Lagrange’s insight into why quintic might be unsolvable
1:17:20: Origins of group theory?
1:23:29: History’s First Whiff of Galois Theory
1:25:24: Fundamental Theorem of Symmetric Polynomials
1:30:18: Solving the quartic from the resolvent
1:40:08: Recap of overall logic
Part IV. Unsolvability of the Quintic
1:52:30: S_5 and A_5 group actions
2:01:18: Lagrange’s approach fails!
2:04:01: Abel’s proof
2:06:16: Arnold’s Topological Proof
2:18:22: Closing Remarks
Further Reading on Arnold's Topological Proof of Unsolvability of the Quintic:
L. Goldmakher.
B. Katz.
Twitter: @iamtimnguyen

Wednesday Sep 07, 2022

John Baez is a mathematical physicist, professor of mathematics at UC Riverside, a researcher at the Centre for Quantum Technologies in Singapore, and a researcher at the Topos Institute in Berkeley, CA. John has worked on an impressively wide range of topics, pure and applied, ranging from loop quantum gravity, applications of higher categories to physics, applied category theory, environmental issues and math related to engineering and biology, and most recently on applying network theory to scientific software.Additionally, John is a prolific writer and blogger. This first began with John’s column This Week's Finds in Mathematical Physics, which ran 300 issues between 1993 and 2010, which then continued in the form of his ongoing blog Azimuth. Last but not least, John is also a host and contributor of the popular blog The n-category Cafe.
In this episode, we dive into John Baez and John Huerta’s paper “The Algebra of Grand Unified Theories” which was awarded the Levi Conant Prize in 2013. The paper gives a crash course in the representation theory underlying the Standard Model of particle physics and its three most well known Grand Unified Theories (GUTs): the SU(5), SO(10) (aka Spin(10)), and Pati-Salam theories. The main result of Baez-Huerta is that the particle representations underlying the three GUTs can in fact be unified via a commutative diagram. We dive deep into the numerology of the standard model to see how the SU(5) theory naturally arises. We then make brief remarks about SO(10) and Pati-Salam theories in order to state the Baez-Huerta theorem about their organization into a commutative square: a unification among grand unifications!
1:29:01: The formula for hypercharge in the bottom right note should be Y = 2(Q-I_3) instead of Y = (Q-I_3)/2.
While we do provide a crash course on SU(2) and spin, some representation theory jargon is used at times in our discussion. Those unfamiliar should just forge ahead!
We work in Euclidean signature instead of Lorentzian signature. Other than keeping track of minus signs, no essential details are changed. 
Part I. Introduction
00:00: Introduction
05:50: Climate change
09:40: Crackpot index
14:50: Eric Weinstein, Brian Keating, Geometric Unity
18:13: Overview of “The Algebra of Grand Unified Theories” paper
25:40: Overview of Standard Model and GUTs
34:25: SU(2), spin, isospin of nucleons 40:22: SO(4), Spin(4), double cover
44:24: three kinds of spin
Part II. Zoology of Standard Model
49:35: electron and neutrino
58:40: quarks
1:04:51: the three generations of the Standard Model
1:08:25: isospin quantum numbers
1:17:11: U(1) representations (“charge”)
1:29:01: hypercharge
1:34:00: strong force and color
1:36:50: SU(3)
1:40:45: antiparticles
Part III. SU(5) numerology
1:41:16: 32 = 2^5 particles
1:45:05: Mapping SU(3) x SU(2) x U(1) to SU(5) and hypercharge matching
2:05:17: Exterior algebra of C^5 and more hypercharge matching
2:37:32: SU(5) rep extends Standard Model rep
Part IV. How the GUTs fit together
2:41:42: SO(10) rep: brief remarks
2:46:28: Pati-Salam rep: brief remarks
2:47:17: Commutative diagram: main result
2:49:12: What about the physics? Spontaneous symmetry breaking and the Higgs mechanism
Twitter: @iamtimnguyen

Monday Aug 22, 2022

Tai-Danae Bradley is a mathematician who received her Ph.D. in mathematics from the CUNY Graduate Center. She was formerly at Alphabet and is now at Sandbox AQ, a startup focused on combining machine learning and quantum physics. Tai-Danae is a visiting research professor of mathematics at The Master’s University and the executive director of the Math3ma Institute, where she hosts her popular blog on category theory. She is also a co-author of the textbook Topology: A Categorical Approach that presents basic topology from the modern perspective of category theory.
In this episode, we provide a compressed crash course in category theory. We provide definitions and plenty of basic examples for all the basic notions: objects, morphisms, categories, functors, natural transformations. We also discuss the first basic result in category theory which is the Yoneda Lemma. We conclude with a discussion of how Tai-Danae has used category-theoretic methods in her work on language modeling, in particular, in how the passing from syntax to semantics can be realized through category-theoretic notions.
Originally published on July 20, 2022 on YouTube:
00:00:00 : Introduction
00:03:07 : How did you get into category theory?
00:06:29 : Outline of podcast
00:09:21 : Motivating category theory
00:11:35 : Analogy: Object Oriented Programming
00:12:32 : Definition of category
00:18:50 : Example: Category of sets
00:20:17 : Example: Matrix category
00:25:45 : Example: Preordered set (poset) is a category
00:33:43 : Example: Category of finite-dimensional vector spaces
00:37:46 : Forgetful functor
00:39:15 : Fruity example of forgetful functor: Forget race, gender, we're all part of humanity!
00:40:06 : Definition of functor
00:42:01 : Example: API change between programming languages is a functor
00:44:23 : Example: Groups, group homomorphisms are categories and functors
00:47:33 : Resume definition of functor
00:49:14 : Example: Functor between poset categories = order-preserving function
00:52:28 : Hom Functors. Things are getting meta (no not the tech company)
00:57:27 : Category theory is beautiful because of its rigidity
01:00:54 : Contravariant functor
01:03:23 : Definition: Presheaf
01:04:04 : Why are things meta? Arrows, arrows between arrows, categories of categories, ad infinitum.
01:07:38 : Probing a space with maps (prelude to Yoneda Lemma)
01:12:10 : Algebraic topology motivated category theory
01:15:44 : Definition: Natural transformation
01:19:21 : Example: Indexing category
01:21:54 : Example: Change of currency as natural transformation
01:25:35 : Isomorphism and natural isomorphism
01:27:34 : Notion of isomorphism in different categories
01:30:00 : Yoneda Lemma
01:33:46 : Example for Yoneda Lemma: Identity functor and evaluation natural transformation
01:42:33 : Analogy between Yoneda Lemma and linear algebra
01:46:06 : Corollary of Yoneda Lemma: Isomorphism of objects = Isomorphism of hom functors
01:50:40 : Yoneda embedding is fully faithful. Reasoning about this.
01:55:15 : Language Category
02:03:10 : Tai-Danae's paper: "An enriched category theory of language: from syntax to semantics"
Further Reading:
Tai-Danae's Blog:
Tai-Danae Bradley. "What is applied category theory?"
Tai-Danae Bradley, John Terilla, Yiannis Vlassopoulos. "An enriched category theory of language: from syntax to semantics."

Monday Aug 22, 2022

John Urschel received his bachelors and masters in mathematics from Penn State and then went on to become a professional football player for the Baltimore Ravens in 2014. During his second season, Urschel began his graduate studies in mathematics at MIT alongside his professional football career. Urschel eventually decided to retire from pro football to pursue his real passion, the study of mathematics, and he completed his doctorate in 2021. Urschel is currently a scholar at the Institute for Advanced Study where he is actively engaged in research on graph theory, numerical analysis, and machine learning. In addition, Urschel is the author of Mind and Matter, a New York Times bestseller about his life as an athlete and mathematician, and has been named as one of Forbes 30 under 30 for being an outstanding young scientist.
In this episode, John and I discuss a hodgepodge of topics in spectral graph theory. We start off light and discuss the famous Braess's Paradox in which traffic congestion can be increased by opening a road. We then discuss the graph Laplacian to enable us to present Cheeger's Theorem, a beautiful result relating graph bottlenecks to graph eigenvalues. We then discuss various graph embedding and clustering results, and end with a discussion of the PageRank algorithm that powers Google search.
Originally published on June 9, 2022 on YouTube:
I. Introduction
00:00: Introduction
04:30: Being a professional mathematician and academia vs industry
09:41: John's taste in mathematics
13:00: Outline
17:23: Braess's Paradox: "Opening a highway can increase traffic congestion."
25:34: Prisoner's Dilemma. We need social forcing mechanisms to avoid undesirable outcomes (traffic jams).
II. Spectral Graph Theory Basics
31:20: What is a graph
36:33: Graph bottlenecks. Practical situations: Task assignment, the economy, organizational management.
42:44: Quantifying bottlenecks: Cheeger's constant
46:43: Cheeger's constant sample computations
52:07: NP Hardness
55:48: Graph Laplacian
1:00:27: Graph Laplacian: 1-dimensional example
III. Cheeger's Inequality and Harmonic Oscillators
1:07:35: Cheeger's Inequality: Statement
1:09:37: Cheeger's Inequality: A great example of beautiful mathematics
1:10:46: Cheeger's Inequality: Computationally tractable approximation of Cheeger's constant
1:19:16: Harmonic oscillators: Springs heuristic for lambda_2 and Cheeger's inequality
1:22:20: Interlude: Tutte's Spring Embedding Theorem (planar embeddings in terms of springs)
1:29:45: Interlude: Graph drawing using eigenfunction
IV. Graph bisection and clustering
1:38:26: Summary thus far and graph bisection
1:42:44: Graph bisection: Large eigenvalues for PCA vs low eigenvalues for spectral bisection
1:43:40: Graph bisection: 1-dimensional intuition
1:47:43: Spectral graph clustering (complementary to graph bisection)
V. Markov chains and PageRank
1:52:10: PageRank: Google's algorithm for ranking search results
1:53:44: PageRank: Markov chain (Markov matrix)
1:57:32: PageRank: Stationary distribution
2:00:20: Perron-Frobenius Theorem
2:06:10: Spectral gap: Analogy between bottlenecks for graphs and bottlenecks for Markov chain mixing
2:07:56: Conclusion: State of the field, Urschel's recent results
2:10:28: Joke: Two kinds of mathematicians
Further Reading:
A. Ng, M. Jordan, Y. Weiss. "On Spectral Clustering: Analysis and an algorithm"
D. Spielman. "Spectral and Algebraic Graph Theory"

Saturday Aug 20, 2022

Richard Easther is a scientist, teacher, and communicator. He has been a Professor of Physics at the University of Auckland for over the last 10 years and was previously a professor of physics at Yale University. As a scientist, Richard covers ground that crosses particle physics, cosmology, astrophysics and astronomy, and in particular, focuses on the physics of the very early universe and the ways in which the universe changes between the Big Bang and the present day.
In this episode, Richard and I discuss the details of cosmology at large, both technically and historically. We dive into Einstein's equations from general relativity and see what implications they have for an expanding universe alongside a discussion of the cast of characters involved in 20th century cosmology (Einstein, Hubble, Friedmann, Lemaitre, and others). We also discuss inflation, gravitational waves, the story behind Brian Keating's book Losing the Nobel Prize, and the current state of experiments and cosmology as a field.
Originally published on May 3, 2022 on YouTube:
00:00:00 : Introduction
00:02:42 : Astronomy must have been one of the earliest sciences
00:03:57 : Eric Weinstein and Geometric Unity
00:13:47 : Outline of podcast
00:15:10 : Brian Keating, Losing the Nobel Prize, Geometric Unity
00:16:38 : Big Bang and General Relativity
00:21:07 : Einstein's equations
00:26:27 : Einstein and Hilbert
00:27:47 : Schwarzschild solution (typo in video)
00:33:07 : Hubble
00:35:54 : One galaxy versus infinitely many
00:36:16 : Olbers' paradox
00:39:55 : Friedmann and FRLW metric
00:41:53 : Friedmann metric was audacious?
00:46:05 : Friedmann equation
00:48:36 : How to start a fight in physics: West coast vs East coast metric and sign conventions.
00:50:05 : Flat vs spherical vs hyperbolic space
00:51:40 : Stress energy tensor terms
00:54:15 : Conservation laws and stress energy tensor
00:58:28 : Acceleration of the universe
01:05:12 : Derivation of a(t) ~ t^2/3 from preceding computations
01:05:37 : a = 0 is the Big Bang. How seriously can we take this?
01:07:09 : Lemaitre
01:11:51 : Was Hubble's observation of an expanding universe in 1929 a fresh observation?
01:13:45 : Without Einstein, no General Relativity?
01:14:45 : Two questions: General Relativity vs Quantum Mechanics and how to understand time and universe's expansion velocity (which can exceed the speed of light!)
01:17:58 : How much of the universe is observable
01:24:54 : Planck length
01:26:33 : Physics down to the Big Bang singularity
01:28:07 : Density of photons vs matter
01:33:41 : Inflation and Alan Guth
01:36:49 : No magnetic monopoles?
01:38:30 : Constant density requires negative pressure
01:42:42 : Is negative pressure contrived?
01:49:29 : Marrying General Relativity and Quantum Mechanics
01:51:58 : Symmetry breaking
01:53:50 : How to corroborate inflation?
01:56:21 : Sabine Hossenfelder's criticisms
02:00:19 : Gravitational waves
02:01:31 : LIGO
02:04:13 : CMB (Cosmic Microwave Background)
02:11:27 : Relationship between detecting gravitational waves and inflation
02:16:37 : BICEP2
02:19:06 : Brian Keating's Losing the Nobel Prize and the problem of dust
02:24:40 : BICEP3
02:26:26 : Wrap up: current state of cosmology
Easther's blogpost on Eric Weinstein:
Vice article on Eric Weinstein and Geometric Unity:
Further learning:
Matts Roos. "Introduction to Cosmology"
Barbara Ryden. "Introduction to Cosmology"
Our Cosmic Mistake About Gravitational Waves:

Friday Aug 19, 2022

Po-Shen Loh is a professor at Carnegie Mellon University and a coach for the US Math Olympiad. He is also a social entrepreneur where he has his used his passion and expertise in mathematics in the service of education ( and epidemiology (
In this episode, we discuss the mathematics behind Loh's novel approach to contact tracing in the fight against COVID, which involves a beautiful blend of graph theory and computer science.
Originally published on March 3, 2022 on Youtube:
00:00:00 : Introduction
00:01:11 : About Po-Shen Loh
00:03:49 : NOVID app
00:04:47 : Graph theory and quarantining
00:08:39 : Graph adjacency definition for contact tracing
00:16:01 : Six degrees of separation away from anyone?
00:21:13 : Getting the game theory and incentives right
00:30:40 : Conventional approach to contact tracing
00:34:47 : Comparison with big tech
00:39:19 : Neighbor search complexity
00:45:15 : Watts-Strogatz small networks phenomenon
00:48:37 : Storing neighborhood information
00:57:00 : Random hashing to reduce computational burden
01:05:24 : Logarithmic probing of sparsity
01:09:56 : Two math PhDs struggle to do division
01:11:17 : Bitwise-or for union of bounded sets
01:16:21 : Step back and recap
01:26:15 : Tradeoff between number of hash bins and sparsity
01:29:12 : Conclusion
Further reading:
Po-Shen Loh. "Flipping the Perspective in Contact Tracing"


Wednesday Aug 17, 2022

Wednesday Aug 17, 2022

Hello everyone, this is Tim Nguyen and welcome to The Cartesian Cafe. On this podcast we embark on a collaborative journey with other experts, to discuss mathematical and scientific topics in faithful detail, which means writing down formulas, drawing pictures, and reasoning about them together on a whiteboard. If you’ve been longing for a deeper dive into the intricacies of scientific subjects, then this is the podcast for you. Welcome to The Cartesian Cafe.

Copyright 2022 All rights reserved.

Podcast Powered By Podbean

Version: 20240320