This textbook on the mathematics of data has two intended audiences:
For students majoring in math (or other quantitative fields like physics, economics, engineering, etc.):
it is meant as an invitation to data science and AI
from a rigorous mathematical perspective.
For (mathematically inclined) students in data-science-related fields (at the undergraduate or graduate level):
it can serve as a mathematical companion to machine learning, AI, and statistics courses.
Content-wise, it is a second course in linear algebra, multivariable calculus, and probability theory,
motivated by and illustrated on data science applications. As such, the reader is expected to be familiar with the basics
of those areas, as well as to have been exposed to proofs -- but no knowledge of data science is assumed.
Moreover, while the emphasis is on mathematical concepts, programming is used throughout.
Basic familiarity with Python will suffice.
The book provides an introduction to some specialized packages,
especially
NumPy,
NetworkX,
and PyTorch.
It is based on Jupyter notebooks that were developed for
MATH 535: Mathematical Methods in Data Science, a one-semester advanced undergraduate and Master's-level course
offered at UW-Madison.
A print version of the book will be published by Cambridge University Press.
Course Information (Spring 2025)
Course: MATH 535: Mathematical Methods in Data Science
Prerequisites: (MATH 320, 340, 341, 375 or COMP SCI/E C E/M E 532) and (MATH/STAT 309, 431, MATH 531, STAT 311 or E C E 331) and (MATH 322, 341, 375, 421, 467, or COMP SCI 577), graduate/professional standing, or member of Pre-Masters Mathematics (Visiting Intl) Prgrm
Links to specific chapters are below, together with some additional materials (assignments, Jupyter notebooks,
datasets, auto-quizzes, etc.). Most of these resources are also available on the GitHub page of the
book.
Python package: To run some of the code below, you will need
mmids.py.