This textbook on the mathematics of data has two intended audiences:
For students majoring in math (or other quantitative fields like physics, economics, engineering, etc.):
it is meant as an invitation to data science and AI
from a rigorous mathematical perspective.
For (mathematically inclined) students in data-science-related fields (at the undergraduate or graduate level):
it can serve as a mathematical companion to machine learning, AI, and statistics courses.
Content-wise, it is a second course in linear algebra, multivariable calculus, and probability theory
motivated by and illustrated on data science applications. As such, the reader is expected to be familiar with the basics
of those areas, as well as to have been exposed to proofs -- but no knowledge of data science is assumed.
Moreover, while the emphasis is on mathematical concepts, programming is used throughout.
Basic familiarity with Python will suffice.
The book provides an introduction to some specialized packages,
especially NumPy, NetworkX, and PyTorch.
Archive
This book is based on Jupyter notebooks that were developed for
MATH 535: MATHEMATICAL METHODS IN DATA SCIENCE, a one-semester advanced undergraduate and Master's level course
offered at UW-Madison.
Websites from previous semesters are below. Warning: They
are no longer maintained and may have broken links.
Links to specific chapters are below, together with some additional materials (assignments, Jupyter notebooks,
datasets, auto-quizzes, etc.). Most of these resources are also available on the GitHub page of the
book.
Exercises:
Assignments and practice exams for Spring 2024 follow.