Math for Data Science

Summary

Math for Data Science covers the portions of linear algebra, calculus, probability, and statistics that are prerequisite to data science, and applies these topics to two central problems in the field: principal component analysis (PCA) and neural network training. While PCA relies primarily on linear algebra, neural network training combines multiple mathematical tools.
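
To make the linear-algebra claim concrete, here is a minimal PCA sketch. It is not the book's own code, and the toy data are made up; it simply centers the data and diagonalizes the sample covariance matrix:

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy dataset: 200 correlated 2-D points (illustrative values only).
    X = rng.normal(size=(200, 2)) @ np.array([[2.0, 0.8], [0.8, 1.0]])

    Xc = X - X.mean(axis=0)               # center each feature
    C = Xc.T @ Xc / (len(Xc) - 1)         # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)  # eigendecomposition (ascending order)

    order = np.argsort(eigvals)[::-1]     # reorder by decreasing variance
    components = eigvecs[:, order]        # principal directions
    scores = Xc @ components              # data expressed in the new basis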

The highlight of the book is the machine learning chapter, where the results of the previous chapters are applied to neural network training and stochastic gradient descent. Also included in this last chapter are advanced topics such as accelerated gradient descent and logistic regression trainability.

Examples are supported by detailed figures and Python code; the accompanying Jupyter notebooks and CSV files are available on this website. More than 380 exercises and nine detailed appendices covering background material are provided to aid understanding.

A neural network is a function defined by parameters, or weights. Given a large dataset of inputs and their desired targets, the goal is to adjust these weights so that the network's outputs closely match those targets. This is achieved by minimizing the error between the network's outputs and the targets using gradient descent, which navigates the error landscape in weight space.
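
As a minimal sketch of this idea (not the book's code; the one-weight model, data, and learning rate are illustrative assumptions), gradient descent on a mean squared error repeatedly steps against the gradient:

    import numpy as np

    # Toy data for a one-weight model y_hat = w * x (values are made up).
    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([2.1, 3.9, 6.2, 7.8])

    w = 0.0    # initial weight
    lr = 0.01  # learning rate (step size)

    for step in range(200):
        error = w * x - y              # outputs minus targets
        grad = 2 * np.mean(error * x)  # gradient of the mean squared error
        w -= lr * grad                 # step downhill in weight space

    print(w)  # approaches the least-squares slope (about 1.99)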

Historically, training neural networks at scale was impractical due to the large number of weights involved. A breakthrough came with stochastic gradient descent (SGD), first introduced in the 1950s and widely applied to neural networks in the 1980s. SGD enables convergence to a minimum error by following approximations of the true gradient, even when those approximations are noisy.

While computing the full gradient requires summing over the entire dataset, SGD estimates the gradient from small random subsets of the data, known as mini-batches. This reduces the cost of each step while still converging, albeit typically requiring more update steps overall. Despite this trade-off, SGD has made large-scale neural network training feasible, paving the way for recent advances in deep learning.
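
A similarly hedged sketch of mini-batch SGD, reusing the same toy one-weight model with made-up data and an arbitrary batch size and learning rate:

    import numpy as np

    rng = np.random.default_rng(0)

    # Larger toy dataset for the same one-weight model y_hat = w * x.
    x = rng.uniform(0.0, 10.0, size=1000)
    y = 2.0 * x + rng.normal(0.0, 0.5, size=1000)

    w, lr, batch_size = 0.0, 0.001, 32  # arbitrary illustrative choices

    for epoch in range(20):
        order = rng.permutation(len(x))         # shuffle once per epoch
        for start in range(0, len(x), batch_size):
            idx = order[start:start + batch_size]
            error = w * x[idx] - y[idx]         # residuals on one mini-batch
            grad = 2 * np.mean(error * x[idx])  # noisy estimate of the full gradient
            w -= lr * grad

    print(w)  # settles near the true slope 2.0 despite the noisy steps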

Book Preview

The book preview below contains only the first section of each chapter.

To magnify the book page, click away from the book frame to remove its focus, then zoom into the web page using Ctrl + or touch gestures.

For presentation mode, click the >> at the top right corner of the book frame.