Optimization and Gradient Descent

Training a machine learning model means choosing parameters that make a loss function as small as possible. For a handful of models, such as ordinary least squares, there is a closed-form answer (the normal equations). For almost everything else, from logistic regression to deep neural networks, either no closed form exists or it is too expensive to compute, so we minimize the loss iteratively. Gradient descent is the workhorse for this, and it is one of the most common conceptual topics in quant interviews. This lesson gives you the working intuition, the exact update rule, hand-computed examples, and the pitfalls interviewers like to probe.
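To make the contrast between a closed-form answer and iterative minimization concrete, here is a minimal sketch in plain Python. It assumes a one-parameter model y = w * x fitted by least squares. The toy data, the function names, the learning rate, and the step count are all illustrative choices, not values taken from the lesson.

```python
# Toy data, made up for illustration: y is roughly 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]


def closed_form_slope(xs, ys):
    """Normal-equation solution for the one-parameter model y = w * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def gradient_descent_slope(xs, ys, learning_rate=0.01, steps=500):
    """Minimize the mean squared error over w by repeated gradient steps."""
    n = len(xs)
    w = 0.0  # arbitrary starting point (an assumption of this sketch)
    for _ in range(steps):
        # d/dw of (1/n) * sum((w*x - y)^2) is (2/n) * sum((w*x - y) * x)
        grad = (2.0 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))
        w -= learning_rate * grad  # step against the gradient
    return w


w_exact = closed_form_slope(xs, ys)
w_gd = gradient_descent_slope(xs, ys)
print(w_exact, w_gd)
```

On this small problem both routes land on essentially the same slope. The iterative route matters when the closed form is unavailable or too costly, since each step needs only the gradient of the loss at the current parameters.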