Flamethrower Core
Course Introduction
Welcome to the Course
Why Build Your Own Deep Learning Library?
Tools and Workflows for Deep Learning
Components of a Deep Learning Library
Why Do We Care So Much About Neural Networks?
The Universal Approximation Theorem for Neural Networks
The Curse of Dimensionality (22:54)
Automatic Differentiation
What Automatic Differentiation Isn't (25:12)
The Types of Automatic Differentiation Systems (20:25)
Computational Graphs, Compositions of Primitives, and Reverse Mode Differentiation
Forward Mode Differentiation and Dual Numbers (6:14)
Nodes: Building a Computational Graph (42:48)
Variables (27:34)
Tensors (39:26)
Creating Callable Primitives and Tracing a Computation (56:19)
Creating a Tensor Library (24:13)
An Introduction to Numpy Broadcasting (16:27)
Differentiating Primitives (25:43)
Validating Gradients via Gradient Checking (41:04)
Implementing a Topological Sort - Kahn's Algorithm (16:19)
Implementing a Topological Sort - Depth First Search (23:42)
Implementing a Topological Sort - Faster Depth First Search (10:31)
Understanding Backpropagation (22:37)
Implementing Backpropagation (37:31)
Neural Network Module
The History of Neural Networks
The Module Base Class (19:54)
Practicalities: Using Logging
Parameter Initialization Strategies - Xavier Initialization (19:13)
Parameter Initialization Strategies - Glorot Uniform Initialization (6:10)
Parameter Initialization Strategies - The Implementation (11:24)
The Linear Layer (11:41)
Regression and Classification (10:35)
Activation Functions (17:35)
Statistical Estimators, Underfitting, and Overfitting (23:28)
Regularization for Better Generalization (25:48)
Regularization - Label Smoothing (29:16)
Regularization - The Elastic Net (8:20)
Implementing Regularization (7:04)
A Drop of Dropout - Part 1 (4:04)
A Drop of Dropout - Part 2 (22:50)
Implementing Dropout (8:38)
A Batch of Batch Norm (18:08)
Implementing Batch Norm (6:28)
The Optimization Module
An Introduction to Neural Network Optimization (21:04)
The Optimizer Base Class (13:08)
Loss Functions - Introduction (7:37)
Loss Functions - Maximum Likelihood Estimation, KL Divergence, Wasserstein Metrics (14:09)
Loss Functions - Deriving Mean Squared Error (17:05)
Loss Functions - Deriving Cross Entropy Loss (13:24)
Loss Functions - Maximum A Posteriori (13:24)
Loss Functions - Maximum a Posteriori with a Gaussian Prior (7:38)
Loss Functions - Maximum a Posteriori with a Laplacian Prior (9:14)
Loss Functions - The Implementation (14:35)
Gradient Descent - The Theory (41:01)
Improving Gradient Descent - Momentum (13:19)
Improving Gradient Descent - Nesterov Momentum (5:58)
Gradient Descent - The Implementation (11:15)
Scheduling Learning Rates (33:34)
Hyperparameter Search (12:53)
Pulling it all Together
The Training Loop (11:45)
Projects
Building a Feedforward Network to Classify MNIST Digits (20:33)
Resources, References, and Course Credits
Resources and References
Course Credits