Deep Learning Training

CSI 5180 - Machine Learning for Bioinformatics

Important

Assignment 2 is due on March 26, 2025.

Prepare

  • TensorFlow Playground
    • Dataset Options: Users can choose from four types of datasets: circular, XOR, Gaussian, and spiral.
    • Feature Engineering: Enables the creation of new features to improve model performance.
    • Model Architecture: Allows customization of neural network architecture, including varying the number of layers and neurons per layer.
    • Hyperparameter Tuning: Provides options to adjust learning rate, activation functions, regularization techniques, and task specifications to observe their effects on model training.
    • Suggestion 1: For the Gaussian dataset, which is linearly separable, configure a network without hidden layers and a single output neuron using the sigmoid activation function. This setup effectively constructs a logistic regression model.
    • Suggestion 2: The circular dataset is not linearly separable using only the original features \(x_1\) and \(x_2\). However, by creating new features, \(x_1^2\) and \(x_2^2\), the problem becomes linearly separable in the transformed feature space. A network with no hidden layers and a single output node is sufficient for this task (both suggestions are sketched in code after this list).
  • Consult Zou et al. (2019) and its Tutorial on Google Colab.
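
To make the two suggestions above concrete outside the Playground, here is a minimal Keras sketch; scikit-learn's make_blobs and make_circles are used as stand-ins for the Playground's Gaussian and circular datasets, and all hyperparameters are illustrative.

```python
# A sketch of both suggestions outside the Playground. Assumes TensorFlow/Keras
# and scikit-learn are installed; make_blobs and make_circles stand in for the
# Playground's Gaussian and circular datasets.
import numpy as np
import tensorflow as tf
from sklearn.datasets import make_blobs, make_circles

def no_hidden_layer_net(n_features):
    # No hidden layers, one sigmoid output: exactly logistic regression.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="sgd", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Suggestion 1: the Gaussian dataset is linearly separable as-is.
X, y = make_blobs(n_samples=200, centers=2, random_state=0)
no_hidden_layer_net(2).fit(X, y, epochs=50, verbose=0)

# Suggestion 2: the circular dataset is not linearly separable in (x1, x2),
# but adding the engineered features x1^2 and x2^2 makes it so.
X, y = make_circles(n_samples=200, noise=0.05, factor=0.5, random_state=0)
X_aug = np.hstack([X, X**2])          # features: x1, x2, x1^2, x2^2
no_hidden_layer_net(4).fit(X_aug, y, epochs=50, verbose=0)
```

In the augmented feature space the decision boundary is a hyperplane, even though it projects back onto the original \(x_1\), \(x_2\) plane as a circle.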

Participate

Videos

3Blue1Brown

In my opinion, this is an excellent and informative video, and I highly recommend watching it. While it covers concepts we have already explored, it presents the material in a way that is difficult to replicate in a classroom setting.

  • Provides a clear explanation of the intuition behind the effectiveness of neural networks, detailing the hierarchy of concepts briefly mentioned in the last lecture.
  • Offers a compelling rationale for the necessity of a bias term.
  • Similarly elucidates the concept of activation functions and the importance of a squashing function.
  • The segment beginning at 13m 26s offers a visual explanation of the linear algebra involved: \(\sigma(W X^T + b)\).
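
To connect that segment to code, here is a minimal NumPy sketch of one dense layer computing \(\sigma(W X^T + b)\); the layer sizes and random inputs are made up for the example.

```python
import numpy as np

def sigmoid(z):
    # Squashing function: maps any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))   # 4 examples as rows, 3 input features
W = rng.normal(size=(2, 3))   # weights: 2 neurons, 3 inputs each
b = rng.normal(size=(2, 1))   # bias term, one per neuron

# One layer's activations: sigma(W X^T + b); each column is one example.
A = sigmoid(W @ X.T + b)
print(A.shape)                # (2, 4): 2 activations for each of 4 examples
```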

3Blue1Brown (Continued)

A series of animated videos providing the intuition behind the backpropagation algorithm.

Prerequisite: Gradient descent, how neural networks learn (20m 33s)

StatQuest

Prerequisites: The Chain Rule (18m 24s) & Gradient Descent, Step-by-Step (23m 54s)
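
The two prerequisites condense nicely into a few lines of code: the chain rule yields the derivative of the loss with respect to the parameter, and gradient descent repeatedly steps against it. A minimal sketch, with made-up data and an illustrative learning rate:

```python
# Gradient descent on a single parameter: fit the intercept b of y ≈ x + b
# by minimizing the sum of squared residuals L(b) = sum_i (y_i - (x_i + b))^2.
# Chain rule: dL/db = sum_i 2 * (y_i - (x_i + b)) * (-1)
#                   = sum_i -2 * (y_i - (x_i + b)).
x = [0.5, 2.3, 2.9]
y = [1.4, 1.9, 3.2]

b = 0.0                                # initial guess
learning_rate = 0.1
for step in range(100):
    grad = sum(-2 * (yi - (xi + b)) for xi, yi in zip(x, y))
    b -= learning_rate * grad          # step against the gradient
    if abs(grad) < 1e-6:               # stop when the slope is ~0
        break
print(b)   # converges to the least-squares intercept, mean(y - x)
```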

Herman Kamper

One of the most thorough series of videos on the backpropagation algorithm.
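
For those who prefer code alongside the videos, below is a minimal sketch of backpropagation on a one-hidden-layer network; the shapes, random data, and learning rate are arbitrary, and each gradient line is simply the chain rule applied layer by layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))            # 5 examples, 2 features
y = rng.integers(0, 2, size=(5, 1))    # binary targets

W1, b1 = rng.normal(size=(2, 3)), np.zeros((1, 3))   # hidden layer (3 units)
W2, b2 = rng.normal(size=(3, 1)), np.zeros((1, 1))   # output layer

lr = 0.5
for _ in range(1000):
    # Forward pass.
    H = sigmoid(X @ W1 + b1)           # hidden activations
    p = sigmoid(H @ W2 + b2)           # predicted probabilities

    # Backward pass (chain rule, layer by layer).
    # With cross-entropy loss and a sigmoid output, dL/d(logits) = p - y.
    d_out = (p - y) / len(X)
    dW2 = H.T @ d_out
    db2 = d_out.sum(axis=0, keepdims=True)
    d_hidden = (d_out @ W2.T) * H * (1 - H)   # sigmoid'(z) = h * (1 - h)
    dW1 = X.T @ d_hidden
    db1 = d_hidden.sum(axis=0, keepdims=True)

    # Gradient descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```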

References

Zou, James, Mikael Huss, Abubakar Abid, Pejman Mohammadi, Ali Torkamani, and Amalio Telenti. 2019. “A Primer on Deep Learning in Genomics.” Nature Genetics 51 (1): 12–18. https://doi.org/10.1038/s41588-018-0295-5.