Deep Learning Training

CSI 5180 - Machine Learning for Bioinformatics

Important

Assignment 2 is due on March 26, 2025.

Prepare

  • TensorFlow Playground
    • Dataset Options: Users can choose from four types of datasets: circular, XOR, Gaussian, and spiral.
    • Feature Engineering: Enables the creation of new features to improve model performance.
    • Model Architecture: Allows customization of neural network architecture, including varying the number of layers and neurons per layer.
    • Hyperparameter Tuning: Provides options to adjust learning rate, activation functions, regularization techniques, and task specifications to observe their effects on model training.
    • Suggestion 1: For the Gaussian dataset, which is linearly separable, configure a network without hidden layers and a single output neuron using the sigmoid activation function. This setup effectively constructs a logistic regression model.
    • Suggestion 2: The circular dataset is not linearly separable using only the original features \(x_1\) and \(x_2\). However, by creating new features, \(x_1^2\) and \(x_2^2\), the problem becomes linearly separable in the transformed feature space. A network with no hidden layers and a single output node is sufficient for this task (both suggestions are sketched in code after this list).
  • Consult Zou et al. (2019) and its Tutorial on Google Colab.
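
To make the two suggestions above concrete outside the Playground, here is a minimal Keras sketch; scikit-learn's make_blobs and make_circles are used as stand-ins for the Playground's Gaussian and circular datasets, and all hyperparameters are illustrative.

```python
# A sketch of both suggestions outside the Playground. Assumes TensorFlow/Keras
# and scikit-learn are installed; make_blobs and make_circles stand in for the
# Playground's Gaussian and circular datasets.
import numpy as np
import tensorflow as tf
from sklearn.datasets import make_blobs, make_circles

def no_hidden_layer_net(n_features):
    # No hidden layers, one sigmoid output: exactly logistic regression.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="sgd", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Suggestion 1: the Gaussian dataset is linearly separable as-is.
X, y = make_blobs(n_samples=200, centers=2, random_state=0)
no_hidden_layer_net(2).fit(X, y, epochs=50, verbose=0)

# Suggestion 2: the circular dataset is not linearly separable in (x1, x2),
# but adding the engineered features x1^2 and x2^2 makes it so.
X, y = make_circles(n_samples=200, noise=0.05, factor=0.5, random_state=0)
X_aug = np.hstack([X, X**2])          # features: x1, x2, x1^2, x2^2
no_hidden_layer_net(4).fit(X_aug, y, epochs=50, verbose=0)
```

In the augmented feature space the decision boundary is a hyperplane, even though it projects back onto the original \(x_1\), \(x_2\) plane as a circle.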

Participate

Videos

3Blue1Brown

In my opinion, this is an excellent and informative video, and I highly recommend watching it. While it covers concepts we have already explored, it presents the material in a way that is difficult to replicate in a classroom setting.

  • Provides a clear explanation of the intuition behind the effectiveness of neural networks, detailing the hierarchy of concepts briefly mentioned in the last lecture.
  • Offers a compelling rationale for the necessity of a bias term.
  • Similarly elucidates the concept of activation functions and the importance of a squashing function.
  • The segment beginning at 13m 26s offers a visual explanation of the linear algebra involved: \(\sigma(W X^T + b)\).
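
To connect that segment to code, here is a minimal NumPy sketch of one dense layer computing \(\sigma(W X^T + b)\); the layer sizes and random inputs are made up for the example.

```python
import numpy as np

def sigmoid(z):
    # Squashing function: maps any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))   # 4 examples as rows, 3 input features
W = rng.normal(size=(2, 3))   # weights: 2 neurons, 3 inputs each
b = rng.normal(size=(2, 1))   # bias term, one per neuron

# One layer's activations: sigma(W X^T + b); each column is one example.
A = sigmoid(W @ X.T + b)
print(A.shape)                # (2, 4): 2 activations for each of 4 examples
```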

3Blue1Brown (Continued)

A series of animated videos providing the intuition behind the backpropagation algorithm.

Prerequisite: Gradient descent, how neural networks learn (20m 33s)

StatQuest

Prerequisites: The Chain Rule (18m 24s) & Gradient Descent, Step-by-Step (23m 54s)
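
The two prerequisites condense nicely into a few lines of code: the chain rule yields the derivative of the loss with respect to the parameter, and gradient descent repeatedly steps against it. A minimal sketch, with made-up data and an illustrative learning rate:

```python
# Gradient descent on a single parameter: fit the intercept b of y ≈ x + b
# by minimizing the sum of squared residuals L(b) = sum_i (y_i - (x_i + b))^2.
# Chain rule: dL/db = sum_i 2 * (y_i - (x_i + b)) * (-1)
#                   = sum_i -2 * (y_i - (x_i + b)).
x = [0.5, 2.3, 2.9]
y = [1.4, 1.9, 3.2]

b = 0.0                                # initial guess
learning_rate = 0.1
for step in range(100):
    grad = sum(-2 * (yi - (xi + b)) for xi, yi in zip(x, y))
    b -= learning_rate * grad          # step against the gradient
    if abs(grad) < 1e-6:               # stop when the slope is ~0
        break
print(b)   # converges to the least-squares intercept, mean(y - x)
```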

Herman Kamper

One of the most thorough series of videos on the backpropagation algorithm.
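
For those who prefer code alongside the videos, below is a minimal sketch of backpropagation on a one-hidden-layer network; the shapes, random data, and learning rate are arbitrary, and each gradient line is simply the chain rule applied layer by layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))            # 5 examples, 2 features
y = rng.integers(0, 2, size=(5, 1))    # binary targets

W1, b1 = rng.normal(size=(2, 3)), np.zeros((1, 3))   # hidden layer (3 units)
W2, b2 = rng.normal(size=(3, 1)), np.zeros((1, 1))   # output layer

lr = 0.5
for _ in range(1000):
    # Forward pass.
    H = sigmoid(X @ W1 + b1)           # hidden activations
    p = sigmoid(H @ W2 + b2)           # predicted probabilities

    # Backward pass (chain rule, layer by layer).
    # With cross-entropy loss and a sigmoid output, dL/d(logits) = p - y.
    d_out = (p - y) / len(X)
    dW2 = H.T @ d_out
    db2 = d_out.sum(axis=0, keepdims=True)
    d_hidden = (d_out @ W2.T) * H * (1 - H)   # sigmoid'(z) = h * (1 - h)
    dW1 = X.T @ d_hidden
    db1 = d_hidden.sum(axis=0, keepdims=True)

    # Gradient descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```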

References

Zou, James, Mikael Huss, Abubakar Abid, Pejman Mohammadi, Ali Torkamani, and Amalio Telenti. 2019. “A Primer on Deep Learning in Genomics.” Nature Genetics 51 (1): 12–18. https://doi.org/10.1038/s41588-018-0295-5.