Machine Learning Engineering
CSI 5180 - Machine Learning for Bioinformatics
Important
The description of Assignment 2 is now published on the course website. You can access it through the following link:
- Assignment 2 (outline due: March 26, 2025)
Prepare
- TensorFlow Playground
- Dataset Options: Users can choose from four types of datasets: circular, XOR, Gaussian, and spiral.
- Feature Engineering: Enables the creation of new features to improve model performance.
- Model Architecture: Allows customization of neural network architecture, including varying the number of layers and neurons per layer.
- Hyperparameter Tuning: Provides options to adjust the learning rate, activation function, regularization type and rate, and problem type (classification or regression) to observe their effects on model training.
- Suggestion 1: For the Gaussian dataset, which is linearly separable, configure a network without hidden layers and a single output neuron using the sigmoid activation function. This setup effectively constructs a logistic regression model.
- Suggestion 2: The circular dataset is not linearly separable using only the original features \(x_1\) and \(x_2\). However, by creating new features, \(x_1^2\) and \(x_2^2\), the problem becomes linearly separable in the transformed feature space, because a circular boundary \(x_1^2 + x_2^2 = r^2\) is a linear equation in \(x_1^2\) and \(x_2^2\). A network with no hidden layers and a single output node is sufficient for this task.
- Consult Zou et al. (2019) and its Tutorial on Google Colab.
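Suggestions 1 and 2 can also be tried outside the Playground. The following is a minimal NumPy sketch, not the Playground's own implementation: the synthetic datasets, the helper names (`make_gaussian`, `make_circle`, `fit_logistic`, `accuracy`), and the hyperparameters are all assumptions chosen for illustration. It trains a network with no hidden layer and one sigmoid output neuron (that is, logistic regression) on a Gaussian-like dataset, then on a circular dataset with the raw features and with the squared features.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable


def make_gaussian(n=400):
    """Hypothetical stand-in for the Gaussian dataset: two well-separated blobs."""
    pos = rng.normal(loc=2.0, scale=0.5, size=(n // 2, 2))
    neg = rng.normal(loc=-2.0, scale=0.5, size=(n // 2, 2))
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(n // 2), np.zeros(n // 2)])
    return X, y


def make_circle(n=400):
    """Hypothetical stand-in for the circular dataset: inner disc (1), outer ring (0)."""
    r = np.concatenate([rng.uniform(0.0, 1.0, n // 2), rng.uniform(2.0, 3.0, n // 2)])
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    X = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
    y = np.concatenate([np.ones(n // 2), np.zeros(n // 2)])
    return X, y


def fit_logistic(X, y, lr=0.1, epochs=5000):
    """No hidden layer, one sigmoid output: batch gradient descent on log loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid output neuron
        err = p - y                             # gradient of log loss w.r.t. the logit
        w -= lr * (X.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b


def accuracy(X, y, w, b):
    """Fraction of points on the correct side of the linear boundary."""
    return float(np.mean(((X @ w + b) > 0) == (y == 1)))


# Suggestion 1: Gaussian data, original features x1 and x2.
Xg, yg = make_gaussian()
acc_gauss = accuracy(Xg, yg, *fit_logistic(Xg, yg))

# Suggestion 2: circular data, original features vs. squared features.
Xc, yc = make_circle()
acc_circle_raw = accuracy(Xc, yc, *fit_logistic(Xc, yc))
Xc_sq = Xc ** 2  # engineered features x1^2 and x2^2
acc_circle_sq = accuracy(Xc_sq, yc, *fit_logistic(Xc_sq, yc))

print("Gaussian, raw features:   ", acc_gauss)
print("Circle, raw features:     ", acc_circle_raw)
print("Circle, squared features: ", acc_circle_sq)
```

Comparing the two circle results shows the effect of feature engineering: the model is identical in both runs, and only the input representation changes.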