Bias-Variance Tradeoff, Model Fitting

CSI 5180 - Machine Learning for Bioinformatics

Marcel Turcotte

Version: Mar 3, 2025 09:29

Preamble

Quote of the Day (1/2)

Quote of the Day (2/2)

Summary

In this lecture, we explore how model complexity influences bias, variance, and generalization by examining underfitting and overfitting through learning curves across various models, including linear, polynomial, tree-based, KNN, and deep networks.

Learning Outcomes

  • Grasp how model complexity affects bias, variance, and generalization.
  • Analyze learning curves to diagnose underfitting and overfitting.

Model Complexity

Rationale

Optimizing model performance critically depends on the careful selection and tuning of hyperparameters.

These hyperparameters play a pivotal role in regulating the complexity of machine learning models.

Definition

Model complexity refers to the capacity of a model to capture intricate patterns in the data.

It is determined by the number of parameters or the structure of the model.

Exploration

Code
import numpy as np

np.random.seed(42)

# Quadratic data with Gaussian noise: y = 0.5 x^2 - x + 2 + noise
X = 6 * np.random.rand(100, 1) - 3
y = 0.5 * X ** 2 - X + 2 + np.random.randn(100, 1)

import matplotlib.pyplot as plt

plt.figure(figsize=(6,4))

plt.plot(X, y, "b.")
plt.xlabel("$x$", fontsize=18)
plt.ylabel("$y$", rotation=0, fontsize=18)
plt.axis([-3, 3, 0, 10])
plt.grid(True)
plt.show()

Linear Regression

Code
from sklearn.linear_model import LinearRegression

lin_reg = LinearRegression()
lin_reg.fit(X, y)

X_new = np.array([[-3], [3]])
y_pred = lin_reg.predict(X_new)

plt.figure(figsize=(6,4))

plt.plot(X, y, "b.")
plt.plot(X_new, y_pred, "r-")
plt.xlabel("$x$", fontsize=18)
plt.ylabel("$y$", rotation=0, fontsize=18)
plt.axis([-3, 3, 0, 10])
plt.show()

A linear model inadequately represents this dataset.

Definition

Feature engineering is the process of creating, transforming, and selecting variables (attributes) from raw data to improve the performance of machine learning models.
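As a purely illustrative sketch (the feature names, expression level and gene length, and their distributions are assumptions for this example, not data used in the lecture), two engineered variables are derived below from raw measurements: a log transform of a skewed feature and a ratio that normalizes one feature by another.

Code
import numpy as np

rng = np.random.default_rng(0)
expression = rng.exponential(scale=10.0, size=(100, 1))  # hypothetical, heavily skewed raw measurement
gene_length = rng.uniform(200, 5000, size=(100, 1))      # hypothetical second raw measurement

engineered = np.hstack([
    np.log1p(expression),        # transformation: compress a heavy-tailed scale
    expression / gene_length,    # creation: a length-normalized expression feature
])
engineered.shape                 # (100, 2)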

Machine Learning Engineering

PolynomialFeatures

from sklearn.preprocessing import PolynomialFeatures

poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly_features.fit_transform(X)
X[0]
array([-0.75275929])
X_poly[0]
array([-0.75275929,  0.56664654])

Generate a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two dimensional and of the form \([a, b]\), the degree-2 polynomial features are \([1, a, b, a^2, ab, b^2]\).
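For instance, the ordering of the degree-2 expansion can be checked directly on a single two-dimensional sample (a small sketch; the values 2 and 3 are arbitrary):

from sklearn.preprocessing import PolynomialFeatures
import numpy as np

sample = np.array([[2.0, 3.0]])          # one sample of the form [a, b]
PolynomialFeatures(degree=2).fit_transform(sample)
array([[1., 2., 3., 4., 6., 9.]])        # [1, a, b, a^2, ab, b^2]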

PolynomialFeatures

Given two features \(a\) and \(b\), PolynomialFeatures with degree=3 would add \(a^2\), \(a^3\), \(b^2\), \(b^3\), as well as \(ab\), \(a^2b\), and \(ab^2\)!

Warning

PolynomialFeatures(degree=d) transforms an array of \(D\) features into one containing \(\frac{(D+d)!}{d!\,D!}\) features (including the bias term), so the number of features grows combinatorially with both \(D\) and the degree \(d\).
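The count can be verified with math.comb, which evaluates the same binomial coefficient (a small sketch; \(D = 10\) and \(d = 3\) are arbitrary choices):

from math import comb
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

D, d = 10, 3
comb(D + d, d)                                                    # (D+d)!/(d! D!) = 286
PolynomialFeatures(degree=d).fit_transform(np.random.rand(5, D)).shape
(5, 286)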

Polynomial Regression

Code
lin_reg = LinearRegression()
lin_reg = lin_reg.fit(X_poly, y)

X_new = np.linspace(-3, 3, 100).reshape(100, 1)
X_new_poly = poly_features.transform(X_new)
y_new = lin_reg.predict(X_new_poly)

plt.figure(figsize=(5, 3))
plt.plot(X, y, "b.")
plt.plot(X_new, y_new, "r-", linewidth=2, label="Predictions")
plt.xlabel("$x_1$")
plt.ylabel("$y$", rotation=0)
plt.legend(loc="upper left")
plt.axis([-3, 3, 0, 10])
plt.grid()
plt.show()

LinearRegression on PolynomialFeatures

lin_reg = LinearRegression()
lin_reg.fit(X_poly, y)

Polynomial Regression

The data was generated according to the following equation, with the inclusion of Gaussian noise.

\[ y = 0.5 x^2 - 1.0 x + 2.0 \]

Presented below is the learned model.

\[ \hat{y} = 0.56 x^2 + (-1.06) x + 1.78 \]

lin_reg.coef_, lin_reg.intercept_
(array([[-1.06633107,  0.56456263]]), array([1.78134581]))
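As a quick sanity check (a small sketch reusing the fitted poly_features and lin_reg above), the prediction at \(x = 1\) should agree with substituting \(x = 1\) into the learned quadratic:

lin_reg.predict(poly_features.transform(np.array([[1.0]])))  # ≈ 0.56 - 1.06 + 1.78 ≈ 1.28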

Model Complexity

Code
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

plt.figure(figsize=(5, 3))

for style, width, degree in (("r-+", 2, 1), ("b--", 2, 2), ("g-", 1, 300)):
    polybig_features = PolynomialFeatures(degree=degree, include_bias=False)
    std_scaler = StandardScaler()
    lin_reg = LinearRegression()
    polynomial_regression = make_pipeline(polybig_features, std_scaler, lin_reg)
    polynomial_regression.fit(X, y)
    y_newbig = polynomial_regression.predict(X_new)
    label = f"{degree} degree{'s' if degree > 1 else ''}"
    plt.plot(X_new, y_newbig, style, label=label, linewidth=width)

plt.plot(X, y, "b.", linewidth=3)
plt.legend(loc="upper left")
plt.xlabel("$x_1$")
plt.ylabel("$y$", rotation=0)
plt.axis([-3, 3, 0, 10])
plt.grid()
plt.show()

A low loss value on the training set does not necessarily indicate a “better” model.
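This can be quantified with a short sketch (reusing X, y, and the pipeline components above): the training RMSE typically keeps shrinking as the degree grows, even though the 300-degree fit is clearly unreasonable between data points.

Code
from sklearn.metrics import mean_squared_error

for degree in (1, 2, 300):
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        StandardScaler(),
        LinearRegression())
    model.fit(X, y)
    rmse = np.sqrt(mean_squared_error(y, model.predict(X)))  # training error only
    print(f"degree={degree:3d}  training RMSE={rmse:.3f}")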

Under- and Overfitting

  • Underfitting:
    • Your model is too simple (here, linear).
    • Uninformative features.
    • Poor performance on both training and test data.
  • Overfitting:
    • Your model is too complex (tall decision tree, deep and wide neural networks, etc.).
    • Too many features given the number of examples available.
    • Excellent performance on the training set, but poor performance on the test set.

Learning Curves

  • One way to assess our models is to visualize the learning curves:
    • A learning curve shows the performance of our model, here using RMSE, on both the training set and the test set.
    • Multiple measurements are obtained by repeatedly training the model on larger and larger subsets of the data.

Learning Curve – Underfitting

Code
from sklearn.model_selection import learning_curve

train_sizes, train_scores, valid_scores = learning_curve(
    LinearRegression(), X, y, train_sizes=np.linspace(0.01, 1.0, 40), cv=5,
    scoring="neg_root_mean_squared_error")

train_errors = -train_scores.mean(axis=1)
valid_errors = -valid_scores.mean(axis=1)

plt.figure(figsize=(6, 4))  # extra code – not needed, just formatting
plt.plot(train_sizes, train_errors, "r-+", linewidth=2, label="train")
plt.plot(train_sizes, valid_errors, "b-", linewidth=3, label="test")

# extra code – beautifies
plt.xlabel("Training set size")
plt.ylabel("RMSE")
plt.grid()
plt.legend(loc="upper right")
plt.axis([0, 80, 0, 2.5])
plt.show()

  • Polynomial with degree=1.

  • Poor performance on both training and test data.

Learning Curve – Overfitting

Code
from sklearn.pipeline import make_pipeline

polynomial_regression = make_pipeline(
    PolynomialFeatures(degree=14, include_bias=False),
    LinearRegression())

train_sizes, train_scores, valid_scores = learning_curve(
    polynomial_regression, X, y, train_sizes=np.linspace(0.01, 1.0, 40), cv=5,
    scoring="neg_root_mean_squared_error")

train_errors = -train_scores.mean(axis=1)
valid_errors = -valid_scores.mean(axis=1)

plt.figure(figsize=(6, 4))
plt.plot(train_sizes, train_errors, "r-+", linewidth=2, label="train")
plt.plot(train_sizes, valid_errors, "b-", linewidth=3, label="test")
plt.legend(loc="upper right")
plt.xlabel("Training set size")
plt.ylabel("RMSE")
plt.grid()
plt.axis([0, 80, 0, 2.5])
plt.show()

  • Polynomial with degree=14.

  • Excellent performance on the training set, but poor performance on the test set.

Overfitting - Deep Net - Loss

Overfitting - Deep Net - Accuracy
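These two slides presumably display training curves for a deep network. As a hedged sketch of how comparable curves can be produced within scikit-learn (the make_moons dataset, the (256, 256) architecture, and the 300 epochs are illustrative assumptions, not the configuration used in the slides), a deliberately oversized MLPClassifier is trained one epoch at a time while the loss and accuracy on the training and test sets are recorded:

Code
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import log_loss, accuracy_score

X_nn, y_nn = make_moons(n_samples=200, noise=0.30, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X_nn, y_nn, test_size=0.5, random_state=42)

# A network that is large relative to the amount of training data.
mlp = MLPClassifier(hidden_layer_sizes=(256, 256), max_iter=1,
                    warm_start=True, random_state=42)

train_loss, test_loss, train_acc, test_acc = [], [], [], []
for epoch in range(300):
    mlp.fit(X_tr, y_tr)  # warm_start=True: each call adds one epoch
    train_loss.append(log_loss(y_tr, mlp.predict_proba(X_tr)))
    test_loss.append(log_loss(y_te, mlp.predict_proba(X_te)))
    train_acc.append(accuracy_score(y_tr, mlp.predict(X_tr)))
    test_acc.append(accuracy_score(y_te, mlp.predict(X_te)))

plt.figure(figsize=(6, 4))
plt.plot(train_loss, label="train loss")
plt.plot(test_loss, label="test loss")
plt.xlabel("Epoch")
plt.ylabel("Log loss")
plt.legend()
plt.show()

plt.figure(figsize=(6, 4))
plt.plot(train_acc, label="train accuracy")
plt.plot(test_acc, label="test accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.show()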

Bias-Variance Tradeoff

Bias

  • Bias refers to the error introduced by approximating a real-world problem, which may be complex, using a simplified model.

  • It represents the difference between the average prediction of the model and the true outcome.

  • High bias can cause an algorithm to miss important patterns, leading to underfitting.

Bias

\[ \text{Bias}(\hat{f}) = \mathbb{E}[\hat{f}(x)] - f(x) \] where:

  • \(\hat{f}(x)\) is the prediction made by the model,
  • \(f(x)\) is the true function,
  • \(\mathbb{E}[\hat{f}(x)]\) is the expected prediction over different datasets.

Variance

  • Variance measures the model’s sensitivity to fluctuations in the training data.

  • High variance indicates that the model is capturing noise as if it were a true pattern, leading to overfitting.

  • It reflects how much the predictions of the model would vary if different training data were used.

Variance

\[ \text{Variance}(\hat{f}) = \mathbb{E}[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2] \]

where:

  • \(\hat{f}(x)\) is the prediction made by the model,
  • \(\mathbb{E}[\hat{f}(x)]\) is the expected prediction over different datasets.

Remarks

  • Statistical learning makes assumptions about the model, data distribution, and noise to analytically derive expected values.

  • In practical applications, empirical techniques like cross-validation and bootstrapping are employed to estimate bias and variance.
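Because the toy dataset above is synthetic (its true function \(f(x) = 0.5x^2 - x + 2\) is known), a bootstrap estimate of bias\(^2\) and variance at a single query point takes only a few lines; in this sketch the query point \(x_0 = 1\), the 200 resamples, and the degree-2 model are arbitrary illustrative choices.

Code
rng = np.random.default_rng(0)
x0 = np.array([[1.0]])               # arbitrary query point
f_x0 = 0.5 * 1.0**2 - 1.0 + 2.0      # true value at x0, known because the data are synthetic

preds = []
for _ in range(200):                 # 200 bootstrap resamples of the training set
    idx = rng.integers(0, len(X), len(X))
    model = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                          LinearRegression())
    model.fit(X[idx], y[idx])
    preds.append(model.predict(x0).item())

preds = np.array(preds)
print("bias^2   ≈", (preds.mean() - f_x0) ** 2)
print("variance ≈", preds.var())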

Bias-Variance Tradeoff

\[ \text{Error} = \text{Bias}^2 + \text{Variance} + \text{Irreducible Error} \]

  • Model selection aims to minimize bias, which arises from overly simplistic models, and variance, which results from overly complex models prone to overfitting.

  • Ideally, with unlimited data and a sufficiently flexible model, both bias and variance could be driven toward zero.

  • However, in practice, data is typically noisy, and some irreducible error persists due to unaccounted factors beyond the model’s scope.
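For reference, the decomposition follows from expanding the expected squared error at a fixed input \(x\), with \(y = f(x) + \varepsilon\), where \(\varepsilon\) is zero-mean noise of variance \(\sigma^2\), and the expectation is taken over training sets and noise:

\[ \mathbb{E}\big[(y - \hat{f}(x))^2\big] = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2} + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{Variance}} + \underbrace{\sigma^2}_{\text{Irreducible Error}} \]

The cross terms vanish because \(\varepsilon\) has mean zero and is independent of \(\hat{f}\).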


High Bias, Low Variance

Code
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error

def true_function(x):
    return np.sin(x)

def plot_fold_predictions(degree, X, y, X_grid, y_true_grid, n_splits=5, random_state=42):
    """
    For a given polynomial degree, perform KFold cross-validation,
    plot the individual fold predictions along with the average prediction 
    and the true function (with y-axis limited to [-2, 2]),
    and return predictions and errors.
    """
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    fold_predictions = []  # To store predictions on the evaluation grid for each fold
    fold_errors = []       # To store test errors for each fold

    for train_index, test_index in kf.split(X):
        poly = PolynomialFeatures(degree=degree)
        X_train_poly = poly.fit_transform(X[train_index])
        X_test_poly = poly.transform(X[test_index])
        X_grid_poly = poly.transform(X_grid)
        
        model = LinearRegression()
        model.fit(X_train_poly, y[train_index])
        
        # Predictions on the dense grid for bias-variance analysis
        y_pred_grid = model.predict(X_grid_poly)
        fold_predictions.append(y_pred_grid)
        
        # Test error on held-out data
        y_pred_test = model.predict(X_test_poly)
        fold_errors.append(mean_squared_error(y[test_index], y_pred_test))
    
    fold_predictions = np.array(fold_predictions)
    avg_prediction = np.mean(fold_predictions, axis=0)
    
    # Plot individual fold predictions with y-axis limited to [-2, 2]
    plt.figure(figsize=(8, 5))
    for i in range(n_splits):
        plt.plot(X_grid, fold_predictions[i], color='gray', alpha=0.5, 
                 label='Fold prediction' if i == 0 else "")
    plt.plot(X_grid, avg_prediction, color='red', linewidth=2, label='Average prediction')
    plt.plot(X_grid, y_true_grid, color='blue', linewidth=2, label='True function')
    plt.scatter(X, y, color='black', s=20, label='Data points')
    plt.ylim(-2, 2)
    plt.title(f'Polynomial Degree {degree}')
    plt.xlabel('x')
    plt.ylabel('f(x)')
    plt.legend()
    plt.show()
    
    return fold_predictions, avg_prediction, fold_errors

# --- Data Generation with Increased Noise and Reduced Sample Size ---
np.random.seed(0)
n_samples = 40  # Reduced sample size increases model sensitivity to training data
X = np.linspace(0, 2 * np.pi, n_samples).reshape(-1, 1)
noise_std = 0.25  # Increased noise level amplifies prediction variability
y = true_function(X).ravel() + np.random.normal(0, noise_std, size=n_samples)

# Create a dense evaluation grid and compute the true function values
X_grid = np.linspace(0, 2 * np.pi, 100).reshape(-1, 1)
y_true_grid = true_function(X_grid).ravel()

# --- Plot Individual Fold Predictions for Selected Degrees ---
_ = plot_fold_predictions(1, X, y, X_grid, y_true_grid, n_splits=5)

Low Bias, High Variance

Code
_ = plot_fold_predictions(15, X, y, X_grid, y_true_grid, n_splits=5)

Just Right

Code
_ = plot_fold_predictions(3, X, y, X_grid, y_true_grid, n_splits=5)

Bias, Variance, and CV Error

Code
# --- Compute Bias², Variance, and CV Error Across Degrees 1 to 9 ---
degrees = range(1, 10)
bias_list = []
variance_list = []
cv_error_list = []

for degree in degrees:
    kf = KFold(n_splits=5, shuffle=True, random_state=42)
    fold_predictions = []
    fold_errors = []
    
    for train_index, test_index in kf.split(X):
        poly = PolynomialFeatures(degree=degree)
        X_train_poly = poly.fit_transform(X[train_index])
        X_test_poly = poly.transform(X[test_index])
        X_grid_poly = poly.transform(X_grid)
        
        model = LinearRegression()
        model.fit(X_train_poly, y[train_index])
        
        y_pred_grid = model.predict(X_grid_poly)
        fold_predictions.append(y_pred_grid)
        
        y_pred_test = model.predict(X_test_poly)
        fold_errors.append(mean_squared_error(y[test_index], y_pred_test))
    
    fold_predictions = np.array(fold_predictions)
    mean_prediction = np.mean(fold_predictions, axis=0)
    
    # Bias²: Average squared difference between the average prediction and the true function
    bias_sq = np.mean((mean_prediction - y_true_grid)**2)
    # Variance: Average variance of the predictions across the evaluation grid
    variance = np.mean(np.var(fold_predictions, axis=0))
    # CV Error: Mean of the MSE on held-out test sets
    cv_error = np.mean(fold_errors)
    
    bias_list.append(bias_sq)
    variance_list.append(variance)
    cv_error_list.append(cv_error)

# --- Plot Bias², Variance, and CV Error vs. Polynomial Degree ---
plt.figure(figsize=(8, 5))
plt.plot(degrees, bias_list, marker='o', label='Bias²')
plt.plot(degrees, variance_list, marker='o', label='Variance')
plt.plot(degrees, cv_error_list, marker='o', label='CV Error (MSE)')
plt.title('Bias, Variance, and CV Error vs. Polynomial Degree')
plt.xlabel('Polynomial Degree')
plt.ylabel('Error')
plt.ylim(0, 1)
plt.legend()
plt.show()

Regression Tree

Code
from sklearn.tree import DecisionTreeRegressor

def plot_tree_fold_predictions(max_depth, X, y, X_grid, y_true_grid, n_splits=5, random_state=42):
    """
    For a given tree max_depth, perform KFold cross-validation with a DecisionTreeRegressor,
    plot the individual fold predictions along with the average prediction and the true function.
    The y-axis is limited to [-2, 2] for clarity.
    """
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    fold_predictions = []  # Store predictions on the evaluation grid for each fold
    fold_errors = []       # Store test errors for each fold

    for train_index, test_index in kf.split(X):
        X_train = X[train_index]
        X_test = X[test_index]
        
        model = DecisionTreeRegressor(max_depth=max_depth, random_state=random_state)
        model.fit(X_train, y[train_index])
        
        # Prediction on a dense evaluation grid
        y_pred_grid = model.predict(X_grid)
        fold_predictions.append(y_pred_grid)
        
        # Test error on held-out data
        y_pred_test = model.predict(X_test)
        fold_errors.append(mean_squared_error(y[test_index], y_pred_test))
    
    fold_predictions = np.array(fold_predictions)
    avg_prediction = np.mean(fold_predictions, axis=0)
    
    plt.figure(figsize=(8, 5))
    for i in range(n_splits):
        plt.plot(X_grid, fold_predictions[i], color='gray', alpha=0.5,
                 label='Fold prediction' if i == 0 else "")
    plt.plot(X_grid, avg_prediction, color='red', linewidth=2, label='Average prediction')
    plt.plot(X_grid, y_true_grid, color='blue', linewidth=2, label='True function')
    plt.scatter(X, y, color='black', s=20, label='Data points')
    plt.ylim(-2, 2)
    plt.title(f'Regression Tree (max_depth={max_depth})')
    plt.xlabel('x')
    plt.ylabel('f(x)')
    plt.legend()
    plt.show()
    
    return fold_predictions, avg_prediction, fold_errors

# --- Plot Individual Fold Predictions for Selected Tree Depths ---
_ = plot_tree_fold_predictions(1, X, y, X_grid, y_true_grid, n_splits=5)

Regression Tree

Code
_ = plot_tree_fold_predictions(10, X, y, X_grid, y_true_grid, n_splits=5)

Regression Tree

Code
_ = plot_tree_fold_predictions(3, X, y, X_grid, y_true_grid, n_splits=5)

Bias, Variance, and CV Error

Code
# --- Compute Bias², Variance, and CV Error vs. Tree Depth ---
max_depths = range(1, 8)
bias_list = []
variance_list = []
cv_error_list = []

for depth in max_depths:
    kf = KFold(n_splits=5, shuffle=True, random_state=42)
    fold_predictions = []
    fold_errors = []
    
    for train_index, test_index in kf.split(X):
        X_train = X[train_index]
        X_test = X[test_index]
        
        model = DecisionTreeRegressor(max_depth=depth, random_state=42)
        model.fit(X_train, y[train_index])
        
        y_pred_grid = model.predict(X_grid)
        fold_predictions.append(y_pred_grid)
        
        y_pred_test = model.predict(X_test)
        fold_errors.append(mean_squared_error(y[test_index], y_pred_test))
    
    fold_predictions = np.array(fold_predictions)
    mean_prediction = np.mean(fold_predictions, axis=0)
    
    # Bias²: Mean squared difference between the average prediction and the true function
    bias_sq = np.mean((mean_prediction - y_true_grid)**2)
    # Variance: Average variance of predictions across the evaluation grid
    variance = np.mean(np.var(fold_predictions, axis=0))
    # CV Error: Average test error over folds
    cv_error = np.mean(fold_errors)
    
    bias_list.append(bias_sq)
    variance_list.append(variance)
    cv_error_list.append(cv_error)

plt.figure(figsize=(8, 5))
plt.plot(max_depths, bias_list, marker='o', label='Bias²')
plt.plot(max_depths, variance_list, marker='o', label='Variance')
plt.plot(max_depths, cv_error_list, marker='o', label='CV Error (MSE)')
plt.title('Bias, Variance, and CV Error vs. Regression Tree Depth')
plt.xlabel('Max Depth')
plt.ylabel('Error')
# plt.ylim(0, 1)
plt.legend()
plt.show()

KNN Regression

Code
from sklearn.neighbors import KNeighborsRegressor

def plot_knn_fold_predictions(n_neighbors, X, y, X_grid, y_true_grid, n_splits=5, random_state=42):
    """
    For a given number of neighbors, perform KFold cross-validation using KNeighborsRegressor,
    plot the predictions from each fold along with the average prediction and the true function.
    The y-axis is limited to [-2, 2] for clarity.
    """
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    fold_predictions = []  # Store predictions on the evaluation grid for each fold
    fold_errors = []       # Store test errors for each fold

    for train_index, test_index in kf.split(X):
        X_train = X[train_index]
        X_test = X[test_index]
        
        model = KNeighborsRegressor(n_neighbors=n_neighbors)
        model.fit(X_train, y[train_index])
        
        # Prediction on a dense evaluation grid for bias-variance analysis
        y_pred_grid = model.predict(X_grid)
        fold_predictions.append(y_pred_grid)
        
        # Test error on held-out data
        y_pred_test = model.predict(X_test)
        fold_errors.append(mean_squared_error(y[test_index], y_pred_test))
    
    fold_predictions = np.array(fold_predictions)
    avg_prediction = np.mean(fold_predictions, axis=0)
    
    # Plot individual fold predictions
    plt.figure(figsize=(8, 5))
    for i in range(n_splits):
        plt.plot(X_grid, fold_predictions[i], color='gray', alpha=0.5,
                 label='Fold prediction' if i == 0 else "")
    plt.plot(X_grid, avg_prediction, color='red', linewidth=2, label='Average prediction')
    plt.plot(X_grid, y_true_grid, color='blue', linewidth=2, label='True function')
    plt.scatter(X, y, color='black', s=20, label='Data points')
    plt.ylim(-2, 2)
    plt.title(f'KNN Regression (n_neighbors = {n_neighbors})')
    plt.xlabel('x')
    plt.ylabel('f(x)')
    plt.legend()
    plt.show()
    
    return fold_predictions, avg_prediction, fold_errors

# --- Plot Individual Fold Predictions for Selected Values of k ---
_ = plot_knn_fold_predictions(1, X, y, X_grid, y_true_grid, n_splits=5)

KNN Regression

Code
_ = plot_knn_fold_predictions(10, X, y, X_grid, y_true_grid, n_splits=5)

KNN Regression

Code
_ = plot_knn_fold_predictions(4, X, y, X_grid, y_true_grid, n_splits=5)

Bias, Variance, and CV Error

Code
# --- Compute Bias², Variance, and CV Error vs. Number of Neighbors ---
neighbors_range = range(1, 21)  # Vary k from 1 to 20
bias_list = []
variance_list = []
cv_error_list = []

for k in neighbors_range:
    kf = KFold(n_splits=5, shuffle=True, random_state=42)
    fold_predictions = []
    fold_errors = []
    
    for train_index, test_index in kf.split(X):
        X_train = X[train_index]
        X_test = X[test_index]
        
        model = KNeighborsRegressor(n_neighbors=k)
        model.fit(X_train, y[train_index])
        
        y_pred_grid = model.predict(X_grid)
        fold_predictions.append(y_pred_grid)
        
        y_pred_test = model.predict(X_test)
        fold_errors.append(mean_squared_error(y[test_index], y_pred_test))
    
    fold_predictions = np.array(fold_predictions)
    mean_prediction = np.mean(fold_predictions, axis=0)
    
    # Bias²: Mean squared difference between the average prediction and the true function
    bias_sq = np.mean((mean_prediction - y_true_grid)**2)
    # Variance: Average variance of predictions across the evaluation grid
    variance = np.mean(np.var(fold_predictions, axis=0))
    # CV Error: Average MSE on the held-out test sets
    cv_error = np.mean(fold_errors)
    
    bias_list.append(bias_sq)
    variance_list.append(variance)
    cv_error_list.append(cv_error)

# --- Plot Bias², Variance, and CV Error vs. Number of Neighbors ---
plt.figure(figsize=(8, 5))
plt.plot(neighbors_range, bias_list, marker='o', label='Bias²')
plt.plot(neighbors_range, variance_list, marker='o', label='Variance')
plt.plot(neighbors_range, cv_error_list, marker='o', label='CV Error (MSE)')
plt.title('Bias, Variance, and CV Error vs. Number of Neighbors (KNN)')
plt.xlabel('Number of Neighbors')
plt.ylabel('Error')
# plt.ylim(0, 1)
plt.legend()
plt.show()

Prologue

Summary

  • Evaluated model complexity and its impact on performance.
  • Illustrated underfitting, overfitting, and the bias–variance tradeoff.
  • Demonstrated learning curves and cross-validation across diverse models (linear, polynomial, tree, KNN, deep nets).

Next lecture

  • Machine Learning Engineering


Marcel Turcotte

Marcel.Turcotte@uOttawa.ca

School of Electrical Engineering and Computer Science (EECS)

University of Ottawa