Jupyter Notebooks and Google Colab

CSI 4106 - Fall 2025

Marcel Turcotte

Version: Sep 5, 2025 10:19

Preamble

Learning objectives

  • Write and execute a Jupyter Notebook.
  • Execute a Jupyter Notebook on Google Colab.

Requirements

Proficiency in Python is expected.

For those needing a refresher, the official tutorial on Python.org is a good place to start.

As you work through it, practice by creating a Jupyter Notebook that collects examples and notes from the tutorial.

Other resources include:

Jupyter Notebooks

A notebook is a shareable document that combines computer code, plain language descriptions, data, rich visualizations like 3D models, charts, graphs and figures, and interactive controls. A notebook, along with an editor (like JupyterLab), provides a fast interactive environment for prototyping and explaining code, exploring and visualizing data, and sharing ideas with others.

Quick Start

Running Jupyter on your computer

Assuming the notebook is in the current directory, execute the following command from the terminal.

jupyter notebook 01_ottawa_river_temperature.ipynb

Similarly, to create a new notebook from scratch,

jupyter notebook

Why?

  • Ease of Use: The interface is intuitive and conducive to exploratory analysis.

  • Visualization: The capability to embed rich, interactive visualizations directly within the notebook enhances its utility for data analysis and presentation.

  • Reproducibility: Jupyter Notebooks have become the de facto standard in many domains for demonstrating code functionality and ensuring reproducibility.

How?

Installing Jupyter (1/2)

These instructions use pip, the recommended installation tool for Python.

The initial step is to verify that you have a functioning Python installation with pip installed.

$ python --version
Python 3.10.14
$ pip --version
pip 24.2

C:> py --version
Python 3.10.14
C:> py -m pip --version
pip 24.2
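
The same check can be made from within Python, for instance in a notebook cell. The following is a minimal sketch using only the standard library; the helper name `check_environment` and the list of packages are illustrative assumptions, not part of the course material.

```python
import importlib.util
import sys


def check_environment(packages=("pip",)):
    """Return the Python version and whether each top-level package is importable."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    # find_spec returns None when a top-level module is not installed
    available = {
        name: importlib.util.find_spec(name) is not None for name in packages
    }
    return version, available


version, available = check_environment(("pip", "jupyterlab"))
print(f"Python {version}")
for name, found in available.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

Because the check runs inside the interpreter, it reports on the Python installation that actually executes the notebook, which may differ from the one found first on the terminal's search path.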

Installing Jupyter (2/2)

Installing JupyterLab with pip:

$ pip install jupyterlab

Once installed, run JupyterLab with:

$ jupyter lab

Sample Jupyter Notebooks

Missing libraries

Launching 03_missing_library in Colab.
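
A hosted environment such as Colab may not provide every library that a notebook imports. One possible way to handle this is sketched below, assuming that pip is available to the interpreter running the notebook. The helper names `pip_install` and `ensure_package` are hypothetical and are not taken from the sample notebook.

```python
import importlib
import subprocess
import sys


def pip_install(package):
    """Install a package with pip, using the interpreter that runs this code."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])


def ensure_package(module_name, pip_name=None, install=pip_install):
    """Import module_name, installing it first if the import fails."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # The name on the package index can differ from the import name
        install(pip_name or module_name)
        importlib.invalidate_caches()
        return importlib.import_module(module_name)


# A standard-library module is always present, so nothing is installed here
json_module = ensure_package("json")
print(json_module.__name__)
```

In a notebook cell, the `%pip install` magic command achieves a similar result with less code; the function form is mainly useful when the same notebook must run unattended in several environments.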

04_stock_price

Launching 04_stock_price in Colab.

05_central_limit

Launching 05_central_limit in Colab.

Lecture Notes

Each lecture is provided as a Jupyter Notebook.

import numpy as np
import matplotlib.pyplot as plt

# Sigmoid function
def sigmoid(t):
    return 1 / (1 + np.exp(-t))

# Generate input values
t = np.linspace(-6, 6, 400)

# Compute the sigmoid of each input value
y = sigmoid(t)

# Create a figure and plot the curve
fig, ax = plt.subplots()
ax.plot(t, y, color='black', linewidth=2)  # Keep the curve opaque

# Show grid lines
ax.grid(True)

# Make the figure background transparent
fig.patch.set_alpha(0)

plt.show()

Prologue

Summary

  • Introducing the tools, specifically Jupyter Notebooks and Google Colab.

Resources

References

Appendix: Version Control

Version Control (GitHub)

By default, Jupyter Notebooks store the outputs of code cells, including media objects.

Jupyter Notebooks are JSON documents, and images within them are stored as base64-encoded text, typically in PNG format.

This encoding can lead to several issues when using version control systems, such as GitHub.

  • Large File Sizes: Jupyter Notebooks can become quite large due to embedded images and outputs, leading to prolonged upload times and potential storage constraints.
  • Incompatibility with Text-Based Version Control: Git and GitHub are optimized for text-based files, and the inclusion of encoded binary data, such as images, complicates the process of tracking changes and resolving conflicts. Traditional diff and merge operations are not well-suited to handling such data.
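
To make the size issue concrete, the sketch below builds a minimal notebook as a Python dictionary and compares its serialized size with and without an embedded image. The layout follows the nbformat 4 structure. The image is a stand-in made of arbitrary bytes rather than a real PNG, and all variable names are illustrative.

```python
import base64
import json

# Stand-in for image data; a real plot would be the bytes of a PNG file
fake_image = bytes(range(256)) * 40

# Base64 represents every 3 bytes with 4 ASCII characters
encoded = base64.b64encode(fake_image).decode("ascii")

# Minimal notebook with a single code cell and one image output
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["plt.show()"],
            "outputs": [
                {
                    "output_type": "display_data",
                    "data": {"image/png": encoded},
                    "metadata": {},
                }
            ],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

with_outputs = len(json.dumps(notebook))

# Remove the stored output, as clearing the cell would do
notebook["cells"][0]["outputs"] = []
notebook["cells"][0]["execution_count"] = None
without_outputs = len(json.dumps(notebook))

print(f"raw image bytes:      {len(fake_image)}")
print(f"base64 characters:    {len(encoded)}")
print(f"notebook with output: {with_outputs} characters")
print(f"notebook cleared:     {without_outputs} characters")
```

The encoded image is about one third larger than the raw bytes, and it appears in the file as a single long string, so regenerating a plot changes that entire string in a text diff.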

Version Control (GitHub) - solutions

  1. In JupyterLab or Notebook, Edit → Clear Outputs of All Cells, then save.
  2. On the command line, use jupyter nbconvert --clear-output:

jupyter nbconvert --clear-output --inplace 04_stock_price.ipynb

or

jupyter nbconvert 04_stock_price.ipynb --to notebook --ClearOutputPreprocessor.enabled=True --output 04_stock_price_clear

  3. Use nbdime, a tool specialized for diffing and merging Jupyter Notebooks.
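
Clearing outputs amounts to emptying the outputs list and resetting the execution count of every code cell. The sketch below illustrates the idea with the standard library only. It is not how nbconvert is implemented, and the function names `clear_outputs` and `clear_notebook_file` are hypothetical.

```python
import json
from pathlib import Path


def clear_outputs(notebook):
    """Remove outputs and execution counts from every code cell, in place."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_notebook_file(source, destination):
    """Read a notebook file, clear its outputs, and write the result."""
    notebook = json.loads(Path(source).read_text(encoding="utf-8"))
    clear_outputs(notebook)
    text = json.dumps(notebook, indent=1) + "\n"
    Path(destination).write_text(text, encoding="utf-8")


# Usage on an in-memory notebook with one executed code cell
example = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "data": {"text/plain": "2"},
                    "metadata": {},
                    "execution_count": 3,
                }
            ],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}
clear_outputs(example)
print(example["cells"][1]["outputs"])
```

For coursework, the nbconvert command remains the safer choice, since it validates the notebook format; the sketch only shows what the operation does to the underlying JSON.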

Appendix: environment management

Environment management

Important

Do not attempt to install these tools unless you are confident in your technical skills. An incorrect installation could waste significant time or even render your environment unusable. There is nothing wrong with using pip or Google Colab for your coursework. You can develop these installation skills later without impacting your grades.

Package management

  • Managing package dependencies can be complex.
    • A package manager addresses these challenges.
  • Different projects may require different versions of the same libraries.
    • Package management tools, such as conda, facilitate the creation of virtual environments tailored to specific projects.
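
The idea of an isolated, per-project environment can be illustrated without conda. The sketch below uses the venv module from the Python standard library as a stand-in, which is a different tool from conda but follows the same principle. The function names and the directory name `project_env` are illustrative assumptions.

```python
import sys
import tempfile
import venv
from pathlib import Path


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv-style environment."""
    return sys.prefix != sys.base_prefix


def create_environment(path):
    """Create a bare virtual environment (without pip, to keep the sketch fast)."""
    venv.EnvBuilder(with_pip=False).create(path)
    # Every venv-style environment is described by a pyvenv.cfg file
    return Path(path) / "pyvenv.cfg"


with tempfile.TemporaryDirectory() as workdir:
    config = create_environment(Path(workdir) / "project_env")
    print(config.read_text(encoding="utf-8"))

print("inside a virtual environment:", in_virtual_environment())
```

Each environment has its own directory for installed packages, so two projects can depend on different versions of the same library without interfering with each other. Note that the `in_virtual_environment` check applies to venv-style environments and is not a reliable test for conda environments.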

Anaconda

Anaconda is a comprehensive package management platform for Python and R. It utilizes Conda to manage packages, dependencies, and environments.

  • Anaconda is advantageous as it comes pre-installed with over 250 popular packages, providing a robust starting point for users.

  • However, this extensive distribution results in a large file size, which can be a drawback.

  • Additionally, since Anaconda relies on conda, it also inherits the limitations and issues associated with conda (see subsequent slides).

Miniconda

Miniconda is a minimal version of Anaconda that includes only conda, Python, their dependencies, and a small selection of essential packages.

Conda

Conda is an open-source package and environment management system for Python and R. It facilitates the installation and management of software packages and the creation of isolated virtual environments.

  • Dependency conflicts arising from complex package interdependencies can force the user to reinstall Anaconda/Conda.

  • It is plagued by large storage requirements and performance issues during package resolution.

Mamba

Mamba is a reimplementation of the conda package manager in C++.

  • It is significantly faster than conda.
  • It consumes fewer computational resources.
  • It provides clearer and more informative error messages.
  • It is fully compatible with conda, making it a viable replacement.

Micromamba is a fully statically-linked, self-contained executable. Its empty base environment ensures that the base is never corrupted, eliminating the need for reinstallation.

Further your education

Marcel Turcotte

Marcel.Turcotte@uOttawa.ca

School of Electrical Engineering and Computer Science (EECS)

University of Ottawa