Foundations of AI

CSI 4106 - Fall 2024

Marcel Turcotte

Version: Sep 9, 2024 08:36

Preamble

Quote of the day

We are assembling a lean, cracked team of the world’s best engineers and researchers dedicated to focusing on SSI (safe superintelligence) and nothing else.

Understanding the history of artificial intelligence is crucial, especially now as we find ourselves at the peak of speculative enthusiasm, with widespread claims that the era of general artificial intelligence is imminent.

Learning objectives

  • Recognize the contributions of other disciplines to AI.
  • Situate current AI within its historical context.
  • Introduce some of the tools, namely Jupyter Notebooks.

Schools (from the first lecture)

  • Symbolic AI (includes approaches based on logic)
  • Connectionists (mostly neural networks)

Foundations of Artificial Intelligence

Philosophy

Aristotle (384-322 BC) laid several foundational concepts for AI, including an informal system of syllogisms that facilitates proper reasoning by mechanically deriving conclusions from given premises.
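A classic syllogism illustrates the idea: all men are mortal; Socrates is a man; therefore, Socrates is mortal.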

Philosophy (continued)

Utilitarianism is an ethical theory that emphasizes the greatest good for the greatest number.

  1. Utilitarianism offers AI a decision-making framework by prioritizing actions that enhance collective well-being.
  2. It guides the ethical design of AI to ensure technology promotes societal welfare.
  3. Utilitarianism directs efficient resource distribution in AI, targeting maximal positive impact, especially in sectors like healthcare.
  4. It shapes policy and regulation to maximize societal benefits and minimize AI-related harms.
  5. Utilitarian principles assist in balancing AI’s benefits against risks for a net positive outcome.

Mathematics – formal logic

  • George Boole (1815–1864) is credited with the mathematization of logic through the development of propositional logic, also referred to as Boolean logic.
  • Gottlob Frege (1848–1925) extended Boole’s logical framework by incorporating objects and relations, thereby developing what is now known as first-order logic.
  • The contributions of Kurt Gödel (1906–1978), Alonzo Church (1903–1995), and Alan Turing (1912–1954), among others, have been instrumental in shaping the modern concept of computation.

Mathematics – probability

  • Gerolamo Cardano (1501–1576) initially conceptualized probability through gambling outcomes.
  • Blaise Pascal (1623–1662) outlined methods in 1654 for calculating predictions and average payoffs in unfinished gambling games in correspondence with Pierre Fermat (1601–1665).
  • Thomas Bayes (1702–1761) introduced a method for revising probabilities in light of new evidence, known as Bayes’ rule (shown below), which is vital for AI applications.
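In modern notation, Bayes’ rule gives the probability of a hypothesis \(H\) after observing evidence \(E\):

\[ P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)} \]

where \(P(H)\) is the prior, \(P(E \mid H)\) the likelihood, and \(P(H \mid E)\) the posterior.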

Mathematics – algorithms

  • Complex algorithms have their origins with Euclid around 300 BC, while the term “algorithm” itself is derived from the work of Muhammad ibn Musa al-Khwarizmi in the 9th century.

  • The Church-Turing thesis posits that any computation that can be performed by a mechanical process can be computed by a Turing machine, essentially equating the concept of algorithmic computation with the capabilities of Turing machines (Church 1936; Turing 1936).

  • The concept of NP-completeness, introduced by Cook and further developed by Karp, establishes a framework for evaluating the tractability of computational problems (Cook 1971; Karp 1972).

Neuroscience

Today, it is universally acknowledged that cognitive functions emerge from the electrochemical activity of the brain’s structures, illustrating how assemblies of simple cells can give rise to thought, action, and consciousness.

Neuroscience (continued)

  • “Of all the animals, man has the largest brain in proportion to his size.” Aristotle, 335 BC.
  • Paul Broca’s research in 1861 marked the beginning of understanding the brain’s functional organization, notably identifying the left hemisphere’s role in speech production.

Neuroscience (continued)

Large-scale collaborative studies have provided us with extensive data encompassing the anatomy, cell types, connectivity, and gene expression profiles of the brain (Maroso 2023; Conroy 2023).

Neuroscience – neuron

Computers vs human brain

                    Supercomputer              Personal Computer        Human Brain
Processing units    \(10^6\) CPU+GPU cores     8 CPU cores              \(10^6\) columns
                    \(10^{15}\) transistors    \(10^{10}\) transistors  \(10^{11}\) neurons
                                                                        \(10^{14}\) synapses
Cycle time          \(10^{-9}\) sec            \(10^{-9}\) sec          \(10^{-3}\) sec
Operations/sec      \(10^{18}\)                \(10^{10}\)              \(10^{17}\)

(Adapted from Russell and Norvig 2020.)

Computers vs human brain (contd)

  • “By the end of 2024, we’re aiming to continue to grow our infrastructure build-out that will include 350,000 NVIDIA H100 GPUs as part of a portfolio that will feature compute power equivalent to nearly 600,000 H100s.”
    • Building Meta’s GenAI Infrastructure, March 12, 2024.
    • Each H100 has 80 billion (\(8 \times 10^{10}\)) transistors.
    • 600,000 H100s implies a total of \(4.8 \times 10^{16}\) transistors.
    • Each chip carries a price tag of $40,000 USD!
    • $24,000,000,000 (24 billion) USD infrastructure.
      • Similar to Iceland’s Gross Domestic Product (GDP).

Computers vs human brain (contd)

  • “By combining this data, de Vries calculates that by 2027 the AI sector could consume between 85 to 134 terawatt hours each year. That’s about the same as the annual energy demand of de Vries’ home country, the Netherlands.”

Psychology

If the organism carries a “small-scale model” of external reality and of its own possible actions within its head, it is able to try out various alternatives, conclude which is the best of them, react to future situations before they arise, utilize the knowledge of past events in dealing with the present and future, and in every way to react in a much fuller, safer, and more competent manner to the emergencies which face it. (Craik 1943)

  • Cognitive psychology conceptualizes the brain as an information-processing device.

  • Knowledge-based agents are conceptualized as receiving inputs (percepts) from their environment, maintaining an internal state, and producing actions (outputs), as sketched below.
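A minimal Python sketch of this percept–state–action loop (the environment, state representation, and decision rule are hypothetical, for illustration only):

class SimpleAgent:
    """A knowledge-based agent: percepts in, actions out, with internal state."""

    def __init__(self):
        self.state = {}  # the agent's internal model of its world

    def step(self, percept):
        self.state.update(percept)            # fold the new percept into the state
        if self.state.get("obstacle_ahead"):  # hypothetical decision rule
            return "turn"
        return "move_forward"

agent = SimpleAgent()
print(agent.step({"obstacle_ahead": False}))  # move_forward
print(agent.step({"obstacle_ahead": True}))   # turn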

Cognitive science

In the same year that the term “artificial intelligence” was introduced, cognitive science emerged as a discipline.

1956 MIT workshop:

Three foundational papers demonstrated how computer models can be applied to the psychology of memory (Miller 1956), language (Chomsky 1956), and logical reasoning (Newell and Simon 1956).

Artificial Intelligence: A Timeline

1943–1974

1950 – Turing test

The Turing Test is a measure of a machine’s ability to exhibit intelligent behavior indistinguishable from that of a human.

If a human evaluator cannot reliably distinguish between a machine and a human based solely on their responses to questions, the machine is said to have passed the test.

1950 – Turing test (in 2024)

It’s likely that the Turing Test will become yet another casualty of our shifting conceptions of intelligence. In 1950, Turing intuited that the ability for human-like conversation should be firm evidence of “thinking,” and all that goes with it. That intuition is still strong today. But perhaps what we have learned from ELIZA and Eugene Goostman, and what we may still learn from ChatGPT and its ilk, is that the ability to sound fluent in natural language, like playing chess, is not conclusive proof of general intelligence.

1943 – First artificial neural network

Warren S. McCulloch & Walter Pitts 1943

  • Propositional Logic and Neural Events: The “all-or-none” nature of nervous activity allows neural events and their relationships to be treated using propositional logic.
  • Implications for Psychology and Neurophysiology: The theory provides a rigorous framework for understanding mental activities in terms of neurophysiology, offering insights into the causal relationships and the construction of hypothetical neural nets.
  • Learning: “we regard \((\ldots)\) learning as an enduring change which can survive sleep, anaesthesia, convulsions and coma” (a threshold unit in this spirit is sketched below).
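To make the “all-or-none” idea concrete, here is a minimal sketch of a McCulloch–Pitts-style threshold unit in Python (the weights and thresholds are chosen by hand for illustration; the original paper uses a different formal notation):

def mp_neuron(inputs, weights, threshold):
    """All-or-none unit: fires (1) iff the weighted sum reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# With suitable thresholds, such units compute logical connectives.
print(mp_neuron([1, 1], [1, 1], threshold=2))  # AND of two active inputs -> 1
print(mp_neuron([1, 0], [1, 1], threshold=2))  # AND with one inactive input -> 0
print(mp_neuron([1, 0], [1, 1], threshold=1))  # OR: one active input suffices -> 1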

1949 – First artificial neural network

In 1949, Donald Hebb introduced a straightforward updating rule for adjusting the connection strengths between neurons.

Hebbian learning is a learning mechanism in which the synaptic strength between two neurons is increased if they are activated simultaneously. This principle is often summarized as “cells that fire together, wire together,” and it forms the basis for understanding how neural connections are reinforced through experience.
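A minimal numerical sketch of a Hebbian update in Python (the learning rate, activities, and vector sizes are illustrative; Hebb stated the principle qualitatively):

import numpy as np

eta = 0.1                      # learning rate (illustrative)
x = np.array([1.0, 0.0, 1.0])  # pre-synaptic activities
y = 1.0                        # post-synaptic activity
w = np.zeros(3)                # initial connection strengths

# "Cells that fire together, wire together": strengthen co-active connections.
w += eta * y * x
print(w)  # [0.1 0.  0.1] -- only connections from active inputs grew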

1950 – First artificial neural network

  • In 1950, while an undergraduate at Harvard, Marvin Minsky, in collaboration with Dean Edmonds, constructed the first artificial neural network computer, which simulated the functionality of 40 neurons.
  • In 1954, for his doctoral thesis in mathematics at Princeton University, Minsky conducted an in-depth investigation into the principle of universal computation within neural networks.

First artificial neural network

1956 – Founding event

Dartmouth Summer Research Project on Artificial Intelligence

We propose that a 2-month, 10-man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire. The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.

1956 – Logic Theorist

Russell and Norvig (2020)

Newell and Simon presented perhaps the most mature work, a mathematical theorem-proving system called the Logic Theorist (LT). Simon claimed, ‘We have invented a computer program capable of thinking non-numerically, and thereby solved the venerable mind–body problem.’

1959 – Machine Learning

Arthur Samuel’s work on machine learning using the game of checkers has had a profound impact on the field of artificial intelligence (AI) and computer science at large.

  • One of the earliest examples of a self-improving AI system.
  • His contributions helped to establish machine learning as a critical sub-field of AI.
  • Samuel defines machine learning as a “field of study that gives computers the ability to learn without being explicitly programmed”.
  • Precursor to reinforcement learning and AlphaGo.

1952 – IBM 701

  • 16,000 instructions per second
  • 8.75 kilobytes of memory

Hype

1957 Herbert Simon

It is not my aim to surprise or shock you—but the simplest way I can summarize is to say that there are now in the world machines that think, that learn and that create. Moreover, their ability to do these things is going to increase rapidly until—in a visible future—the range of problems they can handle will be coextensive with the range to which the human mind has been applied.

1958, New York Times, July 8

The Navy revealed the embryo of an electronic computer today that it expects will be able to walk, talk, see, write, reproduce itself, and be conscious of its existence.

1965 Herbert Simon (Mitchell 2019)

(\(\ldots\)) machines will be capable, within 20 years, of doing any work that a man can do.

1966 Marvin Minsky (Mitchell 2019)

MIT’s Summer Vision Project assigned undergraduates to work on “the construction of a significant part of a visual system.” In the words of one AI historian, “Minsky hired a first-year undergraduate and assigned him a problem to solve over the summer: connect a television camera to a computer and get the machine to describe what it sees.”

1967 Marvin Minsky (Strickland 2021)

Within a generation\(\ldots\) the problem of creating ‘artificial intelligence’ will be substantially solved.

1974–1980

Symbolic AI

Newell and Simon, the authors of the Logic Theorist (LT), went on to create the General Problem Solver (GPS), a program meant to emulate how humans solve problems. In their 1976 Turing Award lecture, they formulated the physical symbol system hypothesis:

Allen Newell and Simon (1976)

a physical symbol system has the necessary and sufficient means for general intelligent action.

First AI winter.

Funding dried up.

Russell and Norvig (2020)

Failure to come to grips with the “combinatorial explosion” was one of the main criticisms of AI contained in the Lighthill report (Lighthill, 1973), which formed the basis for the decision by the British government to end support for AI research in all but two universities.

Fundamental limitations on what could be represented: a single-layer perceptron, for instance, can only separate linearly separable data.

1980–1987

Expert systems

Expert systems are programs that emulate the decision-making abilities of a human expert by using a knowledge base and inference rules (typically, if-then rules) to solve complex problems within a specific domain.

  • In 1984, Douglas Lenat began work on Cyc, with the aim of encoding human common sense. By 2017, Cyc had 1.5 million terms and 24.5 million rules.

Expert systems - if-then rules

Rule 1:
  IF the patient has a fever AND the patient has a sore throat,
  THEN consider the possibility of a streptococcal infection.

Rule 2:
  IF the patient has a rash AND the patient has been in a wooded area recently,
  THEN consider the possibility of Lyme disease.

Rule 3:
  IF the patient is experiencing chest pain AND the patient has a history of heart disease,
  THEN consider the possibility of a myocardial infarction (heart attack).
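A minimal sketch, in Python, of how an inference engine might forward-chain over such rules (the fact names and rule encoding are hypothetical, not drawn from any specific expert system):

# Each rule pairs a set of required facts with a conclusion.
RULES = [
    ({"fever", "sore_throat"}, "possible streptococcal infection"),
    ({"rash", "recent_wooded_area"}, "possible Lyme disease"),
    ({"chest_pain", "history_of_heart_disease"}, "possible myocardial infarction"),
]

def forward_chain(facts):
    """Fire every rule whose conditions are all satisfied by the known facts."""
    return [conclusion for conditions, conclusion in RULES if conditions <= facts]

print(forward_chain({"fever", "sore_throat"}))
# ['possible streptococcal infection']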

1987–1993

Second AI winter

Strickland (2021)

By the 1990s, it was no longer academically fashionable to be working on either symbolic AI or neural networks, because both strategies seemed to have flopped.

1993–2011

Support Vector Machine (SVM)

  • A Support Vector Machine (SVM) is a supervised machine learning algorithm (Boser, Guyon, and Vapnik 1992).
  • It operates by identifying the optimal hyperplane that separates data into distinct classes within a high-dimensional space.
  • Grounded in the robust theoretical framework of Vapnik-Chervonenkis (VC) theory.
  • Influential and dominant during the 1990s and 2000s (see the sketch below).
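As an illustration, a minimal sketch of training a linear SVM with scikit-learn on a toy dataset (the dataset and parameters are arbitrary choices for demonstration):

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy data: the classic iris flowers (4 features, 3 classes).
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A linear kernel searches for the maximum-margin separating hyperplane.
clf = SVC(kernel="linear")
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # accuracy on held-out data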

2011–

Deep learning

In 2012, AlexNet, a convolutional neural network (CNN) architecture inspired by Yann LeCun’s work, won the ImageNet Large Scale Visual Recognition Challenge (Krizhevsky, Sutskever, and Hinton 2012).

This marked a pivotal moment in the field, as subsequently, all leading entries in the competition have been founded on deep learning methodologies.

Winter is coming?

Summary


Tutorial

Requirements

Proficiency in Python is expected.

For those needing a refresher, the official tutorial on Python.org is a good place to start.

Simultaneously enhance your skills by creating a Jupyter Notebook that incorporates examples and notes from the tutorial.


Jupyter Notebooks

A notebook is a shareable document that combines computer code, plain language descriptions, data, rich visualizations like 3D models, charts, graphs and figures, and interactive controls. A notebook, along with an editor (like JupyterLab), provides a fast interactive environment for prototyping and explaining code, exploring and visualizing data, and sharing ideas with others.

Quick Start

Running Jupyter on your computer

Assuming the notebook is in the current directory, execute the following command from the terminal.

jupyter notebook 01_ottawa_river_temperature.ipynb

Similarly, to create a new notebook from scratch,

jupyter notebook

Why?

  • Ease of Use: The interface is intuitive and conducive to exploratory analysis.

  • Visualization: The capability to embed rich, interactive visualizations directly within the notebook enhances its utility for data analysis and presentation.

  • Reproducibility: Jupyter Notebooks have become the de facto standard in many domains for demonstrating code functionality and ensuring reproducibility.

How?

Version Control (GitHub)

By default, Jupyter Notebooks store the outputs of code cells, including media objects.

Jupyter Notebooks are JSON documents; images within them are embedded as base64-encoded PNG data (see the sketch below).

This encoding can lead to several issues when using version control systems, such as GitHub.

  • Large File Sizes: Jupyter Notebooks can become quite large due to embedded images and outputs, leading to prolonged upload times and potential storage constraints.
  • Incompatibility with Text-Based Version Control: GitHub is optimized for text-based files, and the inclusion of binary data, such as images, complicates the process of tracking changes and resolving conflicts. Traditional diff and merge operations are not well-suited for handling these binary formats.
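A heavily simplified sketch of that JSON structure (real notebooks contain additional required fields; the cell content and base64 payload here are illustrative and truncated):

{
  "cells": [
    {
      "cell_type": "code",
      "source": ["df.plot()"],
      "outputs": [
        { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUg..." } }
      ]
    }
  ],
  "nbformat": 4
}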

Version Control (GitHub) - solutions

  1. In JupyterLab or Notebook, Edit \(\rightarrow\) Clear Outputs of All Cells, then save.

  2. On the command line, use jupyter nbconvert --clear-output:

jupyter nbconvert --clear-output --inplace 04_stock_price.ipynb

or

jupyter nbconvert 04_stock_price.ipynb --to notebook --ClearOutputPreprocessor.enabled=True --output 04_stock_price_clear

  3. Use nbdime, a diff and merge tool specialized for Jupyter Notebooks.
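For example, once nbdime is installed, it provides notebook-aware diffing and git integration (the file names below are illustrative):

pip install nbdime
nbdiff 04_stock_price_v1.ipynb 04_stock_price_v2.ipynb   # content-aware diff of two revisions
nbdime config-git --enable                               # route git diff/merge through nbdime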

Installing Jupyter (1/2)

These instructions use pip, the recommended installation tool for Python.

The initial step is to verify that you have a functioning Python installation with pip installed.

On macOS or Linux:

$ python --version
Python 3.10.14
$ pip --version
pip 24.2

On Windows:

C:> py --version
Python 3.10.14
C:> py -m pip --version
pip 24.2

Installing Jupyter (2/2)

Installing JupyterLab with pip:

$ pip install jupyterlab

Once installed, run JupyterLab with:

$ jupyter lab

Sample Jupyter Notebooks

Missing libraries

Launching 03_get_youtube_transcript in Colab.
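In Colab, a library that is not pre-installed can be added from within a notebook cell. For instance, assuming the transcript notebook depends on the youtube-transcript-api package (an assumption; check the notebook’s import errors for the actual package name):

!pip install youtube-transcript-api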

04_stock_price

Launching 04_stock_price in Colab.

05_central_limit

Launching 05_central_limit in Colab.

Prologue

Summary

  • Situate current AI within its historical context.
  • Introduce the tools, specifically Jupyter Notebooks.

Next lecture

  • Introduction to machine learning

One of my favourite ML books

Resources

References

Boser, Bernhard E., Isabelle Guyon, and Vladimir Vapnik. 1992. “A Training Algorithm for Optimal Margin Classifiers.” In COLT, 144–52. ACM. https://doi.org/10.1145/130385.130401.
Chomsky, Noam. 1956. “Three Models for the Description of Language.” IRE Transactions on Information Theory 2: 113–24.
Church, Alonzo. 1936. “An Unsolvable Problem of Elementary Number Theory.” American Journal of Mathematics 58 (2): 345–63. https://doi.org/10.2307/2371045.
Conroy, Gemma. 2023. “This Is the Largest Map of the Human Brain Ever Made.” Nature 622 (7984): 679–80. https://doi.org/10.1038/d41586-023-03192-2.
Cook, Stephen A. 1971. “The Complexity of Theorem-Proving Procedures.” In Proceedings of the Third Annual ACM Symposium on Theory of Computing, 151–58. STOC ’71. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/800157.805047.
Craik, K. J. W. 1943. The Nature of Explanation. Cambridge University Press. https://books.google.ca/books?id=EN0TrgEACAAJ.
Géron, Aurélien. 2022. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow. 3rd ed. O’Reilly Media, Inc.
Hume, David. 1739. A Treatise of Human Nature. Edited by L. A. Selby-Bigge. Oxford: Oxford University Press.
Karp, Richard M. 1972. “Reducibility Among Combinatorial Problems.” In Complexity of Computer Computations, edited by Raymond E. Miller and James W. Thatcher, 85–103. The IBM Research Symposia Series. Plenum Press, New York. http://dblp.uni-trier.de/db/conf/coco/cocc1972.html.
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E Hinton. 2012. “ImageNet Classification with Deep Convolutional Neural Networks.” In Advances in Neural Information Processing Systems, edited by F. Pereira, C. J. Burges, L. Bottou, and K. Q. Weinberger. Vol. 25. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf.
Maroso, Mattia. 2023. “A Quest into the Human Brain.” Science 382 (6667): 166–67. https://doi.org/10.1126/science.adl0913.
McCorduck, Pamela. 2004. Machines Who Think, A Personal Inquiry into the History and Prospects of Artificial Intelligence. Taylor & Francis Group, LLC. https://doi.org/10.1201/9780429258985.
McCulloch, Warren S., and Walter Pitts. 1943. “A Logical Calculus of the Ideas Immanent in Nervous Activity.” The Bulletin of Mathematical Biophysics 5 (4): 115–33. https://doi.org/10.1007/BF02478259.
Miller, George A. 1956. “The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information.” The Psychological Review 63 (2): 81–97.
Mitchell, Melanie. 2019. Artificial Intelligence: A Guide for Thinking Humans. New York, NY, USA: Farrar, Straus and Giroux.
———. 2024. “The Turing Test and Our Shifting Conceptions of Intelligence.” Science 385 (6710): eadq9356. https://doi.org/10.1126/science.adq9356.
Newell, Allen, and Herbert A. Simon. 1976. “Computer Science as Empirical Inquiry: Symbols and Search.” Commun. ACM 19 (3): 113–26. https://doi.org/10.1145/360018.360022.
Newell, A., and H. Simon. 1956. “The Logic Theory Machine–a Complex Information Processing System.” IRE Transactions on Information Theory 2 (3): 61–79. https://doi.org/10.1109/TIT.1956.1056797.
Russell, Stuart, and Peter Norvig. 2020. Artificial Intelligence: A Modern Approach. 4th ed. Pearson. http://aima.cs.berkeley.edu/.
Samuel, A. L. 1959. “Some Studies in Machine Learning Using the Game of Checkers.” IBM J. Res. Dev. 3 (3): 210–29. https://doi.org/10.1147/rd.33.0210.
Strickland, Eliza. 2021. “The Turbulent Past and Uncertain Future of AI: Is There a Way Out of AI’s Boom-and-Bust Cycle?” IEEE Spectrum 58 (10): 26–31. https://doi.org/10.1109/MSPEC.2021.9563956.
Turing, A. M. 1950. “Computing Machinery and Intelligence.” Mind 59: 433–60.
Turing, Alan M. 1936. “On Computable Numbers, with an Application to the Entscheidungsproblem.” Proceedings of the London Mathematical Society 2 (42): 230–65.


Appendix: environment management

Environment management

Important

Do not attempt to install these tools unless you are confident in your technical skills. An incorrect installation could waste significant time or even render your environment unusable. There is nothing wrong with using pip or Google Colab for your coursework. You can develop these installation skills later without impacting your grades.

Package management

  • Managing package dependencies can be complex.
    • A package manager addresses these challenges.
  • Different projects may require different versions of the same libraries.
    • Package management tools, such as conda, facilitate the creation of virtual environments tailored to specific projects (see the example below).
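For example, a typical conda workflow creates and uses an isolated environment per project (the environment name, Python version, and packages below are illustrative):

conda create --name csi4106 python=3.10    # new environment with its own Python
conda activate csi4106                     # switch into the environment
conda install numpy pandas scikit-learn    # versions isolated to this project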

Anaconda

Anaconda is a comprehensive package management platform for Python and R. It utilizes Conda to manage packages, dependencies, and environments.

  • Anaconda is advantageous as it comes pre-installed with over 250 popular packages, providing a robust starting point for users.

  • However, this extensive distribution results in a large file size, which can be a drawback.

  • Additionally, since Anaconda relies on conda, it also inherits the limitations and issues associated with conda (see subsequent slides).

Miniconda

Miniconda is a minimal version of Anaconda that includes only conda, Python, their dependencies, and a small selection of essential packages.

Conda

Conda is an open-source package and environment management system for Python and R. It facilitates the installation and management of software packages and the creation of isolated virtual environments.

  • Dependency conflicts arising from complex package interdependencies can force the user to reinstall Anaconda/Conda.

  • Conda is also plagued by large storage requirements and performance issues during package resolution.

Mamba

Mamba is a reimplementation of the conda package manager in C++.

  • It is significantly faster than conda.
  • It consumes fewer computational resources.
  • It provides clearer and more informative error messages.
  • It is fully compatible with conda, making it a viable replacement.

Micromamba is a fully statically-linked, self-contained executable. Its empty base environment ensures that the base is never corrupted, eliminating the need for reinstallation.

Marcel Turcotte

Marcel.Turcotte@uOttawa.ca

School of Electrical Engineering and Computer Science (EECS)

University of Ottawa