Introduction to Deep Learning

CSI 5180 - Machine Learning for Bioinformatics

Author

Marcel Turcotte

Published

Version: Mar 10, 2025 16:45

Preamble


Summary

Neural networks evolved from simple, biologically inspired perceptrons to deep, multilayer architectures that rely on nonlinear activation functions for learning complex patterns. The universal approximation theorem underpins their ability to approximate any continuous function, and modern frameworks like PyTorch, TensorFlow, and Keras enable practical deep learning applications.

Learning Objectives

  • Explain basic neural network models (perceptrons and MLPs) and their computational foundations.
  • Appreciate the limitations of single-layer networks and the necessity for hidden layers.
  • Describe the role and impact of nonlinear activation functions (sigmoid, tanh, ReLU) in learning.
  • Articulate the universal approximation theorem and its significance.
  • Implement and evaluate deep learning models using modern frameworks such as TensorFlow and Keras.

Introduction

TensorFlow Playground

Primer on Deep Learning in Genomics

Zou et al. (2019), Figure 2

Transcription Factors

Gene Regulation

Wasserman and Sandelin (2004)

DNA Sequence Motif Discovery

D’haeseleer (2006)

Deep Learning in Genomics Primer

  • James Zou, Mikael Huss, Abubakar Abid, Pejman Mohammadi, Ali Torkamani, and Amalio Telenti, A primer on deep learning in genomics, Nat Genet 51:1, 12–18, 2019.

Zou et al. (2019)

Machine Learning Problems

  • Supervised Learning: Classification, Regression

  • Unsupervised Learning: Autoencoders, Self-Supervised

  • Reinforcement Learning: Now an Integral Component

We will begin our exploration within the framework of supervised learning.

A Neuron

Attribution: Jennifer Walinga, CC BY-SA 4.0

In the study of artificial intelligence, it is logical to derive inspiration from the most well-understood form of intelligence: the human brain. The brain is composed of a complex network of neurons, which together form biological neural networks. Although each neuron exhibits relatively simple behavior, it is connected to thousands of other neurons, contributing to the intricate functionality of these networks.

A neuron can be conceptualized as a basic computational unit, and the complexity of brain function arises from the interconnectedness of these units.

Yann LeCun and other researchers have frequently noted that artificial neural networks used in machine learning resemble biological neural networks in much the same way that an airplane’s wings resemble those of a bird.

Connectionist

Attribution: LeNail, (2019). NN-SVG: Publication-Ready Neural Network Architecture Schematics. Journal of Open Source Software, 4(33), 747, https://doi.org/10.21105/joss.00747 (GitHub)

A characteristic of biological neural networks that we adopt is the organization of neurons into layers, particularly evident in the cerebral cortex.

Neural networks (NNs) consist of layers of interconnected nodes (neurons), each connection having an associated weight.

Neural networks process input data through these weighted connections, and learning occurs by adjusting the weights based on errors in the training data.

Hierarchy of Concepts

Attribution: LeCun, Bengio, and Hinton (2015)

In the book “Deep Learning” (Goodfellow, Bengio, and Courville 2016), authors Goodfellow, Bengio, and Courville define deep learning as a subset of machine learning that enables computers to “understand the world in terms of a hierarchy of concepts.”

This hierarchical approach is one of deep learning’s most significant contributions. It reduces the need for manual feature engineering and redirects the focus toward the engineering of neural network architectures.

Basics

Computations with Neurodes

\[ y = f(x_1 + x_2) \]

where \(x_1, x_2 \in \{0,1\}\) and \(f(z)\) is an indicator function:

\[ f(z)= \begin{cases}0, & z<\theta \\ 1, & z \geq \theta\end{cases} \]

McCulloch and Pitts (1943) termed these artificial neurons neurodes, a contraction of “neuron” and “node”.

In mathematics, \(f(z)\), as defined above, is known as an indicator function or a characteristic function.

These neurodes have one or more binary inputs, taking a value of 0 or 1, and one binary output.

They showed that such units could implement Boolean functions such as AND, OR, and NOT.

But also that networks of such units can compute any logical proposition.

Computations with Neurodes

\[ y = f(x_1 + x_2)= \begin{cases}0, & x_1 + x_2 <\theta \\ 1, & x_1 + x_2 \geq \theta\end{cases} \]

  • With \(\theta = 2\), the neurode implements an AND logic gate.

  • With \(\theta = 1\), the neurode implements an OR logic gate.

More complex logic can be constructed by multiplying the inputs by -1, which is interpreted as inhibitory. Namely, this allows building a logical NOT.

With \(\theta = 1\), \(x_1\) fixed at 1, and \(x_2\) multiplied by \(-1\), we get \(y = 0\) when \(x_2 = 1\) and \(y = 1\) when \(x_2 = 0\).

\[ y = f(x_1 + (-1) x_2)= \begin{cases}0, & x_1 + (-1) x_2 <\theta \\ 1, & x_1 + (-1) x_2 \geq \theta\end{cases} \]

Neurons can be broadly categorized into two primary types: excitatory and inhibitory.
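
A minimal sketch of these units in Python, using the weights and thresholds described above:

import numpy as np

def neurode(inputs, weights, theta):
    # Fires (outputs 1) iff the weighted sum of the inputs reaches the threshold theta
    return int(np.dot(inputs, weights) >= theta)

# AND gate: theta = 2, both inputs excitatory (+1)
print([neurode([x1, x2], [1, 1], theta=2) for x1 in (0, 1) for x2 in (0, 1)])  # [0, 0, 0, 1]

# OR gate: theta = 1
print([neurode([x1, x2], [1, 1], theta=1) for x1 in (0, 1) for x2 in (0, 1)])  # [0, 1, 1, 1]

# NOT gate on x2: theta = 1, x1 fixed at 1, inhibitory weight (-1) on x2
print([neurode([1, x2], [1, -1], theta=1) for x2 in (0, 1)])  # [1, 0]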

Computations with Neurodes

  • Digital computations can be broken down into a sequence of logical operations, enabling neurode networks to execute any computation.

  • McCulloch and Pitts (1943) did not focus on learning parameter \(\theta\).

  • They introduced a machine that computes any function but cannot learn.

The period roughly from 1930 to 1950 marked a transformative shift in mathematics toward the formalization of computation. Pioneering work by Gödel, Church, and Turing not only established the theoretical limits and capabilities of computation—with Gödel’s incompleteness theorems, Church’s λ‑calculus and thesis, and Turing’s model of universal machines—but also set the stage for later developments in computer science.

McCulloch and Pitts’ 1943 model of neural networks was inspired by this early mathematical framework linking computation to aspects of intelligence, prefiguring later research in artificial intelligence.

From this work, we take the idea that networks of such units perform computations: a signal propagates from one end of the network to the other to produce a result.

Threshold Logic Unit

Rosenblatt (1958)

In 1957, Frank Rosenblatt developed a conceptually distinct model of a neuron known as the threshold logic unit, which he published in 1958.

In this model, both the inputs and the output of the neuron are represented as real values. Notably, each input connection has an associated weight.

The left section of the neuron, denoted by the sigma symbol, represents the computation of a weighted sum of its inputs, expressed as \(\theta_1 x_1 + \theta_2 x_2 + \ldots + \theta_D x_D + b\).

This sum is then processed through a step function, shown in the right section of the neuron, to generate the output.

Here, \(x^T \theta\) represents the dot product of the vectors \(x\) and \(\theta\), where \(x^T\) denotes the transpose of \(x\), converting it from a column vector to a row vector so that the product with \(\theta\) is defined.

The dot product \(x^T \theta\) is then a scalar given by:

\[ x^T \theta = x^{(1)} \theta_1 + x^{(2)} \theta_2 + \cdots + x^{(D)} \theta_D \]

where \(x^{(j)}\) and \(\theta_j\) are the components of the vectors \(x\) and \(\theta\), respectively.

Simple Step Functions

\[ \text{heaviside}(t)= \begin{cases}1, & t \geq 0 \\ 0, & t<0\end{cases} \]

\[ \text{sign}(t)= \begin{cases}1, & t>0 \\ 0, & t=0 \\ -1, & t<0\end{cases} \]

Common step functions include the Heaviside function (0 if the input is negative and 1 otherwise) and the sign function (-1 if the input is negative, 0 if the input is zero, and 1 otherwise).
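
A minimal sketch of a threshold logic unit, with hypothetical input, weight, and bias values:

import numpy as np

# Hypothetical values for illustration: three features, weights, and a bias
x = np.array([0.5, -1.2, 3.0])
theta = np.array([0.4, 0.3, -0.2])
b = 0.1

z = x @ theta + b          # weighted sum: x^T theta + b
y = np.heaviside(z, 1.0)   # step function: 1 if z >= 0, else 0

print(z, y)  # approximately -0.66 0.0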

Notation

Add an extra feature with a fixed value of 1 to the input. Associate it with weight \(b = \theta_{0}\), where \(b\) is the bias/intercept term.

Notation

\(\theta_{0} = b\) is the bias/intercept term.

The threshold logic unit is analogous to logistic regression, with the primary distinction being the substitution of the logistic (sigmoid) function with a step function. Similar to logistic regression, the perceptron is employed for classification tasks.

Perceptron

A perceptron consists of one or more threshold logic units arranged in a single layer, with each unit connected to all inputs. This configuration is referred to as fully connected or dense.

Since the threshold logic units in this single layer also generate the output, it is referred to as the output layer.

Perceptron

As this perceptron generates multiple outputs simultaneously, it performs multiple binary predictions, making it a multilabel classifier (it can also be used as a multiclass classifier).

Classification tasks can be further divided into multilabel and multiclass classification.

  1. Multiclass Classification:

    • In multiclass classification, each instance is assigned to one and only one class out of a set of three or more possible classes. The classes are mutually exclusive, meaning that an instance cannot belong to more than one class at the same time.

    • Example: Classifying an image as either a cat, dog, or bird. Each image can only belong to one of these categories.

  2. Multilabel Classification:

    • In multilabel classification, each instance can be associated with multiple classes simultaneously. The classes are not mutually exclusive, allowing for the possibility that an instance can belong to several classes at once.

    • Example: Tagging an image with multiple attributes such as “outdoor,” “sunset,” and “beach.” The image can simultaneously belong to all these labels.

The key difference lies in the relationship between classes: multiclass classification deals with a single label per instance, while multilabel classification handles multiple labels for each instance.

Notation

As before, introduce an additional feature with a value of 1 to the input. Assign a bias \(b\) to each neuron. Each incoming connection implicitly has an associated weight.

Notation

  • \(X\) is the input data matrix where each row corresponds to an example and each column represents one of the \(D\) features.

  • \(W\) is the weight matrix, structured with one row per input (feature) and one column per neuron.

  • Bias terms can be represented separately; both approaches appear in the literature. Here, \(b\) is a vector with a length equal to the number of neurons.

With neural networks, the parameters of the model are often referred to as \(w\) (vector) or \(W\) (matrix), rather than \(\theta\).
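
A small sketch of the layer computation in matrix form, with hypothetical shapes:

import numpy as np

rng = np.random.default_rng(42)

# Hypothetical sizes: 4 examples, D = 3 features, 2 neurons in the layer
X = rng.standard_normal((4, 3))   # one example per row
W = rng.standard_normal((3, 2))   # one row per feature, one column per neuron
b = rng.standard_normal(2)        # one bias term per neuron

step = lambda z: (z >= 0).astype(int)   # Heaviside step, applied elementwise
outputs = step(X @ W + b)               # shape (4, 2): one prediction per neuron

print(outputs)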

Discussion

  • The algorithm to train the perceptron closely resembles stochastic gradient descent.

    • In the interest of time and to avoid confusion, we will skip this algorithm and focus on the multilayer perceptron (MLP) and its training algorithm, backpropagation.

Historical Note and Justification

Minsky and Papert (1969) demonstrated the limitations of perceptrons, notably their inability to solve exclusive OR (XOR) classification problems: \(\{([0,1],\mathrm{true}), ([1,0],\mathrm{true}), ([0,0],\mathrm{false}), ([1,1],\mathrm{false})\}\).

This limitation also applies to other linear classifiers, such as logistic regression.

Consequently, due to these limitations and a lack of practical applications, some researchers abandoned the perceptron.

Multilayer Perceptron

A multilayer perceptron (MLP) includes an input layer and one or more layers of threshold logic units. Layers that are neither input nor output are termed hidden layers.

XOR Classification Problem

\(x^{(1)}\)  \(x^{(2)}\)  \(y\)  \(o_1\)  \(o_2\)  \(o_3\)
     1            0        1      0        1        1
     0            1        1      0        1        1
     0            0        0      0        0        0
     1            1        0      1        1        0

\(x^{(1)}\) and \(x^{(2)}\) are two attributes, \(y\) is the target, and \(o_1\), \(o_2\), and \(o_3 = h_\theta(x)\) are the outputs of the top-left, bottom-left, and right threshold units. Clearly, \(h_\theta(x) = y, \forall x \in X\). The challenge during Rosenblatt’s time was the lack of algorithms to train multilayer networks.

I developed an Excel spreadsheet to verify that the proposed multilayer perceptron effectively solves the XOR classification problem.

The step function used in the above model is the Heaviside function.
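
To mirror that spreadsheet check in Python, here is a sketch with one possible choice of weights and thresholds (hypothetical; the figure’s values may differ): \(o_1\) computes AND, \(o_2\) computes OR, and \(o_3\) computes \(o_2\) AND NOT \(o_1\), which is XOR.

import numpy as np

def heaviside(t):
    return (t >= 0).astype(int)

# The four XOR examples, one per row
X = np.array([[1, 0], [0, 1], [0, 0], [1, 1]])

o1 = heaviside(X[:, 0] + X[:, 1] - 1.5)   # AND(x1, x2)
o2 = heaviside(X[:, 0] + X[:, 1] - 0.5)   # OR(x1, x2)
o3 = heaviside(o2 - o1 - 0.5)             # OR AND NOT AND -> XOR

print(o3)  # [1 1 0 0], matching y in the table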

Feedforward Neural Network (FNN)

Information in this architecture flows unidirectionally—from left to right, moving from input to output. Consequently, it is termed a feedforward neural network.

The network consists of three layers: input, hidden, and output. The input layer contains two nodes, the hidden layer comprises three nodes, and the output layer has two nodes. Additional hidden layers and nodes per layer can be added, which will be discussed later.

It is often useful to include explicit input nodes that do not perform calculations, known as input units or input neurons. These nodes act as placeholders to introduce input features into the network, passing data directly to the next layer without transformation. In the network diagram, these are the light blue nodes on the left, labeled 1 and 2. Typically, the number of input units corresponds to the number of features.

For clarity, nodes are labeled to facilitate discussion of the weights between them, such as \(w_{1,5}\) between nodes 1 and 5. Similarly, the output of a node is denoted by \(o_k\), where \(k\) represents the node’s label. For example, for \(k=3\), the output would be \(o_3\).

Forward Pass (Computation)

\(o_3 = \sigma(w_{13} x^{(1)}+ w_{23} x^{(2)} + b_3)\)

\(o_4 = \sigma(w_{14} x^{(1)}+ w_{24} x^{(2)} + b_4)\)

\(o_5 = \sigma(w_{15} x^{(1)}+ w_{25} x^{(2)} + b_5)\)

\(o_6 = \sigma(w_{36} o_3 + w_{46} o_4 + w_{56} o_5 + b_6)\)

\(o_7 = \sigma(w_{37} o_3 + w_{47} o_4 + w_{57} o_5 + b_7)\)

First, it’s important to understand the information flow: this network computes two outputs from its inputs.

To simplify the figure, I have opted not to display the bias terms, though they remain crucial components. Specifically, \(b_3\) represents the bias term associated with node 3.

If bias terms were not significant, the training process would naturally reduce them to zero. Bias terms are essential as they enable the adjustment of the decision boundary, allowing the model to learn more complex patterns that weights alone cannot capture. By offering additional degrees of freedom, they also contribute to faster convergence during training.

Forward Pass (Computation)

import numpy as np

# Sigmoid function

def sigma(x):
    return 1 / (1 + np.exp(-x))

# Input (two attributes) vector, one example of our training set

x1, x2 = (0.5, 0.9)

# Initializing the weights of layers 2 and 3 to random values

w13, w14, w15, w23, w24, w25 = np.random.uniform(low=-1, high=1, size=6)
w36, w46, w56, w37, w47, w57 = np.random.uniform(low=-1, high=1, size=6)

# Initializing all 5 bias terms to random values

b3, b4, b5, b6, b7 = np.random.uniform(low=-1, high=1, size=5)

o3 = sigma(w13 * x1 + w23 * x2 + b3)
o4 = sigma(w14 * x1 + w24 * x2 + b4)
o5 = sigma(w15 * x1 + w25 * x2 + b5)
o6 = sigma(w36 * o3 + w46 * o4 + w56 * o5 + b6)
o7 = sigma(w37 * o3 + w47 * o4 + w57 * o5 + b7)

(o6, o7)
(0.4594883239069456, 0.5771365674383726)

The example above illustrates the computation process with specific values. Before training a neural network, it is standard practice to initialize the weights and biases with random values. Gradient descent is then employed to iteratively adjust these parameters, aiming to minimize the loss function.

Forward Pass (Computation)

The information flow remains consistent even in more complex networks. Networks with many layers are called deep neural networks (DNN).

Produced using NN-SVG, LeNail (2019).

Forward Pass (Computation)

Same network with bias terms shown.

Produced using NN-SVG, LeNail (2019).

Activation Function

  • As will be discussed later, the training algorithm, known as backpropagation, employs gradient descent, necessitating the calculation of the partial derivatives of the loss function.

  • The step function in the multilayer perceptron had to be replaced, as it consists only of flat surfaces. Gradient descent cannot progress on flat surfaces due to their zero derivative.

Activation Function

  • Nonlinear activation functions are paramount because, without them, multiple layers in the network would only compute a linear function of the inputs, as the quick check after this list illustrates.

  • According to the Universal Approximation Theorem, sufficiently large deep networks with nonlinear activation functions can approximate any continuous function; this theorem is discussed in the Universal Approximation section below.
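
Without a nonlinearity, stacking layers collapses into a single linear map. A quick numeric check with hypothetical weight matrices:

import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.standard_normal((4, 3)), rng.standard_normal((2, 4))
x = rng.standard_normal(3)

# Two "layers" applied in sequence, with no activation in between...
deep = W2 @ (W1 @ x)

# ...equal one layer whose weight matrix is the product W2 W1
shallow = (W2 @ W1) @ x

print(np.allclose(deep, shallow))  # True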

Sigmoid

Code
import numpy as np
import matplotlib.pyplot as plt

# Sigmoid function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Generate x values
x = np.linspace(-10, 10, 400)

# Compute y values for the sigmoid function
y = sigmoid(x)

plt.figure(figsize=(4,3))
plt.plot(x, y, color='black', linewidth=2)
plt.grid(True)
plt.show()

\[ \sigma(t) = \frac{1}{1 + e^{-t}} \]

Hyperbolic Tangent Function

Code
# Compute y values for the hyperbolic tangent function

y = np.tanh(x)

plt.figure(figsize=(4,3))
plt.plot(x, y, color='black', linewidth=2)
plt.grid(True)
plt.show()

Hyperbolic tangent (\(\tanh(t) = 2 \sigma(2t) - 1\)) is an S-shaped curve, similar to the sigmoid function, producing output values ranging from -1 to 1. According to Géron (2022), this range helps each layer’s output to be approximately centered around 0 at the start of training, thereby accelerating convergence.
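
A quick numeric check of the identity \(\tanh(t) = 2 \sigma(2t) - 1\):

import numpy as np

def sigma(z):
    return 1 / (1 + np.exp(-z))

t = np.linspace(-3, 3, 13)

# The hyperbolic tangent is a shifted and rescaled sigmoid
print(np.allclose(np.tanh(t), 2 * sigma(2 * t) - 1))  # True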

Rectified linear unit function (ReLU)

Code
# Compute y values for the rectified linear unit function (ReLU) function
y = np.maximum(0, x)

plt.figure(figsize=(4,3))
plt.plot(x, y, color='black', linewidth=2)
plt.grid(True)
plt.show()

Although the ReLU function (\(\mathrm{ReLU}(t) = \max(0, t)\)) is not differentiable at \(t=0\) and has a derivative of 0 for \(t<0\), it performs quite well in practice and is computationally efficient. Consequently, it has become the default activation function.

Common Activation Functions

Code
from scipy.special import expit as sigmoid

def relu(z):
    return np.maximum(0, z)

def derivative(f, z, eps=0.000001):
    return (f(z + eps) - f(z - eps))/(2 * eps)

max_z = 4.5
z = np.linspace(-max_z, max_z, 200)

plt.figure(figsize=(11, 3.1))

plt.subplot(121)
plt.plot([-max_z, 0], [0, 0], "r-", linewidth=2, label="Heaviside")
plt.plot(z, relu(z), "m-.", linewidth=2, label="ReLU")
plt.plot([0, 0], [0, 1], "r-", linewidth=0.5)
plt.plot([0, max_z], [1, 1], "r-", linewidth=2)
plt.plot(z, sigmoid(z), "g--", linewidth=2, label="Sigmoid")
plt.plot(z, np.tanh(z), "b-", linewidth=1, label="Tanh")
plt.grid(True)
plt.title("Activation functions")
plt.axis([-max_z, max_z, -1.65, 2.4])
plt.gca().set_yticks([-1, 0, 1, 2])
plt.legend(loc="lower right", fontsize=13)

plt.subplot(122)
plt.plot(z, derivative(np.sign, z), "r-", linewidth=2, label="Heaviside")
plt.plot(0, 0, "ro", markersize=5)
plt.plot(0, 0, "rx", markersize=10)
plt.plot(z, derivative(sigmoid, z), "g--", linewidth=2, label="Sigmoid")
plt.plot(z, derivative(np.tanh, z), "b-", linewidth=1, label="Tanh")
plt.plot([-max_z, 0], [0, 0], "m-.", linewidth=2)
plt.plot([0, max_z], [1, 1], "m-.", linewidth=2)
plt.plot([0, 0], [0, 1], "m-.", linewidth=1.2)
plt.plot(0, 1, "mo", markersize=5)
plt.plot(0, 1, "mx", markersize=10)
plt.grid(True)
plt.title("Derivatives")
plt.axis([-max_z, max_z, -0.2, 1.2])

plt.show()

Universal Approximation

Definition

The Universal Approximation Theorem (UAT) states that a feedforward neural network with a single hidden layer containing a finite number of neurons can approximate any continuous function on a compact subset of \(\mathbb{R}^n\), given appropriate weights and activation functions.

Cybenko (1989); Hornik, Stinchcombe, and White (1989)

In mathematical terms, a subset of \(\mathbb{R}^n\) is considered compact if it is both closed and bounded.

  • Closed: A set is closed if it contains all its boundary points. In other words, it includes its limit points or accumulation points.

  • Bounded: A set is bounded if there exists a real number \(M\) such that the distance between any two points in the set is less than \(M\).

In the context of the universal approximation theorem, compactness ensures that the function being approximated is defined on a finite and well-behaved region, which is crucial for the theoretical guarantees provided by the theorem.

Single Hidden Layer

\[ y = \sum_{i=1}^N \alpha_i \sigma(w_{1,i} x + b_i) \]

Notation adapted to follow that of Cybenko (1989).
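
To build intuition, two hidden units with sigmoid activations can be combined into a localized “bump”; the values of \(\alpha_i\), \(w_{1,i}\), and \(b_i\) below are hypothetical choices for illustration:

import numpy as np

def sigma(z):
    return 1 / (1 + np.exp(-z))

# alpha = (+1, -1); steep slopes (w = 50) place sharp transitions at x = 1 and x = 2
x = np.linspace(-1, 4, 11)
y = 1.0 * sigma(50 * x - 50) + (-1.0) * sigma(50 * x - 100)

print(np.round(y, 2))  # close to 1 inside (1, 2), close to 0 outside, 0.5 at the transitions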

Effect of Varying w

Code
def logistic(x, w, b):
    """Compute the logistic function with parameters w and b."""
    return 1 / (1 + np.exp(-(w * x + b)))

# Define a range for x values.
x = np.linspace(-10, 10, 400)

# Plot 1: Varying w (steepness) with b fixed at 0.
plt.figure(figsize=(6,4))
w_values = [0.5, 1, 2, 5]  # different steepness values
b = 0  # fixed bias

for w in w_values:
    plt.plot(x, logistic(x, w, b), label=f'w = {w}, b = {b}')
plt.title('Effect of Varying w (with b = 0)')
plt.xlabel('x')
plt.ylabel(r'$\sigma(wx+b)$')
plt.legend()
plt.grid(True)

plt.show()

Sigmoid activation function: \(\sigma(wx+b)\).

Effect of Varying b

Code
# Plot 2: Varying b (horizontal shift) with w fixed at 1.
plt.figure(figsize=(6,4))
w = 1  # fixed steepness
b_values = [-5, -2, 0, 2, 5]  # different bias values

for b in b_values:
    plt.plot(x, logistic(x, w, b), label=f'w = {w}, b = {b}')
plt.title('Effect of Varying b (with w = 1)')
plt.xlabel('x')
plt.ylabel(r'$\sigma(wx+b)$')
plt.legend()
plt.grid(True)

plt.show()

Sigmoid activation function: \(\sigma(wx+b)\).

Effect of Varying w

Code
def relu(x, w, b):
    """Compute the ReLU activation with parameters w and b."""
    return np.maximum(0, w * x + b)

# Define a range for x values.
x = np.linspace(-10, 10, 400)

# Plot 1: Varying w (scaling) with b fixed at 0.
plt.figure(figsize=(6,4))
w_values = [0.5, 1, 2, 5]  # different scaling values
b = 0  # fixed bias

for w in w_values:
    plt.plot(x, relu(x, w, b), label=f'w = {w}, b = {b}')
plt.title('Effect of Varying w (with b = 0) on ReLU Activation')
plt.xlabel('x')
plt.ylabel('ReLU(wx+b)')
plt.legend()
plt.grid(True)

plt.show()

ReLU activation function: np.maximum(0, w * x + b).

Effect of Varying b

Code
# Plot 2: Varying b (horizontal shift) with w fixed at 1.
plt.figure(figsize=(6,4))
w = 1  # fixed scaling
b_values = [-5, -2, 0, 2, 5]  # different bias values

for b in b_values:
    plt.plot(x, relu(x, w, b), label=f'w = {w}, b = {b}')
plt.title('Effect of Varying b (with w = 1) on ReLU Activation')
plt.xlabel('x')
plt.ylabel('ReLU(wx+b)')
plt.legend()
plt.grid(True)

plt.show()

ReLU activation function: np.maximum(0, w * x + b).

Single Hidden Layer

\[ y = \sum_{i=1}^N \alpha_i \sigma(w_{1,i} x + b_i) \]

Demonstration with Code

# Defining the function to be approximated

def f(x):
  return 2 * x**3 + 4 * x**2 - 5 * x + 1

# Generating a dataset, x in [-4,2), f(x) as above

X = 6 * np.random.rand(1000, 1) - 4

y = f(X).flatten()

Increasing the Number of Neurons

from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.1, random_state=42)

models = []

sizes = [1, 2, 5, 10, 100]

for i, n in enumerate(sizes):
    models.append(MLPRegressor(hidden_layer_sizes=[n], max_iter=5000, random_state=42))
    models[i].fit(X_train, y_train)

MLPRegressor is a multi-layer perceptron regressor from sklearn. Its default activation function is relu.

Increasing the Number of Neurons

Code
# Create a colormap
colors = plt.colormaps['cool'].resampled(len(sizes))

X_valid = np.sort(X_valid,axis=0)

for i, n in enumerate(sizes):
    y_pred = models[i].predict(X_valid)
    plt.plot(X_valid, y_pred, "-", color=colors(i), label="Number of neurons = {}".format(n))

y_true = f(X_valid)
plt.plot(X_valid, y_true, "r.", label='Actual')

plt.legend()
plt.show()

In the example above, I retained only 10% of the data as the validation set because the function being approximated is straightforward and noise-free. This decision was made to ensure that the true curve does not overshadow the other results.

Increasing the Number of Neurons

Code
for i, n in enumerate(sizes):
    plt.plot(models[i].loss_curve_, "-", color=colors(i), label="Number of neurons = {}".format(n))

plt.title('MLPRegressor Loss Curves')
plt.xlabel('Iterations')
plt.ylabel('Loss')

plt.legend()
plt.show()

As expected, increasing neuron count reduces loss.

Universal Approximation

This video effectively conveys the underlying intuition of the universal approximation theorem. (18m 53s)

The video effectively elucidates key concepts (terminology) in neural networks, including nodes, layers, weights, and activation functions. It demonstrates the process of summing activation outputs from a preceding layer, akin to the aggregation of curves. Additionally, the video illustrates how scaling an output by a weight not only alters the amplitude of a curve but also inverts its orientation when the weight is negative. Moreover, it clearly depicts the function of bias terms in vertically shifting the curve, contingent on the sign of the bias.

Let’s Code

Frameworks

PyTorch and TensorFlow are the leading platforms for deep learning.

  • PyTorch has gained considerable traction in the research community. Initially developed by Meta AI, it is now part of the Linux Foundation.

  • TensorFlow, created by Google, is widely adopted in industry for deploying models in production environments.

Keras

Keras is a high-level API designed to build, train, evaluate, and execute models across various backends, including PyTorch, TensorFlow, and JAX, Google’s high-performance platform.

Keras is powerful enough for most projects.

As highlighted in previous Quotes of the Day, François Chollet, a Google engineer, is the originator and one of the primary developers of the Keras project.

Fashion-MNIST dataset

“Fashion-MNIST is a dataset of Zalando’s article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.”

Attribution: Géron (2022), 10_neural_nets_with_keras.ipynb

Loading

import tensorflow as tf

fashion_mnist = tf.keras.datasets.fashion_mnist.load_data()

(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist

X_train, y_train = X_train_full[:-5000], y_train_full[:-5000]
X_valid, y_valid = X_train_full[-5000:], y_train_full[-5000:]

Setting aside 5000 examples as a validation set.

Exploration

X_train.shape
(55000, 28, 28)

. . .

X_train.dtype
dtype('uint8')

. . .

Transforming the pixel intensities from integers in the range 0 to 255 to floats in the range 0 to 1.

X_train, X_valid, X_test = X_train / 255., X_valid / 255., X_test / 255.

What are these Images Anyway?

plt.figure(figsize=(2, 2))
plt.imshow(X_train[0], cmap="binary")
plt.axis('off')
plt.show()

. . .

y_train
array([9, 0, 0, ..., 9, 0, 2], dtype=uint8)

. . .

Since the labels are integers from 0 to 9, class names will come in handy.

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

First 40 Images

n_rows = 4
n_cols = 10
plt.figure(figsize=(n_cols * 1.2, n_rows * 1.2))
for row in range(n_rows):
    for col in range(n_cols):
        index = n_cols * row + col
        plt.subplot(n_rows, n_cols, index + 1)
        plt.imshow(X_train[index], cmap="binary", interpolation="nearest")
        plt.axis('off')
        plt.title(class_names[y_train[index]])
plt.subplots_adjust(wspace=0.2, hspace=0.5)
plt.show()

Creating a Model

tf.random.set_seed(42)

model = tf.keras.Sequential()

model.add(tf.keras.layers.InputLayer(shape=[28, 28]))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(300, activation="relu"))
model.add(tf.keras.layers.Dense(100, activation="relu"))
model.add(tf.keras.layers.Dense(10, activation="softmax"))

model.summary()

Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ flatten (Flatten)               │ (None, 784)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 300)            │       235,500 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 100)            │        30,100 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 10)             │         1,010 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 266,610 (1.02 MB)
 Trainable params: 266,610 (1.02 MB)
 Non-trainable params: 0 (0.00 B)

As observed, dense (the first hidden layer) has \(235,500\) parameters, while \(784 \times 300 = 235,200\).

Could you explain the origin of the additional parameters?

Similarly, dense_1 has \(30,100\) parameters, while \(300 \times 100 = 30,000\).

Can you explain why?

Creating a Model (Alternative)

Code
# extra code – clear the session to reset the name counters
tf.keras.backend.clear_session()
tf.random.set_seed(42)
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=[28, 28]),
    tf.keras.layers.Dense(300, activation="relu"),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax")
])

model.summary()

Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ flatten (Flatten)               │ (None, 784)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 300)            │       235,500 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 100)            │        30,100 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 10)             │         1,010 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 266,610 (1.02 MB)
 Trainable params: 266,610 (1.02 MB)
 Non-trainable params: 0 (0.00 B)

Compiling the Model

model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd",
              metrics=["accuracy"])

sparse_categorical_crossentropy is the appropriate function for a multiclass classification problem (more later).

The compile method sets the loss function, as well as other parameters such as the optimizer and the metrics to track. Keras then prepares the model for training.
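
A minimal check, with made-up values, of what sparse_categorical_crossentropy expects: integer class labels paired with predicted class probabilities.

import numpy as np
import tensorflow as tf

# Two examples, three classes; labels are integers, not one-hot vectors
y_true = np.array([0, 2])
y_pred = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.2, 0.6]])

loss = tf.keras.losses.sparse_categorical_crossentropy(y_true, y_pred)
print(loss.numpy())  # [-log(0.8), -log(0.6)], approximately [0.22, 0.51]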

Training the Model

history = model.fit(X_train, y_train, epochs=30,
                    validation_data=(X_valid, y_valid))
Epoch 1/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 1ms/step - accuracy: 0.6870 - loss: 0.9771 - val_accuracy: 0.8280 - val_loss: 0.5050
Epoch 2/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 977us/step - accuracy: 0.8254 - loss: 0.5026 - val_accuracy: 0.8410 - val_loss: 0.4554
Epoch 3/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 899us/step - accuracy: 0.8440 - loss: 0.4504 - val_accuracy: 0.8458 - val_loss: 0.4314
Epoch 4/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 870us/step - accuracy: 0.8532 - loss: 0.4208 - val_accuracy: 0.8512 - val_loss: 0.4151
Epoch 5/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 863us/step - accuracy: 0.8613 - loss: 0.3998 - val_accuracy: 0.8536 - val_loss: 0.4039
Epoch 6/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 844us/step - accuracy: 0.8654 - loss: 0.3831 - val_accuracy: 0.8558 - val_loss: 0.3943
Epoch 7/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 896us/step - accuracy: 0.8697 - loss: 0.3688 - val_accuracy: 0.8568 - val_loss: 0.3855
Epoch 8/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 853us/step - accuracy: 0.8738 - loss: 0.3564 - val_accuracy: 0.8600 - val_loss: 0.3790
Epoch 9/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 856us/step - accuracy: 0.8768 - loss: 0.3453 - val_accuracy: 0.8608 - val_loss: 0.3734
Epoch 10/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 906us/step - accuracy: 0.8802 - loss: 0.3353 - val_accuracy: 0.8610 - val_loss: 0.3681
Epoch 11/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 875us/step - accuracy: 0.8832 - loss: 0.3262 - val_accuracy: 0.8626 - val_loss: 0.3633
Epoch 12/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 862us/step - accuracy: 0.8861 - loss: 0.3177 - val_accuracy: 0.8638 - val_loss: 0.3601
Epoch 13/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 847us/step - accuracy: 0.8890 - loss: 0.3100 - val_accuracy: 0.8640 - val_loss: 0.3576
Epoch 14/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 20s 12ms/step - accuracy: 0.8750 - loss: 0.2983  62/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 832us/step - accuracy: 0.8986 - loss: 0.2748 125/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 816us/step - accuracy: 0.8948 - loss: 0.2874 188/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 811us/step - accuracy: 0.8922 - loss: 0.2957 252/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 805us/step - accuracy: 0.8913 - loss: 0.2987 315/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 803us/step - accuracy: 0.8908 - loss: 0.3004 378/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 802us/step - accuracy: 0.8906 - loss: 0.3016 441/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 801us/step - accuracy: 0.8904 - loss: 0.3027 503/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 802us/step - accuracy: 0.8903 - loss: 0.3033 567/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 801us/step - accuracy: 0.8903 - loss: 0.3036 630/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 800us/step - accuracy: 0.8903 - loss: 0.3037 694/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8903 - loss: 0.3037 757/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 800us/step - accuracy: 0.8903 - loss: 0.3037 821/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8903 - loss: 0.3038 885/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 798us/step - accuracy: 0.8903 - loss: 0.3039 948/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8904 - loss: 0.30401011/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8904 - loss: 0.30391074/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8905 - loss: 0.30381137/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8906 - loss: 0.30361201/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8907 - loss: 0.30351263/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8907 - loss: 0.30341325/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 800us/step - accuracy: 0.8908 - loss: 0.30331387/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 800us/step - accuracy: 0.8909 - loss: 0.30321450/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 800us/step - accuracy: 0.8910 - loss: 0.30301514/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8910 - loss: 0.30291577/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8911 - loss: 0.30281638/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 800us/step - accuracy: 0.8911 - loss: 0.30271700/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 801us/step - accuracy: 0.8911 - loss: 0.30271719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 865us/step - accuracy: 0.8911 - loss: 0.3026 - val_accuracy: 0.8650 - val_loss: 0.3559
Epoch 15/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 21s 13ms/step - accuracy: 0.8750 - loss: 0.2953  59/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 877us/step - accuracy: 0.8994 - loss: 0.2685 117/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 875us/step - accuracy: 0.8964 - loss: 0.2796 177/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 861us/step - accuracy: 0.8939 - loss: 0.2886 239/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 849us/step - accuracy: 0.8930 - loss: 0.2918 302/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 838us/step - accuracy: 0.8926 - loss: 0.2936 365/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 831us/step - accuracy: 0.8925 - loss: 0.2947 426/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 830us/step - accuracy: 0.8923 - loss: 0.2958 488/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 827us/step - accuracy: 0.8923 - loss: 0.2964 548/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 828us/step - accuracy: 0.8923 - loss: 0.2968 609/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 828us/step - accuracy: 0.8923 - loss: 0.2969 670/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 828us/step - accuracy: 0.8924 - loss: 0.2968 728/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 831us/step - accuracy: 0.8924 - loss: 0.2968 789/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 830us/step - accuracy: 0.8924 - loss: 0.2969 850/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 830us/step - accuracy: 0.8924 - loss: 0.2970 912/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 829us/step - accuracy: 0.8924 - loss: 0.2971 976/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 826us/step - accuracy: 0.8925 - loss: 0.29711039/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 824us/step - accuracy: 0.8926 - loss: 0.29701104/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 821us/step - accuracy: 0.8927 - loss: 0.29691168/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 820us/step - accuracy: 0.8928 - loss: 0.29671234/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 817us/step - accuracy: 0.8928 - loss: 0.29661300/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 814us/step - accuracy: 0.8929 - loss: 0.29651364/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 813us/step - accuracy: 0.8930 - loss: 0.29641429/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.8931 - loss: 0.29631492/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.8931 - loss: 0.29621555/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 810us/step - accuracy: 0.8932 - loss: 0.29601619/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 810us/step - accuracy: 0.8932 - loss: 0.29591684/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 808us/step - accuracy: 0.8933 - loss: 0.29591719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 866us/step - accuracy: 0.8933 - loss: 0.2958 - val_accuracy: 0.8666 - val_loss: 0.3539
Epoch 16/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 20s 12ms/step - accuracy: 0.9375 - loss: 0.2852  62/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 822us/step - accuracy: 0.9055 - loss: 0.2632 125/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 813us/step - accuracy: 0.9001 - loss: 0.2749 190/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 798us/step - accuracy: 0.8971 - loss: 0.2829 256/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 790us/step - accuracy: 0.8959 - loss: 0.2858 320/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 790us/step - accuracy: 0.8953 - loss: 0.2873 385/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 788us/step - accuracy: 0.8950 - loss: 0.2885 447/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 791us/step - accuracy: 0.8948 - loss: 0.2895 509/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 793us/step - accuracy: 0.8946 - loss: 0.2901 573/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 793us/step - accuracy: 0.8946 - loss: 0.2904 634/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 796us/step - accuracy: 0.8946 - loss: 0.2904 695/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8947 - loss: 0.2903 758/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8947 - loss: 0.2904 822/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 797us/step - accuracy: 0.8947 - loss: 0.2904 884/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 798us/step - accuracy: 0.8947 - loss: 0.2906 945/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 800us/step - accuracy: 0.8947 - loss: 0.29061004/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 803us/step - accuracy: 0.8947 - loss: 0.29061064/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.8948 - loss: 0.29051130/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 803us/step - accuracy: 0.8948 - loss: 0.29031195/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 801us/step - accuracy: 0.8949 - loss: 0.29021258/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 801us/step - accuracy: 0.8950 - loss: 0.29011321/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 801us/step - accuracy: 0.8950 - loss: 0.29001387/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8951 - loss: 0.28991451/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8952 - loss: 0.28981514/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8952 - loss: 0.28971577/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8952 - loss: 0.28961640/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 799us/step - accuracy: 0.8953 - loss: 0.28951704/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 798us/step - accuracy: 0.8953 - loss: 0.28941719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 858us/step - accuracy: 0.8953 - loss: 0.2894 - val_accuracy: 0.8684 - val_loss: 0.3521
Epoch 17/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 21s 12ms/step - accuracy: 0.9062 - loss: 0.2779  60/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 858us/step - accuracy: 0.9059 - loss: 0.2570 117/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 868us/step - accuracy: 0.9015 - loss: 0.2673 141/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.9001 - loss: 0.2715   197/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8980 - loss: 0.2770 255/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 989us/step - accuracy: 0.8973 - loss: 0.2794 313/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 966us/step - accuracy: 0.8971 - loss: 0.2809 370/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 953us/step - accuracy: 0.8969 - loss: 0.2819 428/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 941us/step - accuracy: 0.8968 - loss: 0.2830 486/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 932us/step - accuracy: 0.8967 - loss: 0.2836 543/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 927us/step - accuracy: 0.8966 - loss: 0.2840 601/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 922us/step - accuracy: 0.8966 - loss: 0.2841 660/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 916us/step - accuracy: 0.8966 - loss: 0.2840 719/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 911us/step - accuracy: 0.8967 - loss: 0.2840 780/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 904us/step - accuracy: 0.8967 - loss: 0.2841 841/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 898us/step - accuracy: 0.8967 - loss: 0.2842 904/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 891us/step - accuracy: 0.8967 - loss: 0.2843 967/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 886us/step - accuracy: 0.8968 - loss: 0.28431029/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 881us/step - accuracy: 0.8968 - loss: 0.28431088/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 880us/step - accuracy: 0.8969 - loss: 0.28411149/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 877us/step - accuracy: 0.8970 - loss: 0.28401209/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 875us/step - accuracy: 0.8970 - loss: 0.28391268/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 874us/step - accuracy: 0.8971 - loss: 0.28381329/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 872us/step - accuracy: 0.8971 - loss: 0.28371391/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 869us/step - accuracy: 0.8972 - loss: 0.28361452/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 867us/step - accuracy: 0.8973 - loss: 0.28351513/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 865us/step - accuracy: 0.8973 - loss: 0.28341574/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 864us/step - accuracy: 0.8974 - loss: 0.28331638/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 861us/step - accuracy: 0.8974 - loss: 0.28321700/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 859us/step - accuracy: 0.8974 - loss: 0.28321719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 918us/step - accuracy: 0.8974 - loss: 0.2832 - val_accuracy: 0.8688 - val_loss: 0.3504
Epoch 18/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 20s 12ms/step - accuracy: 0.9062 - loss: 0.2749  63/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 817us/step - accuracy: 0.9087 - loss: 0.2517 127/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 802us/step - accuracy: 0.9034 - loss: 0.2630 192/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 794us/step - accuracy: 0.9003 - loss: 0.2705 254/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 799us/step - accuracy: 0.8994 - loss: 0.2732 314/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 808us/step - accuracy: 0.8990 - loss: 0.2747 372/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 818us/step - accuracy: 0.8988 - loss: 0.2758 401/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 886us/step - accuracy: 0.8986 - loss: 0.2764 462/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 879us/step - accuracy: 0.8985 - loss: 0.2772 523/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 873us/step - accuracy: 0.8984 - loss: 0.2777 585/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 867us/step - accuracy: 0.8984 - loss: 0.2779 646/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 863us/step - accuracy: 0.8984 - loss: 0.2779 709/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 858us/step - accuracy: 0.8985 - loss: 0.2779 769/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 856us/step - accuracy: 0.8986 - loss: 0.2780 829/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 855us/step - accuracy: 0.8986 - loss: 0.2781 887/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 856us/step - accuracy: 0.8986 - loss: 0.2782 948/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 854us/step - accuracy: 0.8986 - loss: 0.27831009/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 852us/step - accuracy: 0.8987 - loss: 0.27831071/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 850us/step - accuracy: 0.8987 - loss: 0.27821133/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 848us/step - accuracy: 0.8988 - loss: 0.27801195/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 846us/step - accuracy: 0.8989 - loss: 0.27791258/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 844us/step - accuracy: 0.8989 - loss: 0.27781321/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 842us/step - accuracy: 0.8990 - loss: 0.27781383/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 840us/step - accuracy: 0.8991 - loss: 0.27771446/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 839us/step - accuracy: 0.8992 - loss: 0.27761509/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 837us/step - accuracy: 0.8992 - loss: 0.27751572/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 836us/step - accuracy: 0.8993 - loss: 0.27741635/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 834us/step - accuracy: 0.8993 - loss: 0.27731697/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 833us/step - accuracy: 0.8993 - loss: 0.27731719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 892us/step - accuracy: 0.8993 - loss: 0.2772 - val_accuracy: 0.8700 - val_loss: 0.3484
Epoch 19/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 20s 12ms/step - accuracy: 0.9062 - loss: 0.2653  62/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 832us/step - accuracy: 0.9093 - loss: 0.2449 125/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 813us/step - accuracy: 0.9048 - loss: 0.2562 185/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 820us/step - accuracy: 0.9022 - loss: 0.2638 247/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 817us/step - accuracy: 0.9012 - loss: 0.2668 309/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 818us/step - accuracy: 0.9009 - loss: 0.2684 373/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 813us/step - accuracy: 0.9007 - loss: 0.2696 434/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 814us/step - accuracy: 0.9005 - loss: 0.2707 497/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 813us/step - accuracy: 0.9004 - loss: 0.2714 559/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 813us/step - accuracy: 0.9003 - loss: 0.2718 621/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 813us/step - accuracy: 0.9004 - loss: 0.2719 685/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 810us/step - accuracy: 0.9004 - loss: 0.2719 749/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 808us/step - accuracy: 0.9005 - loss: 0.2720 809/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 810us/step - accuracy: 0.9005 - loss: 0.2721 872/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 809us/step - accuracy: 0.9006 - loss: 0.2723 935/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 809us/step - accuracy: 0.9006 - loss: 0.2724 997/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 809us/step - accuracy: 0.9006 - loss: 0.27241059/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 809us/step - accuracy: 0.9007 - loss: 0.27231121/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 809us/step - accuracy: 0.9007 - loss: 0.27221184/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 809us/step - accuracy: 0.9008 - loss: 0.27211247/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 809us/step - accuracy: 0.9008 - loss: 0.27211309/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 809us/step - accuracy: 0.9009 - loss: 0.27201371/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 809us/step - accuracy: 0.9009 - loss: 0.27191433/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 810us/step - accuracy: 0.9010 - loss: 0.27181495/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 810us/step - accuracy: 0.9011 - loss: 0.27171556/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.9011 - loss: 0.27171618/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.9011 - loss: 0.27161680/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.9011 - loss: 0.27161719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 872us/step - accuracy: 0.9011 - loss: 0.2715 - val_accuracy: 0.8696 - val_loss: 0.3477
Epoch 20/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 20s 12ms/step - accuracy: 0.9062 - loss: 0.2643  60/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 849us/step - accuracy: 0.9122 - loss: 0.2403 122/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 828us/step - accuracy: 0.9072 - loss: 0.2509 185/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 819us/step - accuracy: 0.9043 - loss: 0.2588 247/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 818us/step - accuracy: 0.9033 - loss: 0.2617 310/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 814us/step - accuracy: 0.9030 - loss: 0.2632 372/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 814us/step - accuracy: 0.9028 - loss: 0.2644 434/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 813us/step - accuracy: 0.9027 - loss: 0.2655 497/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.9026 - loss: 0.2661 559/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.9026 - loss: 0.2665 622/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 810us/step - accuracy: 0.9027 - loss: 0.2666 684/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 810us/step - accuracy: 0.9028 - loss: 0.2666 748/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 808us/step - accuracy: 0.9029 - loss: 0.2667 810/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 808us/step - accuracy: 0.9030 - loss: 0.2668 874/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 806us/step - accuracy: 0.9030 - loss: 0.2670 936/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 807us/step - accuracy: 0.9031 - loss: 0.26711000/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.9031 - loss: 0.26711063/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.9032 - loss: 0.26701125/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.9033 - loss: 0.26691188/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.9033 - loss: 0.26681250/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.9034 - loss: 0.26681311/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 806us/step - accuracy: 0.9034 - loss: 0.26671376/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.9035 - loss: 0.26661438/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.9036 - loss: 0.26651502/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 804us/step - accuracy: 0.9036 - loss: 0.26651566/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 804us/step - accuracy: 0.9037 - loss: 0.26641628/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 804us/step - accuracy: 0.9037 - loss: 0.26641689/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.9037 - loss: 0.26631719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 865us/step - accuracy: 0.9037 - loss: 0.2663 - val_accuracy: 0.8702 - val_loss: 0.3477
Epoch 21/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 20s 12ms/step - accuracy: 0.9062 - loss: 0.2537  62/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 823us/step - accuracy: 0.9128 - loss: 0.2355 125/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 809us/step - accuracy: 0.9081 - loss: 0.2463 189/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 801us/step - accuracy: 0.9060 - loss: 0.2537 251/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 803us/step - accuracy: 0.9055 - loss: 0.2565 315/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 800us/step - accuracy: 0.9054 - loss: 0.2581 377/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 803us/step - accuracy: 0.9054 - loss: 0.2592 440/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 802us/step - accuracy: 0.9053 - loss: 0.2603 504/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 800us/step - accuracy: 0.9053 - loss: 0.2610 567/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 800us/step - accuracy: 0.9053 - loss: 0.2613 632/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 797us/step - accuracy: 0.9054 - loss: 0.2614 695/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 797us/step - accuracy: 0.9056 - loss: 0.2614 758/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 798us/step - accuracy: 0.9056 - loss: 0.2615 822/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 797us/step - accuracy: 0.9057 - loss: 0.2617 886/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 796us/step - accuracy: 0.9057 - loss: 0.2619 950/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 796us/step - accuracy: 0.9058 - loss: 0.26201015/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 795us/step - accuracy: 0.9058 - loss: 0.26201080/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 793us/step - accuracy: 0.9059 - loss: 0.26191145/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 793us/step - accuracy: 0.9060 - loss: 0.26181211/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 791us/step - accuracy: 0.9060 - loss: 0.26171274/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 792us/step - accuracy: 0.9060 - loss: 0.26171335/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 793us/step - accuracy: 0.9061 - loss: 0.26161399/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 793us/step - accuracy: 0.9061 - loss: 0.26151465/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 791us/step - accuracy: 0.9062 - loss: 0.26141528/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 792us/step - accuracy: 0.9062 - loss: 0.26141592/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 792us/step - accuracy: 0.9062 - loss: 0.26131656/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 792us/step - accuracy: 0.9062 - loss: 0.26131719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 850us/step - accuracy: 0.9062 - loss: 0.2612 - val_accuracy: 0.8708 - val_loss: 0.3487
Epoch 22/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 21s 12ms/step - accuracy: 0.9375 - loss: 0.2531  63/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 817us/step - accuracy: 0.9157 - loss: 0.2318 123/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 826us/step - accuracy: 0.9107 - loss: 0.2415 184/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 825us/step - accuracy: 0.9080 - loss: 0.2488 246/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 823us/step - accuracy: 0.9072 - loss: 0.2517 308/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 820us/step - accuracy: 0.9070 - loss: 0.2532 370/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 818us/step - accuracy: 0.9069 - loss: 0.2543 431/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 819us/step - accuracy: 0.9067 - loss: 0.2553 493/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 818us/step - accuracy: 0.9067 - loss: 0.2560 555/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 817us/step - accuracy: 0.9067 - loss: 0.2564 616/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 818us/step - accuracy: 0.9068 - loss: 0.2565 678/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 817us/step - accuracy: 0.9069 - loss: 0.2565 739/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 818us/step - accuracy: 0.9070 - loss: 0.2566 801/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 817us/step - accuracy: 0.9071 - loss: 0.2568 865/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 815us/step - accuracy: 0.9072 - loss: 0.2569 928/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 814us/step - accuracy: 0.9072 - loss: 0.2571 993/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.9073 - loss: 0.25711053/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 813us/step - accuracy: 0.9073 - loss: 0.25701114/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 814us/step - accuracy: 0.9074 - loss: 0.25691178/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 812us/step - accuracy: 0.9074 - loss: 0.25691241/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 812us/step - accuracy: 0.9075 - loss: 0.25681304/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.9075 - loss: 0.25671368/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.9076 - loss: 0.25671429/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.9076 - loss: 0.25661487/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 813us/step - accuracy: 0.9077 - loss: 0.25651547/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 814us/step - accuracy: 0.9077 - loss: 0.25651607/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 815us/step - accuracy: 0.9077 - loss: 0.25641669/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 815us/step - accuracy: 0.9077 - loss: 0.25641719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 874us/step - accuracy: 0.9077 - loss: 0.2564 - val_accuracy: 0.8718 - val_loss: 0.3475
Epoch 23/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 21s 12ms/step - accuracy: 0.9375 - loss: 0.2423  62/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 822us/step - accuracy: 0.9186 - loss: 0.2269 125/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 808us/step - accuracy: 0.9138 - loss: 0.2371 188/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 807us/step - accuracy: 0.9112 - loss: 0.2443 250/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 807us/step - accuracy: 0.9105 - loss: 0.2471 310/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 812us/step - accuracy: 0.9102 - loss: 0.2484 369/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 819us/step - accuracy: 0.9101 - loss: 0.2494 430/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 820us/step - accuracy: 0.9099 - loss: 0.2505 489/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 824us/step - accuracy: 0.9099 - loss: 0.2511 548/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 827us/step - accuracy: 0.9099 - loss: 0.2516 607/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 830us/step - accuracy: 0.9099 - loss: 0.2517 666/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 832us/step - accuracy: 0.9100 - loss: 0.2517 725/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 834us/step - accuracy: 0.9101 - loss: 0.2518 790/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 829us/step - accuracy: 0.9101 - loss: 0.2519 855/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 825us/step - accuracy: 0.9101 - loss: 0.2521 915/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 826us/step - accuracy: 0.9101 - loss: 0.2523 975/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 826us/step - accuracy: 0.9101 - loss: 0.25231036/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 826us/step - accuracy: 0.9101 - loss: 0.25231098/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 825us/step - accuracy: 0.9102 - loss: 0.25221160/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 824us/step - accuracy: 0.9102 - loss: 0.25211223/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 823us/step - accuracy: 0.9102 - loss: 0.25211286/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 822us/step - accuracy: 0.9102 - loss: 0.25201346/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 823us/step - accuracy: 0.9102 - loss: 0.25191410/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 821us/step - accuracy: 0.9102 - loss: 0.25191471/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 821us/step - accuracy: 0.9103 - loss: 0.25181534/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 821us/step - accuracy: 0.9103 - loss: 0.25171598/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 819us/step - accuracy: 0.9103 - loss: 0.25171660/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 819us/step - accuracy: 0.9103 - loss: 0.25171719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 879us/step - accuracy: 0.9102 - loss: 0.2516 - val_accuracy: 0.8718 - val_loss: 0.3480
Epoch 24/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 20s 12ms/step - accuracy: 0.9375 - loss: 0.2339  61/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 836us/step - accuracy: 0.9191 - loss: 0.2218 124/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 816us/step - accuracy: 0.9147 - loss: 0.2321 186/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 815us/step - accuracy: 0.9123 - loss: 0.2392 248/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 814us/step - accuracy: 0.9115 - loss: 0.2421 310/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 814us/step - accuracy: 0.9112 - loss: 0.2436 373/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 811us/step - accuracy: 0.9111 - loss: 0.2446 439/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 805us/step - accuracy: 0.9110 - loss: 0.2458 503/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 803us/step - accuracy: 0.9110 - loss: 0.2464 564/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 806us/step - accuracy: 0.9110 - loss: 0.2468 622/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 812us/step - accuracy: 0.9111 - loss: 0.2469 681/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 816us/step - accuracy: 0.9112 - loss: 0.2470 743/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 816us/step - accuracy: 0.9113 - loss: 0.2471 804/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 817us/step - accuracy: 0.9113 - loss: 0.2472 867/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 815us/step - accuracy: 0.9113 - loss: 0.2474 932/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 812us/step - accuracy: 0.9114 - loss: 0.2476 995/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.9114 - loss: 0.24761058/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.9114 - loss: 0.24761123/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 809us/step - accuracy: 0.9114 - loss: 0.24751183/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 810us/step - accuracy: 0.9115 - loss: 0.24741245/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 810us/step - accuracy: 0.9115 - loss: 0.24741308/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 810us/step - accuracy: 0.9115 - loss: 0.24731371/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 809us/step - accuracy: 0.9115 - loss: 0.24731435/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 808us/step - accuracy: 0.9115 - loss: 0.24721500/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 807us/step - accuracy: 0.9116 - loss: 0.24711564/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 806us/step - accuracy: 0.9116 - loss: 0.24711627/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 806us/step - accuracy: 0.9116 - loss: 0.24701689/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 806us/step - accuracy: 0.9116 - loss: 0.24701719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 866us/step - accuracy: 0.9115 - loss: 0.2470 - val_accuracy: 0.8720 - val_loss: 0.3480
Epoch 25/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 20s 12ms/step - accuracy: 0.9375 - loss: 0.2264  61/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 839us/step - accuracy: 0.9210 - loss: 0.2172 125/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 813us/step - accuracy: 0.9166 - loss: 0.2275 189/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 803us/step - accuracy: 0.9143 - loss: 0.2347 251/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 805us/step - accuracy: 0.9137 - loss: 0.2376 314/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 803us/step - accuracy: 0.9136 - loss: 0.2390 379/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 797us/step - accuracy: 0.9135 - loss: 0.2402 443/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 795us/step - accuracy: 0.9134 - loss: 0.2413 507/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 794us/step - accuracy: 0.9133 - loss: 0.2419 569/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 795us/step - accuracy: 0.9133 - loss: 0.2423 632/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 796us/step - accuracy: 0.9134 - loss: 0.2425 698/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 793us/step - accuracy: 0.9135 - loss: 0.2425 763/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 792us/step - accuracy: 0.9136 - loss: 0.2427 826/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 793us/step - accuracy: 0.9136 - loss: 0.2428 888/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 794us/step - accuracy: 0.9136 - loss: 0.2431 952/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 794us/step - accuracy: 0.9136 - loss: 0.24321018/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 791us/step - accuracy: 0.9136 - loss: 0.24321085/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 789us/step - accuracy: 0.9137 - loss: 0.24311151/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 787us/step - accuracy: 0.9137 - loss: 0.24311217/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 786us/step - accuracy: 0.9137 - loss: 0.24301283/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 785us/step - accuracy: 0.9137 - loss: 0.24301348/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 784us/step - accuracy: 0.9137 - loss: 0.24291415/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 783us/step - accuracy: 0.9138 - loss: 0.24281480/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 782us/step - accuracy: 0.9138 - loss: 0.24271545/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 782us/step - accuracy: 0.9138 - loss: 0.24271609/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 782us/step - accuracy: 0.9138 - loss: 0.24271674/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 782us/step - accuracy: 0.9138 - loss: 0.24261719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 843us/step - accuracy: 0.9137 - loss: 0.2426 - val_accuracy: 0.8730 - val_loss: 0.3477
Epoch 26/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 19s 12ms/step - accuracy: 0.9375 - loss: 0.2250  62/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 826us/step - accuracy: 0.9221 - loss: 0.2132 125/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 811us/step - accuracy: 0.9181 - loss: 0.2231 187/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 813us/step - accuracy: 0.9159 - loss: 0.2301 250/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 809us/step - accuracy: 0.9153 - loss: 0.2331 316/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 799us/step - accuracy: 0.9150 - loss: 0.2346 380/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 796us/step - accuracy: 0.9149 - loss: 0.2358 445/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 793us/step - accuracy: 0.9148 - loss: 0.2369 508/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 795us/step - accuracy: 0.9147 - loss: 0.2375 574/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 791us/step - accuracy: 0.9147 - loss: 0.2379 638/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 790us/step - accuracy: 0.9148 - loss: 0.2381 702/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 790us/step - accuracy: 0.9149 - loss: 0.2381 767/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 789us/step - accuracy: 0.9149 - loss: 0.2383 831/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 788us/step - accuracy: 0.9150 - loss: 0.2385 893/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 790us/step - accuracy: 0.9150 - loss: 0.2387 956/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 791us/step - accuracy: 0.9150 - loss: 0.23881021/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 790us/step - accuracy: 0.9150 - loss: 0.23891083/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 791us/step - accuracy: 0.9150 - loss: 0.23881146/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 792us/step - accuracy: 0.9150 - loss: 0.23871210/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 791us/step - accuracy: 0.9150 - loss: 0.23871274/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 791us/step - accuracy: 0.9150 - loss: 0.23861335/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 793us/step - accuracy: 0.9151 - loss: 0.23861398/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 793us/step - accuracy: 0.9151 - loss: 0.23851462/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 793us/step - accuracy: 0.9151 - loss: 0.23841523/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 795us/step - accuracy: 0.9151 - loss: 0.23841587/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 794us/step - accuracy: 0.9151 - loss: 0.23841651/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 794us/step - accuracy: 0.9151 - loss: 0.23831713/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 795us/step - accuracy: 0.9150 - loss: 0.23831719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 855us/step - accuracy: 0.9150 - loss: 0.2383 - val_accuracy: 0.8740 - val_loss: 0.3481
Epoch 27/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 20s 12ms/step - accuracy: 0.9375 - loss: 0.2117  62/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 827us/step - accuracy: 0.9264 - loss: 0.2087 125/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 815us/step - accuracy: 0.9217 - loss: 0.2188 185/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 823us/step - accuracy: 0.9194 - loss: 0.2257 245/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 828us/step - accuracy: 0.9187 - loss: 0.2286 307/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 826us/step - accuracy: 0.9184 - loss: 0.2302 368/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 825us/step - accuracy: 0.9183 - loss: 0.2313 432/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 819us/step - accuracy: 0.9181 - loss: 0.2324 492/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 822us/step - accuracy: 0.9179 - loss: 0.2331 542/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 839us/step - accuracy: 0.9178 - loss: 0.2335 597/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 846us/step - accuracy: 0.9178 - loss: 0.2338 648/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 857us/step - accuracy: 0.9178 - loss: 0.2338 687/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 900us/step - accuracy: 0.9178 - loss: 0.2339 701/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 954us/step - accuracy: 0.9178 - loss: 0.2339 735/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 978us/step - accuracy: 0.9178 - loss: 0.2340 772/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 998us/step - accuracy: 0.9178 - loss: 0.2341 787/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9178 - loss: 0.2342   815/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9178 - loss: 0.2342 867/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9177 - loss: 0.2344 886/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9177 - loss: 0.2345 938/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9177 - loss: 0.2346 998/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9177 - loss: 0.23471058/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9176 - loss: 0.23461116/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9176 - loss: 0.23461177/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9176 - loss: 0.23451237/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9176 - loss: 0.23451294/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9176 - loss: 0.23441354/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9176 - loss: 0.23441414/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9176 - loss: 0.23431472/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9175 - loss: 0.23431529/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 999us/step - accuracy: 0.9175 - loss: 0.23431576/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9175 - loss: 0.2342  1629/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1000us/step - accuracy: 0.9175 - loss: 0.23421672/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9174 - loss: 0.2342   1719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 1ms/step - accuracy: 0.9174 - loss: 0.2342 - val_accuracy: 0.8736 - val_loss: 0.3492
Epoch 28/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 22s 13ms/step - accuracy: 0.9375 - loss: 0.2105  58/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 884us/step - accuracy: 0.9285 - loss: 0.2034 118/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 860us/step - accuracy: 0.9244 - loss: 0.2130 179/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 848us/step - accuracy: 0.9216 - loss: 0.2207 239/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 845us/step - accuracy: 0.9208 - loss: 0.2239 300/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 840us/step - accuracy: 0.9205 - loss: 0.2256 364/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 832us/step - accuracy: 0.9203 - loss: 0.2267 424/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 833us/step - accuracy: 0.9201 - loss: 0.2279 485/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 833us/step - accuracy: 0.9199 - loss: 0.2286 547/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 831us/step - accuracy: 0.9197 - loss: 0.2292 607/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 832us/step - accuracy: 0.9197 - loss: 0.2294 668/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 832us/step - accuracy: 0.9197 - loss: 0.2295 729/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 831us/step - accuracy: 0.9197 - loss: 0.2297 791/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 829us/step - accuracy: 0.9196 - loss: 0.2299 852/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 829us/step - accuracy: 0.9196 - loss: 0.2301 916/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 826us/step - accuracy: 0.9195 - loss: 0.2303 980/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 824us/step - accuracy: 0.9195 - loss: 0.23041040/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 824us/step - accuracy: 0.9194 - loss: 0.23041101/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 824us/step - accuracy: 0.9194 - loss: 0.23031165/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 822us/step - accuracy: 0.9194 - loss: 0.23031226/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 822us/step - accuracy: 0.9193 - loss: 0.23031287/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 823us/step - accuracy: 0.9193 - loss: 0.23021347/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 823us/step - accuracy: 0.9193 - loss: 0.23021408/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 823us/step - accuracy: 0.9193 - loss: 0.23011469/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 823us/step - accuracy: 0.9193 - loss: 0.23011530/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 823us/step - accuracy: 0.9192 - loss: 0.23001590/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 824us/step - accuracy: 0.9192 - loss: 0.23001651/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 824us/step - accuracy: 0.9192 - loss: 0.23001712/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 824us/step - accuracy: 0.9191 - loss: 0.23001719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 884us/step - accuracy: 0.9191 - loss: 0.2300 - val_accuracy: 0.8728 - val_loss: 0.3507
Epoch 29/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 21s 13ms/step - accuracy: 0.9375 - loss: 0.2022  61/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 835us/step - accuracy: 0.9302 - loss: 0.1998 124/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 818us/step - accuracy: 0.9259 - loss: 0.2099 187/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 812us/step - accuracy: 0.9234 - loss: 0.2171 250/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 809us/step - accuracy: 0.9228 - loss: 0.2202 313/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 808us/step - accuracy: 0.9224 - loss: 0.2218 375/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 808us/step - accuracy: 0.9222 - loss: 0.2230 437/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 809us/step - accuracy: 0.9219 - loss: 0.2241 498/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.9217 - loss: 0.2248 559/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 813us/step - accuracy: 0.9216 - loss: 0.2253 622/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 811us/step - accuracy: 0.9215 - loss: 0.2255 687/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 808us/step - accuracy: 0.9214 - loss: 0.2256 750/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 807us/step - accuracy: 0.9213 - loss: 0.2258 814/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.9212 - loss: 0.2260 877/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.9212 - loss: 0.2263 940/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.9211 - loss: 0.22641001/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 806us/step - accuracy: 0.9210 - loss: 0.22651062/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 807us/step - accuracy: 0.9209 - loss: 0.22651125/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 807us/step - accuracy: 0.9209 - loss: 0.22641188/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 806us/step - accuracy: 0.9209 - loss: 0.22641252/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.9208 - loss: 0.22641315/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 805us/step - accuracy: 0.9208 - loss: 0.22631379/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 804us/step - accuracy: 0.9208 - loss: 0.22631445/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 802us/step - accuracy: 0.9208 - loss: 0.22621507/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 803us/step - accuracy: 0.9207 - loss: 0.22621571/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 802us/step - accuracy: 0.9207 - loss: 0.22611633/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 802us/step - accuracy: 0.9206 - loss: 0.22611695/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 803us/step - accuracy: 0.9206 - loss: 0.22611719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 863us/step - accuracy: 0.9206 - loss: 0.2261 - val_accuracy: 0.8716 - val_loss: 0.3510
Epoch 30/30
   1/1719 ━━━━━━━━━━━━━━━━━━━━ 21s 12ms/step - accuracy: 0.9375 - loss: 0.1994  59/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 876us/step - accuracy: 0.9328 - loss: 0.1962 120/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 849us/step - accuracy: 0.9288 - loss: 0.2057 181/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 841us/step - accuracy: 0.9260 - loss: 0.2132 242/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 838us/step - accuracy: 0.9250 - loss: 0.2163 301/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 842us/step - accuracy: 0.9246 - loss: 0.2179 361/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 841us/step - accuracy: 0.9244 - loss: 0.2190 420/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 842us/step - accuracy: 0.9241 - loss: 0.2201 478/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 845us/step - accuracy: 0.9239 - loss: 0.2208 539/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 844us/step - accuracy: 0.9236 - loss: 0.2213 601/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 840us/step - accuracy: 0.9235 - loss: 0.2216 663/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 837us/step - accuracy: 0.9234 - loss: 0.2217 725/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 835us/step - accuracy: 0.9233 - loss: 0.2219 786/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 835us/step - accuracy: 0.9232 - loss: 0.2221 844/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 837us/step - accuracy: 0.9231 - loss: 0.2223 903/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 838us/step - accuracy: 0.9229 - loss: 0.2225 962/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 839us/step - accuracy: 0.9229 - loss: 0.22261021/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 840us/step - accuracy: 0.9228 - loss: 0.22261080/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 840us/step - accuracy: 0.9227 - loss: 0.22261138/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 841us/step - accuracy: 0.9227 - loss: 0.22261199/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 840us/step - accuracy: 0.9226 - loss: 0.22251261/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 839us/step - accuracy: 0.9225 - loss: 0.22251321/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 839us/step - accuracy: 0.9225 - loss: 0.22241378/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 841us/step - accuracy: 0.9225 - loss: 0.22241436/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 842us/step - accuracy: 0.9225 - loss: 0.22231497/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 842us/step - accuracy: 0.9224 - loss: 0.22231559/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 841us/step - accuracy: 0.9224 - loss: 0.22231621/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 840us/step - accuracy: 0.9224 - loss: 0.22231681/1719 ━━━━━━━━━━━━━━━━━━━━ 0s 840us/step - accuracy: 0.9223 - loss: 0.22231719/1719 ━━━━━━━━━━━━━━━━━━━━ 2s 901us/step - accuracy: 0.9223 - loss: 0.2223 - val_accuracy: 0.8726 - val_loss: 0.3522

The model is provided with both a training set and a validation set. During each epoch, Keras reports the running loss and accuracy on the training set, and at the end of each epoch it evaluates the model on the validation set. This also lets us visualize the accuracy and loss curves for both sets (more on this later).

When calling the fit method in Keras (or similar frameworks), each step corresponds to the processing of one mini-batch. A mini-batch is a small subset of the training data; at each step, the model updates its weights based on the error computed on that mini-batch.

An epoch is defined as one complete pass through the entire training dataset. During an epoch, the model processes multiple mini-batches until it has seen all the training data once. This process is repeated for a specified number of epochs to optimize the model’s performance.
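As a concrete check, the step counter in the logs above follows directly from this arithmetic, assuming the setup used in this example: a 55,000-image training split (60,000 Fashion-MNIST images minus 5,000 held out for validation) and Keras's default batch size of 32.

import math

n_train = 55_000  # assumed training-split size for this example
batch_size = 32   # Keras's default batch size for model.fit

print(math.ceil(n_train / batch_size))  # 1719 steps per epoch, as shown in the logs above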

Visualization

import pandas as pd
import matplotlib.pyplot as plt

# history.history maps each recorded quantity (loss, accuracy,
# val_loss, val_accuracy) to its per-epoch values
pd.DataFrame(history.history).plot(
    figsize=(8, 5), xlim=[0, 29], ylim=[0, 1], grid=True, xlabel="Epoch",
    style=["r--", "r--.", "b-", "b-*"])
plt.legend(loc="lower left")
plt.show()

Evaluating the Model on our Test Set

model.evaluate(X_test, y_test)
313/313 ━━━━━━━━━━━━━━━━━━━━ 0s 493us/step - accuracy: 0.8700 - loss: 0.3788
[0.37556707859039307, 0.8698999881744385]
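The two values returned by evaluate are the loss followed by the metrics declared at compile time, here the accuracy. They can be unpacked directly; a minimal sketch, with verbose=0 to suppress the progress bar:

# verbose=0 suppresses the per-batch progress bar
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print(f"Test loss: {test_loss:.4f} - test accuracy: {test_acc:.4f}")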

Making Predictions

X_new = X_test[:3]
y_proba = model.predict(X_new)
y_proba.round(2)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 30ms/step
array([[0.  , 0.  , 0.  , 0.  , 0.  , 0.4 , 0.  , 0.01, 0.  , 0.59],
       [0.  , 0.  , 1.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ],
       [0.  , 1.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ]],
      dtype=float32)
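Each row of y_proba is a probability distribution over the ten classes; assuming the softmax output layer that is standard for this kind of classifier, the entries of each row sum to one. A quick sanity check:

# Each row should sum to ~1.0, up to floating-point rounding
print(y_proba.sum(axis=1))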

. . .

y_pred = y_proba.argmax(axis=-1)
y_pred
array([9, 2, 1])

. . .

y_new = y_test[:3]
y_new
array([9, 2, 1], dtype=uint8)

For the second and third images, the predictions are unambiguous, with essentially all of the probability assigned to a single class. For the first image, however, the model hesitates between class 5 (sandal, 0.40) and class 9 (ankle boot, 0.59); the most probable class is nonetheless the correct one.
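To make the hesitation on the first image explicit, one can inspect its two most probable classes; a small illustrative snippet:

import numpy as np

# Indices of the two most probable classes for the first test image, highest first
top2 = np.argsort(y_proba[0])[::-1][:2]
print(top2, y_proba[0][top2].round(2))  # expected: [9 5] [0.59 0.4]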

Predicted vs Observed

Code
plt.figure(figsize=(7.2, 2.4))
for index, image in enumerate(X_new):
    plt.subplot(1, 3, index + 1)
    plt.imshow(image, cmap="binary", interpolation="nearest")
    plt.axis('off')
    plt.title(class_names[y_test[index]])
plt.subplots_adjust(wspace=0.2, hspace=0.5)
plt.show()

np.array(class_names)[y_pred]
array(['Ankle boot', 'Pullover', 'Trouser'], dtype='<U11')

Test Set Performance

from sklearn.metrics import classification_report

y_proba = model.predict(X_test)
y_pred = y_proba.argmax(axis=-1)

Test Set Performance

print(classification_report(y_test, y_pred))
              precision    recall  f1-score   support

           0       0.85      0.81      0.83      1000
           1       0.98      0.97      0.98      1000
           2       0.76      0.83      0.79      1000
           3       0.81      0.94      0.87      1000
           4       0.82      0.80      0.81      1000
           5       0.85      0.99      0.91      1000
           6       0.76      0.63      0.69      1000
           7       0.94      0.89      0.92      1000
           8       0.96      0.95      0.96      1000
           9       0.98      0.90      0.94      1000

    accuracy                           0.87     10000
   macro avg       0.87      0.87      0.87     10000
weighted avg       0.87      0.87      0.87     10000
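A confusion matrix complements the per-class report by showing which classes are mistaken for one another; a minimal sketch using scikit-learn:

from sklearn.metrics import confusion_matrix

# Rows correspond to true classes, columns to predicted classes;
# large off-diagonal entries reveal commonly confused garments.
cm = confusion_matrix(y_test, y_pred)
print(cm)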

Prologue

Summary

  • Neural Networks Foundations:
    We introduced bio-inspired computation with neurodes and threshold logic units, outlining the perceptron model and its limitations (e.g., the XOR problem).

  • From Perceptrons to Deep Networks:
    We explained the evolution to multilayer perceptrons (MLPs) and feedforward architectures, emphasizing the critical role of nonlinear activation functions (sigmoid, tanh, ReLU) in enabling gradient-based learning and complex function approximation.

  • Universal Approximation:
    We discussed how even single hidden layer networks can approximate any continuous function on a compact set, highlighting the theoretical underpinning of deep learning.

  • Practical Frameworks and Applications:
Finally, we reviewed leading deep learning frameworks (PyTorch, TensorFlow, Keras) and demonstrated practical model building using the Fashion-MNIST dataset, covering model training, evaluation, and prediction.

3Blue1Brown on Deep Learning

Next lecture

  • Training Deep Learning Models

References

Cybenko, George V. 1989. “Approximation by Superpositions of a Sigmoidal Function.” Mathematics of Control, Signals and Systems 2: 303–14. https://api.semanticscholar.org/CorpusID:3958369.
D’haeseleer, Patrik. 2006. “How Does DNA Sequence Motif Discovery Work?” Nature Biotechnology 24 (8): 959–61. https://doi.org/10.1038/nbt0806-959.
Géron, Aurélien. 2022. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow. 3rd ed. O’Reilly Media, Inc.
Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. Adaptive Computation and Machine Learning. MIT Press. https://dblp.org/rec/books/daglib/0040158.
Hornik, Kurt, Maxwell Stinchcombe, and Halbert White. 1989. “Multilayer Feedforward Networks Are Universal Approximators.” Neural Networks 2 (5): 359–66. https://doi.org/10.1016/0893-6080(89)90020-8.
LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. 2015. “Deep Learning.” Nature 521 (7553): 436–44. https://doi.org/10.1038/nature14539.
LeNail, Alexander. 2019. “NN-SVG: Publication-Ready Neural Network Architecture Schematics.” Journal of Open Source Software 4 (33): 747. https://doi.org/10.21105/joss.00747.
McCulloch, Warren S, and Walter Pitts. 1943. “A logical calculus of the ideas immanent in nervous activity.” The Bulletin of Mathematical Biophysics 5 (4): 115–33. https://doi.org/10.1007/bf02478259.
Minsky, Marvin, and Seymour Papert. 1969. Perceptrons: An Introduction to Computational Geometry. Cambridge, MA, USA: MIT Press.
Rosenblatt, F. 1958. “The perceptron: A probabilistic model for information storage and organization in the brain.” Psychological Review 65 (6): 386–408. https://doi.org/10.1037/h0042519.
Wasserman, WW, and A Sandelin. 2004. “Applied bioinformatics for the identification of regulatory elements.” Nature Reviews Genetics 5 (4): 276–87. https://doi.org/10.1038/nrg1315.
Zou, James, Mikael Huss, Abubakar Abid, Pejman Mohammadi, Ali Torkamani, and Amalio Telenti. 2019. “A Primer on Deep Learning in Genomics.” Nature Genetics 51 (1): 12–18. https://doi.org/10.1038/s41588-018-0295-5.

Marcel Turcotte

Marcel.Turcotte@uOttawa.ca

School of Electrical Engineering and Computer Science (EECS)

University of Ottawa