This tutorial is based on the companion notebook for the excellent book Deep Learning with Python, Second Edition by François Chollet. The original code can be found here.
TensorFlow is an open source platform for machine learning provided by Google (installation tutorial for TensorFlow 2).
Built on top of TensorFlow 2, Keras is a central part of the tightly-connected TensorFlow 2 ecosystem, covering every step of the machine learning workflow, from data management to hyperparameter tuning to deployment solutions. Keras is used by CERN (e.g., at the LHC), NASA and many more scientific organizations around the world. Furthermore, it is one of the most-used deep learning frameworks among top winning teams on Kaggle.
The Sequential model is the most approachable API, since it behaves essentially like a Python list of layers. As such, it is limited to simple (sequential) stacks of layers.
Setup
from tensorflow import keras
from tensorflow.keras import layers
Sequential class
model = keras.Sequential([
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax")
])
Incrementally building
model = keras.Sequential()
model.add(layers.Dense(64, activation="relu"))
model.add(layers.Dense(10, activation="softmax"))
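Because a Sequential model behaves like a list of layers, it also supports removing the last layer with pop(). A minimal sketch (not from the original tutorial):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
model.add(layers.Dense(64, activation="relu"))
model.add(layers.Dense(10, activation="softmax"))
print(len(model.layers))  # 2 layers so far

model.pop()  # removes the last layer, like list.pop()
print(len(model.layers))  # 1 layer remains
```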
Build a model
As input, we use input_shape = (None, 3):
- This means the number of samples per batch is variable (indicated by the None batch size).
- The model will process batches where each sample has shape (3,), i.e. a simple array with 3 values.
model.build(input_shape=(None, 3))
model.weights
[<tf.Variable 'dense_2/kernel:0' shape=(3, 64) dtype=float32, numpy=
array([[ 0.01605183, 0.02497053, -0.05627322, -0.29151097, 0.29616302,
0.07873288, -0.00581038, 0.10026461, 0.14770958, -0.14038883,
-0.00768718, -0.2566623 , -0.17545176, -0.22023511, 0.25317138,
...
0.15829647, -0.24911498, 0.01689771, 0.03298521, -0.02120081,
-0.01399088, -0.24136557, -0.11882029, -0.09802602, -0.01723498,
0.25581425, 0.23705328, -0.11338083, -0.1720188 , -0.08666615,
-0.2735188 , 0.05390176, -0.27297997, -0.11028223, -0.07292457,
0.1069324 , -0.09087694, -0.03540394, -0.29637894, -0.18628278,
-0.17684439, 0.24332768, -0.17426789, -0.13252178, -0.05520386,
-0.22802547, 0.019539 , -0.09935986, -0.21454728, 0.00119129,
-0.03409687, 0.08448008, 0.1251595 , 0.09959957, -0.18673313,
-0.28645173, -0.18432245, -0.16391908, -0.07870634]],
dtype=float32)>,
<tf.Variable 'dense_2/bias:0' shape=(64,) dtype=float32, numpy=
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)>,
<tf.Variable 'dense_3/kernel:0' shape=(64, 10) dtype=float32, numpy=
array([[ 0.27202836, -0.18021867, 0.11120814, 0.04545045, 0.11429551,
-0.18513525, 0.20359612, 0.06546766, 0.2585329 , -0.25263438],
[ 0.07973722, 0.05218545, 0.12635207, 0.2838101 , 0.15396196,
0.20077297, 0.11637965, -0.07823259, 0.10132465, 0.12437168],
...
[ 0.21105409, 0.15429965, 0.25718793, -0.1988354 , -0.04730266,
0.11784518, 0.03440061, 0.06892404, 0.09021738, -0.20241818],
[-0.24422699, -0.22418573, -0.1813173 , 0.09590444, 0.16083878,
-0.0923643 , -0.2800508 , 0.1309481 , -0.18422282, 0.10825163]],
dtype=float32)>,
<tf.Variable 'dense_3/bias:0' shape=(10,) dtype=float32, numpy=array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)>]
Model summary
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_2 (Dense) (None, 64) 256
dense_3 (Dense) (None, 10) 650
=================================================================
Total params: 906
Trainable params: 906
Non-trainable params: 0
_________________________________________________________________
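The parameter counts in the summary can be reproduced by hand: a Dense layer has inputs × units kernel weights plus units bias terms.

```python
# Parameter count for a Dense layer: inputs * units (kernel) + units (bias)
dense_2_params = 3 * 64 + 64    # kernel shape (3, 64) + bias shape (64,) -> 256
dense_3_params = 64 * 10 + 10   # kernel shape (64, 10) + bias shape (10,) -> 650
total_params = dense_2_params + dense_3_params
print(dense_2_params, dense_3_params, total_params)  # 256 650 906
```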
Naming models and layers
model = keras.Sequential(name="my_example_model")
model.add(layers.Dense(64, activation="relu", name="my_first_layer"))
model.add(layers.Dense(10, activation="softmax", name="my_last_layer"))
model.build((None, 3))
model.summary()
Model: "my_example_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
my_first_layer (Dense) (None, 64) 256
my_last_layer (Dense) (None, 10) 650
=================================================================
Total params: 906
Trainable params: 906
Non-trainable params: 0
_________________________________________________________________
Specifying input shape
Use keras.Input to declare the shape of the inputs. Note that the shape argument must be the shape of each sample, not the shape of one batch.
model = keras.Sequential()
model.add(keras.Input(shape=(3,)))
model.add(layers.Dense(64, activation="relu"))
model.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_4 (Dense) (None, 64) 256
=================================================================
Total params: 256
Trainable params: 256
Non-trainable params: 0
_________________________________________________________________
model.add(layers.Dense(10, activation="softmax"))
model.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_4 (Dense) (None, 64) 256
dense_5 (Dense) (None, 10) 650
=================================================================
Total params: 906
Trainable params: 906
Non-trainable params: 0
_________________________________________________________________
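With the input shape declared up front, the model can be called directly on batches of any size. A quick sketch (the random batch and its size of 5 are illustrative, not from the original tutorial):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
model.add(keras.Input(shape=(3,)))
model.add(layers.Dense(64, activation="relu"))
model.add(layers.Dense(10, activation="softmax"))

# Any batch size works: here, a batch of 5 samples with 3 features each.
batch = np.random.rand(5, 3).astype("float32")
probs = model(batch).numpy()
print(probs.shape)        # (5, 10): one 10-way probability vector per sample
print(probs.sum(axis=1))  # each row sums to ~1 because of the softmax
```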