Building Deep Autoencoders with Keras and TensorFlow (2023)

In this tutorial, we will explore how to build and train deep autoencoders using Keras and TensorFlow.

The primary reason I decided to write this tutorial is that most of the tutorials out there, including the official Keras and TensorFlow ones, use the MNIST dataset for training. I have been asked numerous times to show how to train autoencoders on our own images, potentially a large number of them.

I will try to keep this tutorial brief and will not get into the details of how autoencoders work. A basic knowledge of autoencoders is therefore a prerequisite for understanding the code presented here (and, needless to say, you must know how to program in Python and be familiar with Keras and TensorFlow).

Figure 1: Autoencoder architecture (encoder and decoder).

Autoencoders are unsupervised neural networks that learn to reconstruct their input. Denoising images is one common application of autoencoders, and it is particularly useful for OCR. Autoencoders are also used for image compression.

As shown in Figure 1, an autoencoder consists of:

  1. Encoder: The encoder takes an image as input and generates an output of much smaller dimension than the original image. The output of the encoder is also called the latent representation of the input image.
  2. Decoder: The decoder takes the output from the encoder (aka the latent representation of the input image) and reconstructs the input image.

Both the encoder and the decoder are convolutional neural networks, with the difference that the encoder's dimensions shrink with each layer while the decoder's dimensions grow with each layer, until the output layer, where the dimensions match those of the original image.

We will use our own images for training and testing the autoencoders. For the purpose of this tutorial, we will use a dataset of scanned images of restaurant receipts. The dataset is freely available from https://expressexpense.com/large-receipt-image-dataset-SRD.zip under the MIT License.

Although this dataset does not have a large number of images, we will write code that will work for both small and large datasets.

The code below is divided into 4 parts.

  1. Data preparation: Images will be read from a directory and fed as inputs to the encoder block.
  2. Neural network configuration: We will write a function that takes certain parameters and returns the encoder, decoder, and autoencoder convolutional neural networks.
  3. Training the neural networks: The code that triggers the training, monitors the progress and saves the trained models.
  4. Prediction: The code block that uses the trained models and predicts the output.

I will use Google Colaboratory (https://colab.research.google.com/) to execute the code, but you can use your favorite IDE to write and run it. The code below works on both CPUs and GPUs; I will use a GPU-based machine to speed up training. Google Colab offers a free GPU-based virtual machine for education and learning.

If you use a Jupyter notebook, the steps below will look very similar.

First, we create a notebook project, for example "AE Demo".

Before we start the actual code, let’s import all dependencies that we need for our project. Here is a list of imports that we will need.

# Import the necessary packages
import tensorflow as tf
import numpy as np
from google.colab.patches import cv2_imshow
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import Conv2DTranspose
from tensorflow.keras.layers import LeakyReLU
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Reshape
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K
from tensorflow.keras.optimizers import Adam


Listing 1.1: Import the necessary packages.

Our receipt images are in a directory. We will use the ImageDataGenerator class, provided by the Keras API, to create training and validation iterators as shown in Listing 1.2 below.

training_img_dir = "inputs"
height = 1000
width = 500
channel = 1
batch_size = 8

datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    validation_split=0.2, rescale=1. / 255.)

# Iterator over the training subset (80% of the images).
train_it = datagen.flow_from_directory(
    training_img_dir,
    target_size=(height, width),
    color_mode="grayscale",
    class_mode="input",
    batch_size=batch_size,
    subset="training")

# Iterator over the validation subset (the remaining 20%).
val_it = datagen.flow_from_directory(
    training_img_dir,
    target_size=(height, width),
    color_mode="grayscale",
    class_mode="input",
    batch_size=batch_size,
    subset="validation")

Listing 1.2: Image input preparation. Load images in batches from a directory.

Important notes about Listing 1.2:

  1. training_img_dir = "inputs" is the parent directory that contains the receipt images. In other words, the receipts are in a subdirectory under the "inputs" directory.
  2. color_mode="grayscale" is important if you want to convert your input images to grayscale.

All other parameters are self-explanatory.
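As a quick sanity check (a minimal sketch, assuming the inputs directory holds at least batch_size images), you can pull one batch from the iterator and confirm that class_mode="input" yields (images, images) pairs, i.e., the inputs also serve as the reconstruction targets:

batch_x, batch_y = next(train_it)
print(batch_x.shape)                   # (8, 1000, 500, 1): batch, height, width, channel
print(np.allclose(batch_x, batch_y))   # True: the targets are the inputs themselves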

As shown in Listing 1.3 below, we have created an AutoencoderBuilder class that provides a function build_ae(). This function takes the following arguments:

  • height of the input images,
  • width of the input images,
  • depth (the number of channels) of the input images,
  • filters, a tuple with the default (32, 64),
  • latentDim, the dimension of the latent vector.

class AutoencoderBuilder:
    @staticmethod
    def build_ae(height, width, depth, filters=(32, 64), latentDim=16):
        # Initialize the input shape and the channel dimension.
        inputShape = (height, width, depth)
        chanDim = -1

        # Define the input to the encoder.
        inputs = Input(shape=inputShape)
        x = inputs

        # Loop over the filters.
        for f in filters:
            # Conv block: strided convolution, LeakyReLU, BatchNormalization.
            x = Conv2D(f, (3, 3), strides=2, padding="same")(x)
            x = LeakyReLU(alpha=0.2)(x)
            x = BatchNormalization(axis=chanDim)(x)

        # Flatten the network and then construct the latent vector.
        volumeSize = K.int_shape(x)
        x = Flatten()(x)
        latent = Dense(latentDim)(x)

        # Build the encoder model.
        encoder = Model(inputs, latent, name="encoder")

        # Build the decoder model, which takes the output from the
        # encoder (the latent vector) as its input.
        latentInputs = Input(shape=(latentDim,))
        x = Dense(np.prod(volumeSize[1:]))(latentInputs)
        x = Reshape((volumeSize[1], volumeSize[2], volumeSize[3]))(x)

        # Loop over the filters again, but in reverse order.
        for f in filters[::-1]:
            # Deconv block: transposed convolution, LeakyReLU, BatchNormalization.
            x = Conv2DTranspose(f, (3, 3), strides=2, padding="same")(x)
            x = LeakyReLU(alpha=0.2)(x)
            x = BatchNormalization(axis=chanDim)(x)

        # Recover the original depth of the image with a single
        # transposed convolution, followed by a sigmoid activation.
        x = Conv2DTranspose(depth, (3, 3), padding="same")(x)
        outputs = Activation("sigmoid")(x)

        # Build the decoder model.
        decoder = Model(latentInputs, outputs, name="decoder")

        # Finally, the autoencoder is the encoder followed by the decoder.
        autoencoder = Model(inputs, decoder(encoder(inputs)),
                            name="autoencoder")

        # Return a tuple of the encoder, decoder, and autoencoder models.
        return (encoder, decoder, autoencoder)

Listing 1.3: Builder class to create autoencoder networks.
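Note that each of the two stride-2 convolutions halves the height and width, so the input dimensions should be divisible by 4 (here 1000/4 = 250 and 500/4 = 125) for the decoder's transposed convolutions to reconstruct the exact input size. A quick smoke test (a sketch, assuming the class above has been defined) is to build the models and inspect their summaries:

(encoder, decoder, autoencoder) = AutoencoderBuilder.build_ae(1000, 500, 1)
encoder.summary()   # ends with Flatten over (250, 125, 64), then Dense(16)
decoder.summary()   # Dense, Reshape to (250, 125, 64), then upsampling back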

The code in Listing 1.4 below starts the autoencoder training.

# Initialize the number of epochs and the model output directory.
import os

EPOCHS = 300
MODEL_OUT_DIR = "ae_model_dir"
os.makedirs(MODEL_OUT_DIR, exist_ok=True)

# Construct our convolutional autoencoder.
print("[INFO] building autoencoder...")
(encoder, decoder, autoencoder) = AutoencoderBuilder().build_ae(height, width, channel)
opt = Adam(learning_rate=1e-3)
autoencoder.compile(loss="mse", optimizer=opt)

# Train the convolutional autoencoder. The batch size is already set by
# the iterators from Listing 1.2, so it is not passed to fit() here.
history = autoencoder.fit(
    train_it,
    validation_data=val_it,
    epochs=EPOCHS)

autoencoder.save(MODEL_OUT_DIR + "/ae_model.h5")

Listing 1.4: Training autoencoder model.
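Listing 1.4 saves only the final model. A common variation (a sketch, not from the original tutorial; the callback choices and patience value are illustrative) is to checkpoint the best model during training and stop early once the validation loss plateaus:

callbacks = [
    tf.keras.callbacks.ModelCheckpoint(
        MODEL_OUT_DIR + "/ae_model_best.h5",
        monitor="val_loss", save_best_only=True),
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=20, restore_best_weights=True),
]
history = autoencoder.fit(
    train_it, validation_data=val_it, epochs=EPOCHS, callbacks=callbacks)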

The code in Listing 1.5 shows how to plot the training and validation loss per epoch. Figure 2 shows a sample output of Listing 1.5.

# Import matplotlib and enable inline plotting in the notebook.
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline

# Construct a plot of the training history.
N = np.arange(0, EPOCHS)
plt.style.use("ggplot")
plt.figure()
plt.plot(N, history.history["loss"], label="train_loss")
plt.plot(N, history.history["val_loss"], label="val_loss")
plt.title("Training and Validation Loss")
plt.xlabel("Epoch #")
plt.ylabel("Loss")
plt.legend(loc="lower left")
# plt.savefig("loss_plot.png")  # optionally save the figure to a file
plt.show(block=True)

Listing 1.5: Display a plot of training and validation loss vs. epochs.


Figure 2: Plot of training and validation loss vs. epoch.

Now that we have a trained autoencoder model, we will use it to make predictions. The code in Listing 1.6 shows how to load the model from the directory where it was saved. We then call the predict() function and pass in the validation image iterator that we created earlier. Ideally, we would have a separate image set for prediction and testing.

Here is the code to do the prediction and display.

from google.colab.patches import cv2_imshow

# Use the convolutional autoencoder to make predictions on the
# validation images, then display the predicted images.
print("[INFO] making predictions...")
autoencoder_model = tf.keras.models.load_model(MODEL_OUT_DIR + "/ae_model.h5")
decoded = autoencoder_model.predict(val_it)

examples = 10

# Loop over a few samples and display the predicted images.
for i in range(0, examples):
    predicted = (decoded[i] * 255).astype("uint8")
    cv2_imshow(predicted)

Listing 1.6: Code to predict and display the images

In the code listing above, I used the cv2_imshow utility, which is specific to Google Colab. If you are using Jupyter or any other IDE, you can simply import the cv2 package and display the images with the cv2.imshow() function.
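For example, outside Colab the display loop of Listing 1.6 could look like the following sketch, which opens an OpenCV window per image and waits for a key press:

import cv2

for i in range(examples):
    predicted = (decoded[i] * 255).astype("uint8")
    cv2.imshow("reconstruction", predicted)
    cv2.waitKey(0)  # press any key to advance to the next image
cv2.destroyAllWindows()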

In this tutorial, we built autoencoder models using our own images. We also explored how to save the trained model, load it back, and make predictions, and finally displayed the predicted images.


FAQs

What are autoencoders in Keras?

Autoencoders are a class of unsupervised networks that consist of two major networks: encoders and decoders. An unsupervised network is one that learns patterns from data without any training labels; the network finds patterns in the data without being told what the patterns should be.

What is an autoencoder for anomaly detection using TensorFlow Keras?

Autoencoder is an unsupervised neural network model that uses reconstruction error to detect anomalies or outliers. The reconstruction error is the difference between the reconstructed data and the input data.
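A minimal sketch of this idea (assuming a trained autoencoder as in Listing 1.4; the name images and the threshold choice are illustrative, with images an array of shape (N, height, width, channels) scaled to [0, 1]):

reconstructions = autoencoder.predict(images)
errors = np.mean((images - reconstructions) ** 2, axis=(1, 2, 3))  # per-image MSE
threshold = np.percentile(errors, 95)   # e.g., flag the worst 5%
anomalies = errors > threshold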

When should we not use autoencoders?

An autoencoder could misclassify input errors that are different from those in the training set, or changes in underlying relationships that a human would notice. Another drawback is that you may eliminate vital information from the input data.

Which optimizer is best for autoencoders?

Reported training-loss curves over 50 epochs suggest that autoencoders trained with the Lookahead optimizer (wrapped around Adam) perform better than their Adam-only counterparts.
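If you want to try this, the TensorFlow Addons package provides a Lookahead wrapper (a sketch assuming tensorflow-addons is installed; note that the package is in maintenance mode):

import tensorflow_addons as tfa

base_opt = tf.keras.optimizers.Adam(learning_rate=1e-3)
opt = tfa.optimizers.Lookahead(base_opt, sync_period=6, slow_step_size=0.5)
autoencoder.compile(loss="mse", optimizer=opt)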

What are autoencoders in TensorFlow?

An autoencoder is a special type of neural network that is trained to copy its input to its output. For example, given an image of a handwritten digit, an autoencoder first encodes the image into a lower dimensional latent representation, then decodes the latent representation back to an image.

What is the purpose of using autoencoders in deep learning?

Autoencoders provide a useful way to greatly reduce the noise of input data, making the creation of deep learning models much more efficient. They can be used to detect anomalies, tackle unsupervised learning problems, and eliminate complexity within datasets.

What are the three basic approaches to anomaly detection?

There are three main classes of anomaly detection techniques: unsupervised, semi-supervised, and supervised. Essentially, the correct anomaly detection method depends on the available labels in the dataset.

How can an autoencoder improve accuracy?

By training the autoencoder on the training sample, it will try to minimize the reconstruction error and generate the encoder and decoder network weights. Later the decoder network can be cropped out and feature extraction embeddings can be generated using the encoder network.
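With the models from Listing 1.3, this amounts to using the standalone encoder directly (a minimal sketch):

embeddings = encoder.predict(val_it)   # shape: (num_images, latentDim)
# These embeddings can then feed a downstream classifier or clustering step.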

Which algorithm is best for anomaly detection?

Local outlier factor (LOF)

Local outlier factor is probably the most common technique for anomaly detection. This algorithm is based on the concept of the local density. It compares the local density of an object with that of its neighbouring data points.
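A minimal scikit-learn sketch (X is a hypothetical feature matrix of shape (N, features)):

from sklearn.neighbors import LocalOutlierFactor

lof = LocalOutlierFactor(n_neighbors=20)
labels = lof.fit_predict(X)   # -1 marks outliers, 1 marks inliers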

What is the disadvantage of autoencoders?

Disadvantage: Autoencoders are not as efficient as Generative Adversarial Networks at reconstructing an image. As the complexity of the images increases, autoencoders struggle to keep up and the images start to get blurry.

How many layers should an autoencoder have?

In its simplest form, the autoencoder is a three-layer net, i.e., a neural net with one hidden layer. The input and output are the same, and we learn how to reconstruct the input, for example using the Adam optimizer and the mean squared error loss function.

Can autoencoders overfit?

Autoencoders (AEs) aim to reproduce the output from the input. They may hence tend to overfit towards learning the identity function between the input and output, i.e., they may predict each feature in the output from itself in the input.

What is the bottleneck in an autoencoder?

Bottleneck: It is the lower dimensional hidden layer where the encoding is produced. The bottleneck layer has a lower number of nodes and the number of nodes in the bottleneck layer also gives the dimension of the encoding of the input.

Do autoencoders need a lot of data?

Autoencoders are an unsupervised technique that learns from its own data rather than labels created by humans. This often means that autoencoders need a considerable amount of clean data to generate useful results. They can deliver mixed results if the data set is not large enough, is not clean or is too noisy.

What accuracy can an autoencoder reach in Python?

Reported accuracy depends heavily on the task and dataset; as one example, an autoencoder-based model has been reported to reach 99% accuracy without resampling.

What is a deep autoencoder?

Deep autoencoders: A deep autoencoder is composed of two symmetrical deep-belief networks having four to five shallow layers. One of the networks represents the encoding half of the net and the second network makes up the decoding half.

Can autoencoders be used for unsupervised learning?

An autoencoder neural network is an unsupervised learning algorithm that applies backpropagation, setting the target values to be equal to the inputs, i.e., it uses y(i) = x(i).

How many layers are in a deep autoencoder?

The major parts of the autoencoder network structure are the encoder function and the decoder function, which are used to reconstruct the data. The architecture of a deep autoencoder consists of three types of layers: input, hidden, and output.

Which architecture is used in autoencoders?

An autoencoder is a neural network architecture capable of discovering structure within data in order to develop a compressed representation of the input.

What are the main tasks that autoencoders are used for?

An autoencoder is a type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised learning). The encoding is validated and refined by attempting to regenerate the input from the encoding.

What is anomaly vs. outlier detection?

Outliers are observations that are distant from the mean or location of a distribution. However, they don't necessarily represent abnormal behavior or behavior generated by a different process. On the other hand, anomalies are data patterns that are generated by different processes.

What is the best metric for anomaly detection?

Beyond accuracy, the most commonly used metrics when evaluating anomaly detection solutions are F1, Precision and Recall. One can think about these metrics in the following way: Recall is used to answer the question: What proportion of true anomalies was identified?
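With scikit-learn, these metrics are one call each (a sketch; y_true and y_pred are hypothetical binary label arrays with 1 = anomaly):

from sklearn.metrics import precision_score, recall_score, f1_score

precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)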

How can PCA be used for anomaly detection?

The PCA-Based Anomaly Detection component solves the problem by analyzing available features to determine what constitutes a "normal" class. The component then applies distance metrics to identify cases that represent anomalies. This approach lets you train a model by using existing imbalanced data.
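A minimal sketch of the underlying idea (illustrative, not the exact component described above): fit PCA on "normal" data, then score new samples by their reconstruction error after projecting onto the top components.

import numpy as np
from sklearn.decomposition import PCA

pca = PCA(n_components=10).fit(X_normal)        # X_normal: (N, features)
X_rec = pca.inverse_transform(pca.transform(X_new))
scores = np.mean((X_new - X_rec) ** 2, axis=1)
anomalies = scores > np.percentile(scores, 95)  # threshold is illustrative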

How many epochs does it take to train an autoencoder?

There is no fixed number: the epoch count is a hyperparameter that contributes heavily to the learned parameters and affects the prediction accuracy. Tutorials on convolutional autoencoders commonly train for around 50 epochs. As discussed before, the autoencoder is divided into two parts: an encoder and a decoder.

Why do autoencoders have a bottleneck layer?

A common belief in designing deep autoencoders (AEs), a type of unsupervised neural network, is that a bottleneck is required to prevent learning of the identity function. Learning the identity function renders AEs useless for anomaly detection.

Does an autoencoder reduce dimensionality?

In simple words, autoencoders are a specific type of deep learning architecture used for learning a representation of data, typically for the purpose of dimensionality reduction. This is achieved by designing an architecture that aims at copying the input layer to its output layer.

Which deep learning algorithm is best for prediction?

Here is a list of some of the most popular deep learning algorithms:
  • Convolutional Neural Networks (CNNs)
  • Long Short Term Memory Networks (LSTMs)
  • Recurrent Neural Networks (RNNs)
  • Generative Adversarial Networks (GANs)
  • Radial Basis Function Networks (RBFNs)
  • Multilayer Perceptrons (MLPs)
  • Self Organizing Maps (SOMs)

Which is the best unsupervised machine learning algorithm for anomaly detection?

Artificial neural networks (ANNs) are probably the most popular algorithms for implementing unsupervised anomaly detection. ANNs can be trained on large unlabeled datasets and, given their layered, non-linear learning, can be trusted to find intricate patterns that classify a great variety of anomalies.

Which algorithm is most robust to outliers?

Robust regression algorithms can be used for data with outliers in the input or target values.
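For example, scikit-learn's HuberRegressor uses a loss that down-weights outliers (a minimal sketch with hypothetical training data X, y):

from sklearn.linear_model import HuberRegressor

model = HuberRegressor().fit(X, y)
predictions = model.predict(X)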

What is an autoencoder in a CNN?

In an autoencoder, both the encoder and the decoder are made up of a combination of neural network (NN) layers, which help reduce the size of the input image by recreating it. In the case of a CNN autoencoder, these layers are CNN layers (convolutional, max pooling, flattening, etc.).

What is an autoencoder in Python?

The idea of autoencoders is to let a neural network figure out how to best encode and decode certain data. The uses for autoencoders are really anything you can think of where encoding could be useful; some examples are compressing the number of input features and noise reduction.

How are autoencoders different from CNNs?

Essentially, an autoencoder learns a clustering of the data. In contrast, the term CNN refers to a type of neural network which uses the convolution operator (often the 2D convolution when it is used for image processing tasks) to extract features from the data.

