TensorFlow with Keras – Empowering Neural Networks for Deep Learning

Building deep neural networks just got easier. TensorFlow has announced that it is incorporating the popular deep learning API, Keras, as part of the core code that ships with TensorFlow 1.2. In the words of Keras' author François Chollet, "Theano and TensorFlow are closer to NumPy, while Keras is closer to scikit-learn," which is to say that Keras sits at a higher level than pure TensorFlow and makes building deep learning models much more manageable.

TensorFlow is one of the fastest, most flexible, and most scalable machine-learning libraries available. It was developed internally by Google Brain and released as an open-source library in November 2015. Almost immediately upon its release, TensorFlow became one of the most popular machine learning libraries. But, as is the case with many libraries that emphasize speed and flexibility, TensorFlow tends to be a bit low-level.

Google first began development of TensorFlow (initially called DistBelief) with the aim of making a deep learning platform in which the same code could be used from development to production. The result is that TensorFlow can be run on both CPUs and GPUs, with the latter yielding massive reductions in computation time. This is in contrast to other machine learning platforms such as scikit-learn, which do not have GPU support. Scikit-learn added neural network support in September 2016 with a multi-layer perceptron model (sklearn.neural_network.MLPClassifier), which sits at an even higher level than Keras, but doesn't offer the flexibility or speed of Keras/TensorFlow.
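To make that comparison concrete, here is a minimal sketch of scikit-learn's MLPClassifier in action. The dataset (the small built-in 8x8 digits, a stand-in for MNIST), the hidden layer size, and the iteration count are illustrative choices, not taken from any of the examples discussed in this post:

```python
# Minimal sketch: scikit-learn's high-level neural network API.
# Dataset and hyperparameters here are illustrative, not tuned.
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()  # 8x8 grayscale digit images, a small MNIST stand-in
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# One hidden layer of 64 units; fit/score mirrors the Keras fit/evaluate flow
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy: %.3f" % clf.score(X_test, y_test))
```

Note that there is no way to express convolutional or pooling layers here: MLPClassifier only builds fully connected networks, which is exactly the flexibility trade-off mentioned above.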

Keras was first released in March 2015 by François Chollet as an open-source, high-level neural network API written in Python. Keras interfaces with Theano or TensorFlow, and has grown significantly in popularity, now with over 100k active monthly users. Keras will now have two implementations: one written entirely in TensorFlow and available as tf.keras, and the other a separate codebase that supports both Theano and TensorFlow, and possibly other backends in the future.

Let’s look at how Keras makes building a multilayer convolutional neural network (CNN) significantly easier than pure TensorFlow:

In Keras, to build a multilayer CNN that recognizes the MNIST digits, you add the different layers together as follows (snippet from the Keras example mnist_cnn.py):

    # Import Keras classes
    import keras
    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

    # (batch_size, num_classes, epochs, input_shape, and the x/y train/test
    # arrays are defined earlier in mnist_cnn.py)

    # Build convolutional neural network
    model = Sequential()
    model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                     input_shape=input_shape))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))

    # Compile model
    model.compile(loss=keras.losses.categorical_crossentropy,
                  optimizer=keras.optimizers.Adadelta(),
                  metrics=['accuracy'])

    # Fit and evaluate model
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              verbose=1,
              validation_data=(x_test, y_test))
    score = model.evaluate(x_test, y_test, verbose=0)
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])

As you can see, the Keras API makes it very easy to add or remove a layer with a single line of model.add(). And if you're familiar with scikit-learn, you'll recognize familiar syntax such as model.fit(x_train, y_train), which makes Keras easier to pick up.
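The output layer above uses a softmax activation, and the model is compiled with categorical cross-entropy loss. As a rough illustration of what those two pieces compute, here is a minimal NumPy sketch (the logits and one-hot label are made-up values for a single example):

```python
import numpy as np

def softmax(z):
    # subtract the max for numerical stability; output sums to 1
    e = np.exp(z - z.max())
    return e / e.sum()

def categorical_crossentropy(y_true, y_pred):
    # y_true is one-hot, so this is -log(predicted prob of the true class)
    return -np.sum(y_true * np.log(y_pred))

logits = np.array([2.0, 1.0, 0.1])   # raw scores from the final Dense layer
probs = softmax(logits)              # a probability distribution over classes
y_true = np.array([1.0, 0.0, 0.0])   # true class is index 0
loss = categorical_crossentropy(y_true, probs)
print(probs, loss)
```

The loss shrinks toward zero as the probability assigned to the true class approaches 1, which is what training pushes the network toward.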

In contrast, building a similar multilayer CNN using pure TensorFlow is far more verbose. From the TensorFlow tutorial, we have the following snippet for building a deep convolutional MNIST classifier:

    # Setup (also from the tutorial): load data, start a session,
    # and define the input and label placeholders
    import tensorflow as tf
    from tensorflow.examples.tutorials.mnist import input_data

    mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
    sess = tf.InteractiveSession()
    x = tf.placeholder(tf.float32, shape=[None, 784])
    y_ = tf.placeholder(tf.float32, shape=[None, 10])

    # Create utility functions
    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)

    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)

    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

    def max_pool_2x2(x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                              strides=[1, 2, 2, 1], padding='SAME')

    # First convolutional layer
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    x_image = tf.reshape(x, [-1, 28, 28, 1])
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)

    # Second convolutional layer
    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)

    # Dense layer
    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    # Dropout layer
    keep_prob = tf.placeholder(tf.float32)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    # Readout layer
    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])
    y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

    # Train and evaluate
    cross_entropy = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    sess.run(tf.global_variables_initializer())
    for i in range(20000):
        batch = mnist.train.next_batch(50)
        if i % 100 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                x: batch[0], y_: batch[1], keep_prob: 1.0})
            print('step %d, training accuracy %g' % (i, train_accuracy))
        train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    print('test accuracy %g' % accuracy.eval(feed_dict={
        x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
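One detail worth unpacking from the listing above is the 7 * 7 * 64 in the dense layer: with 'SAME' padding the convolutions preserve the 28x28 spatial size, so only the two 2x2 max-pools shrink it, and the second convolutional layer produces 64 feature maps. A few lines of arithmetic confirm the flattened size:

```python
# Why the dense layer expects 7*7*64 inputs: each 2x2 max-pool halves the
# spatial dimensions ('SAME' padding means the convolutions don't shrink them),
# and the second conv layer outputs 64 feature maps.
def pool_out(size, pool=2):
    return size // pool

h = w = 28                         # MNIST images are 28x28
h, w = pool_out(h), pool_out(w)    # after first max_pool_2x2 -> 14x14
h, w = pool_out(h), pool_out(w)    # after second max_pool_2x2 -> 7x7
flat = h * w * 64                  # flattened input size for the dense layer
print(h, w, flat)                  # prints: 7 7 3136
```

This is the kind of bookkeeping Keras does for you: its Flatten() layer infers the flattened size automatically, whereas in pure TensorFlow you compute it by hand.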

In conclusion, the integration of Keras into core TensorFlow lowers the barrier to building powerful deep learning models while keeping them fast and flexible. Although there are a number of other tools for building neural networks, such as Theano, Caffe, Deeplearning4j, Torch, and more recently scikit-learn, the marriage of Keras with TensorFlow makes it a great place to start.
