Building deep neural networks just got easier. The TensorFlow team has announced that it is incorporating the popular deep learning API, Keras, into the core code that ships with TensorFlow 1.2. In the words of Keras’ author François Chollet, “Theano and TensorFlow are closer to NumPy, while Keras is closer to scikit-learn,” which is to say that Keras operates at a higher level than pure TensorFlow and makes building deep learning models much more manageable.

TensorFlow is one of the fastest, most flexible, and most scalable machine-learning libraries available. It was developed internally by Google Brain and released as an open-source library in November 2015. Almost immediately upon its release, TensorFlow became one of the most popular machine learning libraries. But, as is the case with many libraries that emphasize speed and flexibility, TensorFlow tends to be a bit low-level.

Google built TensorFlow as the successor to its earlier internal system, DistBelief, with the aim of creating a deep learning platform in which the same code could be used from development to production. The result is that TensorFlow can be run on both CPUs and GPUs, with the latter yielding massive reductions in computational time. This is in contrast to other machine learning platforms such as scikit-learn, which does not have GPU support. Scikit-learn added neural network support in September 2016 with a multi-layer perceptron model (sklearn.neural_network.MLPClassifier), which is at an even higher level than Keras, but doesn’t offer the flexibility or speed of Keras/TensorFlow.

Keras was first released in March 2015 by François Chollet as an open-source, high-level neural network API written in Python. Keras interfaces with Theano or TensorFlow, and has grown significantly in popularity, now with over 100k active monthly users. Keras will now have two implementations: one written entirely in TensorFlow and available as tf.keras, and the other a separate codebase that supports both Theano and TensorFlow, and possibly other backends in the future.
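With two implementations available, code that should run against either one needs to pick whichever is installed. The following is only a rough sketch of that idea: the helper name `load_keras` is hypothetical, and the module paths `tensorflow.keras` and `keras` are assumptions that depend on the installed versions (early TensorFlow releases may expose Keras under a different path).

```python
import importlib


def load_keras(candidates=("tensorflow.keras", "keras")):
    """Return (name, module) for the first importable candidate.

    The candidate module paths are assumptions; adjust them to match
    the versions installed in your environment.
    """
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            # This implementation is not installed; try the next one.
            continue
    # Neither implementation could be imported.
    return None, None
```

Calling `load_keras()` returns the name of the implementation that was found along with the imported module, or `(None, None)` if neither is available.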

Let’s look at how Keras makes building a multilayer convolutional neural network (CNN) significantly easier than pure TensorFlow.

In Keras, to build a multilayer CNN that recognizes the MNIST handwritten digits, you stack the layers one after another as follows (snippet from the Keras example mnist_cnn.py):

    # Import Keras classes
    import keras
    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

    # input_shape, num_classes, batch_size, epochs, and the train/test arrays
    # are defined earlier in mnist_cnn.py and are not part of this snippet

    # Build convolutional neural network
    model = Sequential()
    model.add(Conv2D(32, kernel_size=(3, 3),
                     activation='relu',
                     input_shape=input_shape))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))

    # Compile model
    model.compile(loss=keras.losses.categorical_crossentropy,
                  optimizer=keras.optimizers.Adadelta(),
                  metrics=['accuracy'])

    # Fit model
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              verbose=1,
              validation_data=(x_test, y_test))
    score = model.evaluate(x_test, y_test, verbose=0)
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])

    # Done

As you can see, the Keras API makes it very easy to add or remove a layer: each one is a single model.add() call. If you have used scikit-learn, you’ll recognize syntax such as model.fit(x_train, y_train), which makes Keras easier to pick up.
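To see what each model.add() line contributes, the layer stack can be traced by hand. The following is a minimal sketch in plain Python, not Keras code. It assumes values the snippet does not state: 28x28 grayscale MNIST inputs, 10 classes, and Conv2D's default 'valid' padding with stride 1. The helper names are made up for illustration.

```python
# Assumed values (not stated in the snippet): 28x28x1 inputs, 10 classes,
# and Conv2D with its default 'valid' padding and a stride of 1.

def conv_valid(size, kernel):
    """Output width/height of a stride-1 convolution with no padding."""
    return size - kernel + 1


def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weight matrix plus one bias per unit."""
    return inputs * units + units


side, channels = 28, 1
num_classes = 10
summary = []

# Conv2D(32, kernel_size=(3, 3))
summary.append(("conv2d_1", conv_params(3, channels, 32)))
side, channels = conv_valid(side, 3), 32   # 26 x 26 x 32

# Conv2D(64, (3, 3))
summary.append(("conv2d_2", conv_params(3, channels, 64)))
side, channels = conv_valid(side, 3), 64   # 24 x 24 x 64

# MaxPooling2D(pool_size=(2, 2)) halves width and height
side //= 2                                 # 12 x 12 x 64

# Flatten turns the feature maps into one vector; Dropout adds no parameters
flat = side * side * channels

summary.append(("dense_1", dense_params(flat, 128)))
summary.append(("dense_2", dense_params(128, num_classes)))

total_params = sum(count for _, count in summary)
for name, count in summary:
    print(name, count)
print("flattened features:", flat)
print("total trainable parameters:", total_params)
```

Under these assumptions the flattened vector has 9,216 features and the model has about 1.2 million trainable parameters, most of them in the first Dense layer. Keras works all of this out from the layer definitions, which is why none of these sizes appear in the snippet.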

In contrast, building a similar multilayer CNN using pure TensorFlow is far more verbose. From the TensorFlow tutorial, we have the following snippet for building a deep convolutional MNIST classifier:

    # x, y_, sess, and mnist come from earlier steps of the tutorial
    # and are not defined in this snippet
    import tensorflow as tf

    # Create utility functions
    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)

    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)

    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

    def max_pool_2x2(x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                              strides=[1, 2, 2, 1], padding='SAME')

    # First convolutional layer
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    x_image = tf.reshape(x, [-1, 28, 28, 1])
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)

    # Second convolutional layer
    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)

    # Dense layer
    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    # Dropout layer
    keep_prob = tf.placeholder(tf.float32)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    # Readout layer
    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])
    y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

    # Train and evaluate
    cross_entropy = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    sess.run(tf.global_variables_initializer())
    for i in range(20000):
        batch = mnist.train.next_batch(50)
        if i % 100 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                x: batch[0], y_: batch[1], keep_prob: 1.0})
            print("step %d, training accuracy %g" % (i, train_accuracy))
        train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

    print("test accuracy %g" % accuracy.eval(feed_dict={
        x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
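Part of the extra work in pure TensorFlow is bookkeeping: the 7 * 7 * 64 in the dense layer has to be worked out by hand from the layers before it. The sketch below, in plain Python rather than TensorFlow, shows one way to arrive at that number. It relies on the usual rule that padding='SAME' gives an output size of ceil(input size / stride); the helper name is hypothetical.

```python
import math


def same_padding_output(size, stride):
    """With padding='SAME', output size is ceil(input size / stride)."""
    return math.ceil(size / stride)


side = 28  # from tf.reshape(x, [-1, 28, 28, 1])
for layer in ("first", "second"):
    side = same_padding_output(side, 1)  # 5x5 conv, stride 1: size unchanged
    side = same_padding_output(side, 2)  # 2x2 max pool, stride 2: size halved
    print(layer, "pooling output:", side, "x", side)

# The second convolutional layer produces 64 feature maps
flat_features = side * side * 64
print("dense layer inputs:", flat_features)
```

The image shrinks from 28 to 14 to 7 pixels per side, so the dense layer receives 7 * 7 * 64 = 3136 inputs. Changing a kernel stride or removing a pooling layer means recomputing this value and updating W_fc1 and the reshape by hand, whereas the Keras Flatten layer infers it.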

In conclusion, integrating Keras into core TensorFlow lowers the barrier to building powerful deep learning models that are fast and flexible. Although there are a number of other tools for building neural networks, such as Theano, Caffe, Deeplearning4j, Torch, and more recently scikit-learn, the marriage of Keras with TensorFlow makes it a great place to start.
