Ubuntu 17.10 – CUDA 9.0, cuDNN 7.0.4, TensorFlow 1.4, Anaconda3-64

Versions listed are current as of this writing. Adjust accordingly.

Download Anaconda3-5.0.1

cd $HOME/Downloads
wget https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh

Check Hash

md5sum Anaconda3-5.0.1-Linux-x86_64.sh

Output should be:

c989ecc8b648ab8a64731aaee9ed2e7e Anaconda3-5.0.1-Linux-x86_64.sh
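
Comparing digests by eye is error-prone, so the check can be scripted. The following is a minimal sketch; `verify_md5` is a hypothetical helper name, not part of any tool used in this guide.

```shell
# verify_md5 EXPECTED FILE
# Hypothetical helper: succeeds only if FILE's MD5 digest equals EXPECTED.
verify_md5() {
    local expected="$1" file="$2" actual
    [ -r "$file" ] || { echo "missing: $file" >&2; return 2; }
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(md5sum "$file" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}

# Usage (digest taken from the step above):
# verify_md5 c989ecc8b648ab8a64731aaee9ed2e7e Anaconda3-5.0.1-Linux-x86_64.sh
```

Because the function returns non-zero on a mismatch, it can gate the install step with `&&`.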

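Installing Anaconda

Run the installer with bash and without sudo, so that Anaconda lands in your home directory and the later pip install does not need root.

bash Anaconda3-5.0.1-Linux-x86_64.sh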

Download CUDA 9.0 runfile (do not try to install yet)

wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda_9.0.176_384.81_linux-run

Check Hash

md5sum cuda_9.0.176_384.81_linux-run

Output should be:

7a00187b2ce5c5e350e68882f42dd507 cuda_9.0.176_384.81_linux-run

Stop your X-Server

Press Ctrl-Alt-F2 to switch to a text console and log in

Kill the Display Manager

sudo service gdm stop

Go to Runlevel 3

sudo init 3

Disable Nouveau

echo -e 'blacklist nouveau\noptions nouveau modeset=0' | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
sudo update-initramfs -u
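
To see whether the nouveau module is currently loaded, check the kernel's module list. This is a sketch only: `module_listed` is a hypothetical helper, and the blacklist normally takes effect on the next boot, so the module may still be listed until then.

```shell
# module_listed NAME [FILE]
# Succeeds if NAME appears as a loaded module in FILE (default /proc/modules),
# which is the same data that lsmod prints.
module_listed() {
    local name="$1" file="${2:-/proc/modules}"
    # Each line of /proc/modules starts with the module name followed by a space.
    grep -q "^${name} " "$file"
}

# Usage:
# if module_listed nouveau; then echo "nouveau is loaded"; else echo "nouveau is not loaded"; fi
```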

Install CUDA 9.0 from the runfile

Choose Y to the Driver and X-Server questions. Remaining defaults should be fine.

cd ~/Downloads
sudo sh cuda_9.0.176_384.81_linux-run

Add environment variables

echo '/usr/local/cuda-9.0/lib64' | sudo tee /etc/ld.so.conf.d/cudalibs.conf
sudo ldconfig
echo -e '\nPATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}' >> ~/.profile

Reboot and log in to the graphical interface

sudo reboot
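
After the reboot, it is worth confirming that the driver and toolkit are visible before moving on. The sketch below uses `nvcc --version` and `nvidia-smi`, which ship with the toolkit and driver. The wrapper functions `have_tool` and `report_cuda_tools` are hypothetical names used for illustration.

```shell
# have_tool NAME: succeeds if NAME resolves to a command on the current PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report each tool; run its version/status command only when it is present.
report_cuda_tools() {
    if have_tool nvcc; then nvcc --version; else echo "nvcc not found on PATH"; fi
    if have_tool nvidia-smi; then nvidia-smi; else echo "nvidia-smi not found on PATH"; fi
}

# Usage:
# report_cuda_tools
```

If nvcc is reported missing, the PATH line added to ~/.profile has probably not been picked up by the current session yet.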

Download cuDNN 7.0.4 files

You must be logged in to your NVIDIA developer account in your browser to download these:
* https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.0.4/prod/9.0_20171031/cudnn-9.0-linux-x64-v7
* https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.0.4/prod/9.0_20171031/Ubuntu16_04-x64/libcudnn7_7.0.4.31-1+cuda9.0_amd64
* https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.0.4/prod/9.0_20171031/Ubuntu16_04-x64/libcudnn7-dev_7.0.4.31-1+cuda9.0_amd64
* https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.0.4/prod/9.0_20171031/Ubuntu16_04-x64/libcudnn7-doc_7.0.4.31-1+cuda9.0_amd64

Check Each Hash

cd $HOME/Downloads
md5sum cudnn-9.0-linux-x64-v7.tgz && \
md5sum libcudnn7_7.0.4.31-1+cuda9.0_amd64.deb && \
md5sum libcudnn7-dev_7.0.4.31-1+cuda9.0_amd64.deb && \
md5sum libcudnn7-doc_7.0.4.31-1+cuda9.0_amd64.deb

Output should be:

fc8a03ac9380d582e949444c7a18fb8d cudnn-9.0-linux-x64-v7.tgz
e986f9a85fd199ab8934b8e4835496e2 libcudnn7_7.0.4.31-1+cuda9.0_amd64.deb
4bd528115e3dc578ce8fca0d32ab82b8 libcudnn7-dev_7.0.4.31-1+cuda9.0_amd64.deb
04ad839c937362a551eb2170afb88320 libcudnn7-doc_7.0.4.31-1+cuda9.0_amd64.deb
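
With four files to check, md5sum's own check mode (`md5sum -c`) can do the comparison. A sketch, assuming the digests listed above; `write_cudnn_manifest` and the manifest name `cudnn.md5` are made up for this example.

```shell
# Write the four expected digests from the step above into a manifest file.
# md5sum's check format is "<digest>  <filename>" (two spaces), one per line.
write_cudnn_manifest() {
    cat > "$1" <<'EOF'
fc8a03ac9380d582e949444c7a18fb8d  cudnn-9.0-linux-x64-v7.tgz
e986f9a85fd199ab8934b8e4835496e2  libcudnn7_7.0.4.31-1+cuda9.0_amd64.deb
4bd528115e3dc578ce8fca0d32ab82b8  libcudnn7-dev_7.0.4.31-1+cuda9.0_amd64.deb
04ad839c937362a551eb2170afb88320  libcudnn7-doc_7.0.4.31-1+cuda9.0_amd64.deb
EOF
}

# Usage, from the directory holding the downloads:
# write_cudnn_manifest cudnn.md5 && md5sum -c cudnn.md5
```

md5sum reports each file as OK or FAILED and exits non-zero if any entry fails or is missing.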

Install cuDNN 7.0.4 and libraries

tar -xzvf cudnn-9.0-linux-x64-v7.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
sudo dpkg -i libcudnn7_7.0.4.31-1+cuda9.0_amd64.deb
sudo dpkg -i libcudnn7-dev_7.0.4.31-1+cuda9.0_amd64.deb
sudo dpkg -i libcudnn7-doc_7.0.4.31-1+cuda9.0_amd64.deb
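
A quick way to confirm which cuDNN version was copied into place is to read the version macros from the header. This sketch assumes cudnn.h defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` on plain `#define` lines. `cudnn_header_version` is a hypothetical helper name.

```shell
# cudnn_header_version [HEADER]
# Prints MAJOR.MINOR.PATCH read from the CUDNN_* #define lines in cudnn.h.
cudnn_header_version() {
    local header="${1:-/usr/local/cuda/include/cudnn.h}"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$header"
}

# Usage:
# cudnn_header_version
```

For the files installed in this guide, the result should match the 7.0.4 in the download names.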

Verifying cuDNN

Ubuntu 17.10 ships version 7 of the GNU compilers, but CUDA 9.0 does not support gcc versions later than 6. Building with the default compiler fails with:

error -- unsupported GNU version! gcc versions later than 6 are not supported!

Fix – Install Version 6 and create symbolic links in CUDA bin directory:

sudo apt-get install gcc-6 g++-6
sudo ln -sf /usr/bin/gcc-6 /usr/local/cuda/bin/gcc
sudo ln -sf /usr/bin/g++-6 /usr/local/cuda/bin/g++
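
To confirm that the symlink really points at a supported compiler, compare its major version against 6. A sketch: `major_version` and `cuda_gcc_ok` are hypothetical helpers. `gcc -dumpversion` is the standard gcc flag that prints only the version number.

```shell
# major_version VERSION: print the part before the first dot ("7.2.0" gives "7").
major_version() {
    printf '%s\n' "${1%%.*}"
}

# cuda_gcc_ok [COMPILER]
# Succeeds if COMPILER (default: the symlink created above) reports a
# major version of 6 or lower.
cuda_gcc_ok() {
    local cc="${1:-/usr/local/cuda/bin/gcc}" ver
    ver="$("$cc" -dumpversion)" || return 2
    [ "$(major_version "$ver")" -le 6 ]
}

# Usage:
# cuda_gcc_ok && echo "CUDA will use a supported gcc"
```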

Now build mnistCUDNN to test cuDNN

cp -r /usr/src/cudnn_samples_v7/ $HOME
cd $HOME/cudnn_samples_v7/mnistCUDNN
make clean && make
./mnistCUDNN

If cuDNN is properly installed, you will see:

Test passed!

Some additional prerequisites (many should already be installed)

sudo apt-get install g++ freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev
sudo apt-get install python3-numpy python3-dev python3-pip python3-wheel python3-virtualenv libcupti-dev

Install TensorFlow 1.4 from Source

  • As of this writing, building from source is the only way to get TensorFlow working with CUDA 9.0 and cuDNN 7.0.
  • Instructions: https://www.tensorflow.org/install/install_sources
  • Some of the instructions may not make sense. Here is how I did it:

cd $HOME/Downloads
git clone https://github.com/tensorflow/tensorflow
cd tensorflow
git checkout r1.4
./configure

The sample output and options will differ from those shown in the instructions:
* Make sure you configure for CUDA version: 9.0
* Make sure you configure for cuDNN version: 7.0.4
* Make sure you know your compute capability from https://developer.nvidia.com/cuda-gpus
* I set this to 6.1 as I have a GeForce GTX 1070
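
The configure script reads some of its answers from environment variables, so the CUDA-related values above can be preset before running it. Treat this as a sketch under assumptions: the variable names are the ones I understand the r1.4 configure script to check, the paths assume the install locations used in this guide, and any question not covered will still be asked interactively.

```shell
# Assumed values for this guide's setup; adjust paths and versions to match yours.
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=9.0
export CUDA_TOOLKIT_PATH=/usr/local/cuda-9.0
export TF_CUDNN_VERSION=7.0.4
export CUDNN_INSTALL_PATH=/usr/local/cuda-9.0
export TF_CUDA_COMPUTE_CAPABILITIES=6.1   # GeForce GTX 1070; look up your own card

# Then run the configure script from the tensorflow checkout as before:
# ./configure
```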

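Installing Bazel

Bazel must be installed before ./configure will run, so do this step first if you do not have it yet.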

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update && sudo apt-get install oracle-java8-installer
echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
sudo apt-get update && sudo apt-get install bazel
sudo /sbin/ldconfig -v

Building TensorFlow

bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-1.4.0-cp36-cp36m-linux_x86_64.whl
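
The wheel's file name depends on the TensorFlow and Python versions, so the hard-coded name above may not match your build. One way around that, sketched here with a hypothetical `newest_wheel` helper, is to pick up whichever tensorflow wheel was written most recently.

```shell
# newest_wheel DIR: print the most recently modified tensorflow wheel in DIR.
# Prints nothing if no wheel is found.
newest_wheel() {
    # ls -t sorts newest first; head keeps the first match.
    ls -t "$1"/tensorflow-*.whl 2>/dev/null | head -n 1
}

# Usage:
# whl="$(newest_wheel /tmp/tensorflow_pkg)"
# [ -n "$whl" ] && pip install "$whl"
```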

Testing TensorFlow

python

My output looks like this:

Python 3.6.3 |Anaconda, Inc.| (default, Oct 13 2017, 12:02:49)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

Type or Paste the following:

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()

The results vary by hardware:

2017-11-06 19:36:14.235987: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.797
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.29GiB
2017-11-06 19:36:14.236030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)

Type or Paste the following:

print(sess.run(hello))

This results in:

b'Hello, TensorFlow!'
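
The same check can be run non-interactively, which is handy after a rebuild. A minimal sketch: `tf_smoke_test` is a hypothetical wrapper that feeds the statements above to the interpreter on standard input.

```shell
# tf_smoke_test [PYTHON]
# Runs the hello-world session from above without the interactive prompt.
# PYTHON defaults to "python"; the exit status is the interpreter's.
tf_smoke_test() {
    local py="${1:-python}"
    "$py" - <<'EOF'
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()          # TF 1.x API; logs the detected GPU on creation
print(sess.run(hello))
EOF
}

# Usage:
# tf_smoke_test && echo "TensorFlow import and session succeeded"
```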

TensorFlow 1.4 is now running with CUDA 9.0 and cuDNN 7.0.4.