Monday, June 4, 2018

How to Set Up TensorFlow GPU for Accelerated Machine Learning and Cloud Portability

Setting up TensorFlow is as simple as running: pip3 install tensorflow. This setup, however, only uses your CPU to perform calculations.

Enabling GPU support requires installing Nvidia drivers, the CUDA toolkit, and the cuDNN libraries, and potentially spending hours troubleshooting installation dependencies.

But as Andriy Lazorenko demonstrates, a common laptop setup with a low-end GeForce MX945 GPU can provide three times the performance of an Intel Core i7-7500U CPU. If you have access to higher-end GPUs, you can realistically see performance gains of 15x or more.

That's the difference between waiting an hour and waiting under 4 minutes for the same task to complete!

While it's possible to install everything locally on your computer, I recommend using Docker and building a container for each TensorFlow project you create, for these reasons:

1) Cloud Portability Ready

Ultimately, there may come a time when you need more powerful hardware. Working with Docker containers ensures easy portability to a more powerful computer, a cluster of computers, or the cloud: your TensorFlow environment and dependencies are self-contained within the container.

2) Eliminates the tight coupling to hardware and drivers

Having gone through the CUDA toolkit installation myself, I found that the toolkit relies on a specific version of the Nvidia drivers being installed on the host machine, which can lead to countless hours of debugging dependencies. If you choose to run everything in a container instead, the host machine simply needs a recent Nvidia driver rather than a specific version for each CUDA toolkit release.

Windows 10 is a no-go

Although TensorFlow GPU does support Windows 10, there are two important reasons why Windows 10 is a no-go for TensorFlow GPU at the moment:

1) Docker does not have GPU support

GPU support for Docker on Windows 10 is not available. You must choose between performance, using TensorFlow GPU with a Python virtualenv, or better portability, using Docker but without GPU support.

2) The Windows Subsystem for Linux (WSL) does not have GPU support

You can only use the Command Prompt or PowerShell; the Linux bash shell in Windows 10 does not support use of the GPU. Upvote here to encourage Microsoft to change this.

Ubuntu Linux it is!

You will need a driver newer than 387.26. To check the version of the Nvidia driver running on your system, run nvidia-smi.
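A small sketch of that check: nvidia-smi (installed alongside the Nvidia driver) can report just the driver version, which can then be compared against the minimum with sort -V. The fallback value 390.48 below is a hypothetical sample so the snippet also runs on a machine without a GPU:

```shell
# Query the installed Nvidia driver version; fall back to a sample value
# (390.48, hypothetical) on machines where nvidia-smi is unavailable.
version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null || echo "390.48")
required="387.26"

# sort -V orders version strings numerically; if the required version
# sorts first, the installed driver is at least as new as required.
if [ "$(printf '%s\n' "$required" "$version" | sort -V | head -n1)" = "$required" ]; then
    echo "Driver $version is new enough (newer than $required)"
else
    echo "Driver $version is too old; upgrade to newer than $required"
fi
```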

Install Nvidia Drivers Newer Than 387.26

If you need to install new Nvidia drivers simply:
1) Uninstall old Nvidia drivers
2) Ensure you have the graphics-drivers PPA repository
3) Update the APT package lists
4) Install Nvidia Drivers

sudo apt-get remove --purge nvidia-* -y
sudo apt-get remove --purge libnvidia-* -y
sudo add-apt-repository ppa:graphics-drivers -y
sudo apt-get update -y
sudo apt-get install nvidia-390 -y

Install Docker CE

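The Docker documentation describes several installation methods; one minimal sketch uses Docker's official convenience script from get.docker.com (for production machines, the apt repository method described in the Docker docs is generally preferred):

```shell
# Download and run Docker's official convenience install script.
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
```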
Test that it works:
docker run -it hello-world
If you get:
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.30/info: dial unix /var/run/docker.sock: connect: permission denied
Fix it by adding your current user to the docker group:
sudo usermod -a -G docker $USER
Then log out and back in for the group change to take effect.

Install Nvidia Docker 2.0

1) Add the nvidia-docker repository
2) Update the local repository metadata
3) Install nvidia-docker2
4) Restart the Docker daemon

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update

sudo apt-get install nvidia-docker2
sudo pkill -SIGHUP dockerd
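To verify that nvidia-docker2 is working, the nvidia-docker documentation suggests running nvidia-smi from inside a CUDA container (this requires the Nvidia driver installed earlier):

```shell
# Launch a throwaway CUDA container and run nvidia-smi inside it.
# --runtime=nvidia routes the container through nvidia-docker2;
# --rm removes the container when the command exits.
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
```

If everything is configured correctly, this prints the same driver and GPU table that nvidia-smi prints on the host.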

Create a Docker Container

Everything is set up! It's time to create a Docker container that can utilize your GPU.

TensorFlow has a public repository of Docker images on Docker Hub: tensorflow/tensorflow.

The TensorFlow Docker images are built on top of Nvidia's CUDA Docker images, published as nvidia/cuda.

For example, the stable release tensorflow/tensorflow:1.8.0-devel-gpu-py3 is based on Nvidia's nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04.

To download, create, and access the console of the container:
docker run --name tensorcontainer --runtime=nvidia -it tensorflow/tensorflow:1.8.0-devel-gpu-py3
To leave the container, type exit.
To connect back to the container:
docker start -ai tensorcontainer

Testing TensorFlow in your Container

At this point, you will be logged into the console of a Docker container named tensorcontainer that uses your Nvidia GPU, running Ubuntu 16.04 with the full CUDA toolkit and cuDNN!

Create a test Python script, for example test.py, with the following contents:

from tensorflow.python.client import device_lib
device_lib.list_local_devices()

Run the script:
python3 test.py
The output should be similar to this:
Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
Found device 0 with properties:
name: GeForce GTX 1050 major: 6 minor: 1 memoryClockRate(GHz): 1.493
pciBusID: 0000:01:00.0
totalMemory: 1.96GiB freeMemory: 1.84GiB
Adding visible gpu devices: 0
Device interconnect StreamExecutor with strength 1 edge matrix:
 0:   N
Created TensorFlow device (/device:GPU:0 with 1600 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
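To go one step beyond listing devices, a small sketch (assuming the TensorFlow 1.8 API from the image above) can pin a computation to the GPU and ask the session to log where each op actually runs:

```python
# Minimal TensorFlow 1.x check: place a matrix multiply on the GPU
# and log the device assignment of every op.
import tensorflow as tf

with tf.device('/device:GPU:0'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]], name='a')
    b = tf.constant([[1.0, 1.0], [0.0, 1.0]], name='b')
    c = tf.matmul(a, b, name='matmul')

# log_device_placement=True prints which device each op was assigned to.
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(c))  # [[1. 3.] [3. 7.]]
```

If the GPU is visible, the placement log shows matmul assigned to /device:GPU:0 rather than the CPU.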

There you have it! You now have a computer that can use an Nvidia GPU to decrease the time required to train your neural networks, and because your whole TensorFlow environment and code are contained within a Docker container, you can easily migrate to a more powerful computer, a cluster of computers, or cloud services such as Amazon ECS, Azure Kubernetes Service, or Google Kubernetes Engine.
