Machine Learning can be computationally demanding. Training Deep Neural Networks on large data sets benefits greatly from GPU-accelerated frameworks like Caffe or TensorFlow. In fact, the availability of these GPU-accelerated frameworks has been a driving factor in the resurgence of interest in, and development of, methods using artificial neural networks. A new video card is always worth testing, especially when it offers the incredible performance and value of NVIDIA’s new GTX 1080Ti.
I loaded up a Docker image for nvidia/digits and put my test-bench version of our DIGITS/Caffe system under (heavy) load doing a large image classification job to see how the GTX 1080Ti and Titan X Pascal compare. I also compiled and ran my favorite simple GPU compute test job — the CUDA nbody program.
The GTX 1080Ti performed essentially the same as the Titan X on both the simple nbody job and the large 20-hour classification job run with Caffe on a roughly one-million-image training set!
Results
I’ll give you the results first and then explain what was done after that (since I know you really just want to see how fast the GTX 1080Ti is 🙂).
The following table lists model training times for 30 epochs using single GTX 1080Ti and Titan X Pascal cards.
GoogLeNet model training with Caffe on 1.3 million image dataset for 30 epochs using GTX 1080Ti and Titan X video cards
| GPU | Model training runtime |
|---|---|
| (1) GTX 1080Ti | 19hr 43min |
| (1) Titan X | 20hr 7min |
- Notes:
- Variation in job run time is normal, so I consider these two results to be effectively identical
- Job runs were done with an image batch size of 64
- GPU memory usage was approx. 8GB
The next table shows the results of `nbody -benchmark -numbodies=256000` on the GTX 1080Ti and Titan X Pascal.
GTX 1080Ti and Titan X nbody Benchmark
| GPU | nbody benchmark (GFLOP/s) |
|---|---|
| (1) GTX 1080Ti | 7514 |
| (1) Titan X | 7524 |
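For anyone who wants to reproduce the nbody numbers: the program is one of the standard CUDA samples. Here is a sketch of how I build and run it, assuming a default CUDA 8.0 toolkit install (adjust paths to your own setup).

```bash
# Sketch: build and run the CUDA nbody sample (assumes a default CUDA 8.0 toolkit install)
cuda-install-samples-8.0.sh ~          # copies the samples to ~/NVIDIA_CUDA-8.0_Samples
cd ~/NVIDIA_CUDA-8.0_Samples/5_Simulations/nbody
make

# Single-precision benchmark run used for the numbers above
./nbody -benchmark -numbodies=256000
```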
Details
Test System
I did the testing on my test-bench layout of our Peak Single (“DIGITS” GPU Workstation), the system we recommend for DIGITS/Caffe.
- The Peak Single (“DIGITS” GPU Workstation)
- CPU: Intel Core i7 6850K 6-core @ 3.6GHz (3.7GHz All-Core-Turbo)
- Memory: 128 GB DDR4 2133MHz Reg ECC
- PCIe: (4) X16-X16 v3
- Motherboard: ASUS X99-E-10G WS
- GPUs:
- NVIDIA GTX 1080Ti
- NVIDIA Titan X Pascal
Note: We normally spec Xeon processors in our Peak line of systems, but I had a 6850K on my test-bench so I used it.
I forgot to run `nvidia-smi -a` to pull all of the “real” specs off of the 1080Ti while I had the card. I’ll add those in a comment the next time I have access to the card.
Caveat:
Heavy compute on GeForce cards can shorten their lifetime! I believe it is perfectly fine to use these cards but keep in mind that you may fry one now and then!
Software
The OS I used for this testing was an Ubuntu 16.04.2 install with the Docker and NVIDIA-Docker Workstation configuration I’ve been working on. See these posts for more information:
- Docker and NVIDIA-docker on your workstation: Motivation
- Docker and NVIDIA-docker on your workstation: Installation
- Docker and NVIDIA-docker on your workstation: Setup User Namespaces
Following is a list of the software in the nvidia/digits Docker image used in the testing.
- Ubuntu 14.04
- CUDA 8.0.61
- DIGITS 5.0.0
- caffe-nv (0.15.13-3ubuntu14.04+cuda8.0), cuDNN 5
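For reference, starting that image is just a pull and a run. This is a sketch rather than my exact command line; the data path below is illustrative, and the port mapping assumes the DIGITS default of port 5000 inside the container.

```bash
# Pull the DIGITS image and start it with NVIDIA-Docker 1.x
docker pull nvidia/digits

# Data path is illustrative; the container serves the DIGITS web UI on port 5000
nvidia-docker run --name digits -d \
    -p 5000:5000 \
    -v /home/dbk/imagenet:/data \
    nvidia/digits
```

With that mapping, the DIGITS web UI is reachable in a browser at http://localhost:5000.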
The host environment was:
- Ubuntu 16.04
- Docker version 17.03.0-ce
- NVIDIA-Docker version 1.0.1
- NVIDIA display driver 375.39
Test job image dataset
I used the training image set from the IMAGENET Large Scale Visual Recognition Challenge 2012 (ILSVRC2012). I only used the training set images from the “challenge”, all 138GB of them! I used the tools in DIGITS to partition this set into a training set and a validation set, roughly a 75/25 split (sketched below), and then trained the 22-layer GoogLeNet network.
- Training set — 960893 images
- Validation set — 320274 images
- Model — GoogLeNet
- Duration — 30 Epochs
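DIGITS handles that training/validation split for you through its web UI when it builds the dataset database, so there is nothing to script there. Purely as an illustration of what a roughly 75/25 split of an image folder looks like, a manual version might be something like this (paths are hypothetical):

```bash
# Illustrative only: move every 4th image (~25%) into a validation folder,
# keeping the per-class subdirectories. DIGITS does this split for you when it
# builds the dataset, so this is not part of the actual workflow.
SRC=~/imagenet/train      # hypothetical location of the ILSVRC2012 training images
DEST=~/imagenet/val

find "$SRC" -name '*.JPEG' | sort | awk 'NR % 4 == 0' | while read -r img; do
    class_dir="$DEST/$(basename "$(dirname "$img")")"   # keep the class subfolder name
    mkdir -p "$class_dir"
    mv "$img" "$class_dir/"
done
```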
Many of the images in the IMAGENET collection are copyrighted, which means that usage and distribution are somewhat restricted. One of the conditions listed for download is this:
“You will NOT distribute the above URL(s)”
So, I won’t. Please see the IMAGENET site for information on obtaining the datasets.
Citation
- Olga Russakovsky*, Jia Deng*, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg and Li Fei-Fei. (* = equal contribution) ImageNet Large Scale Visual Recognition Challenge. arXiv:1409.0575, 2014.
Conclusions
Yes! The NVIDIA GTX 1080Ti is a great card for GPU-accelerated machine learning workloads! The only thing I don’t know is how well it will hold up under sustained heavy load. Only time will tell for that. In any case, at this point, I would highly recommend this video card for compute.
Happy computing –dbk