S-Caffe: Co-designing MPI Runtimes and Caffe for Scalable Deep Learning on Modern GPU Clusters
A. Awan, K. Hamidouche, J. Hashmi, D. Panda
22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming,
Feb 2017.