Offloaded GPU Collectives using CORE-Direct and CUDA Capabilities on IB Clusters 
A. Venkatesh, K. Hamidouche, H. Subramoni, D. Panda
22nd IEEE International Conference on High Performance Computing,
Dec 2015.