Multi-Threaded UPC Runtime for GPU to GPU Communication over InfiniBand
M. Luo, H. Wang, D. Panda
International Conference on Partitioned Global Address Space Programming Models (PGAS '12),
Oct 2012.