Designing Coalescing Network-on-Chip for Efficient Memory Accesses of GPGPUs

Document Type

Conference Proceeding

Publication Date


Publication Title

Network and Parallel Computing


The massively multithreaded architecture of General-Purpose Graphics Processing Units (GPGPUs) makes them ideal for data-parallel computing. However, designing efficient GPGPU chips poses many challenges. One major hurdle is the interface to the external DRAM, particularly the buffers in the memory controllers (MCs), which are stressed heavily by the many concurrent memory accesses from the GPGPU. Previous approaches scheduled the memory requests in the memory controller buffers to reduce switching between memory rows. The problem is that the window of requests that can be considered for scheduling is too narrow, and the memory controller becomes very complex, affecting the critical path. Because the massively multithreaded architecture of GPGPUs can hide memory access latencies, this paper explores the novel idea of rearranging the memory requests in the network-on-chip (NoC), called packet coalescing. To study the feasibility of this idea, we designed an expanded NoC router that supports packet coalescing and evaluated its performance extensively. Evaluation results show that this NoC-assisted design strategy can improve the row buffer hit rate in the memory controllers. A comprehensive investigation of the factors affecting coalescing performance is also conducted and reported.
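To make the row buffer argument concrete, the following is a minimal sketch, not the router design from the paper. It assumes a simplified open-row DRAM model in which a request hits when it targets the row currently open in its bank, and it models coalescing as grouping requests to the same (bank, row) within a fixed window. The names `row_buffer_hits`, `coalesce`, and `window` are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


def row_buffer_hits(requests):
    """Count row buffer hits for a sequence of (bank, row) requests.

    Assumed model: each bank keeps its most recently accessed row open,
    and a request hits if it targets that open row.
    """
    open_row = {}
    hits = 0
    for bank, row in requests:
        if open_row.get(bank) == row:
            hits += 1
        open_row[bank] = row
    return hits


def coalesce(requests, window):
    """Reorder requests so that same-(bank, row) requests become adjacent.

    Hypothetical stand-in for packet coalescing: only requests inside the
    same window of `window` consecutive requests can be grouped, and groups
    keep the order in which their first request arrived.
    """
    reordered = []
    for start in range(0, len(requests), window):
        groups = OrderedDict()
        for req in requests[start:start + window]:
            groups.setdefault(req, []).append(req)
        for group in groups.values():
            reordered.extend(group)
    return reordered


# Two rows of the same bank accessed alternately: every access switches rows.
stream = [(0, 1), (0, 2), (0, 1), (0, 2), (0, 1), (0, 2)]

baseline_hits = row_buffer_hits(stream)
narrow_hits = row_buffer_hits(coalesce(stream, window=2))
wide_hits = row_buffer_hits(coalesce(stream, window=6))
```

In this toy stream a window of two requests cannot group anything, while a window covering all six requests places the accesses to each row back to back. This mirrors the abstract's point that a narrow scheduling window limits how much row switching can be avoided.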


11th IFIP WG 10.3 International Conference, NPC 2014, Ilan, Taiwan, September 18-20, 2014. Proceedings

Original Citation

C. Chen, Y. S. Huang, Y. Chang, C. Tu, C. King, T. Wang, J. Sang and M. Li, "Designing coalescing network-on-chip for efficient memory accesses of GPGPUs," in Network and Parallel Computing: 11th IFIP WG 10.3 International Conference, NPC 2014, Ilan, Taiwan, September 18-20, 2014. Proceedings, C. Hsu, X. Shi and V. Salapura, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2014, pp. 169-180.