An OpenMP Compiler for Efficient Use of Distributed Scratchpad Memory in MPSoCs

Marongiu A., Benini L.
(Accepted for publication) IEEE Transactions on Computers, 2010.
Most of today’s state-of-the-art processors for mobile and embedded systems feature on-chip scratchpad memories. Efficiently exploiting these low-latency, high-bandwidth memory modules requires programming models and/or language features that expose such architectural details. At the same time, making effective use of the limited on-chip memory space requires the programmer to devise an efficient partitioning and distributed placement of shared data at the application level. In this paper we propose a programming framework that combines the ease of use of OpenMP with simple yet powerful language extensions to trigger array data partitioning. Our compiler exploits profiled array access counts to automatically generate data allocation schemes optimized for locality of reference.
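
The idea of turning profiled access counts into a locality-oriented placement can be illustrated with a minimal sketch in C. It is not the paper's compiler or its language extensions: the function names (place_tiles, local_accesses), the flat layout of the counts table, and the simple "assign each tile to the core that accesses it most" policy are all assumptions made for illustration, and scratchpad capacity limits are ignored.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: counts[t * cores + c] holds the profiled number of
 * accesses made by core c to array tile t. Each tile is assigned to the
 * scratchpad of the core that accesses it most often. Ties go to the
 * lowest-numbered core. Scratchpad capacity is NOT modelled here.
 */
void place_tiles(const unsigned *counts, size_t tiles, size_t cores,
                 size_t *owner)
{
    assert(cores > 0);
    for (size_t t = 0; t < tiles; ++t) {
        size_t best = 0;
        for (size_t c = 1; c < cores; ++c) {
            if (counts[t * cores + c] > counts[t * cores + best])
                best = c;
        }
        owner[t] = best;
    }
}

/* Total number of profiled accesses that are local under a placement. */
unsigned long local_accesses(const unsigned *counts, size_t tiles,
                             size_t cores, const size_t *owner)
{
    unsigned long total = 0;
    for (size_t t = 0; t < tiles; ++t)
        total += counts[t * cores + owner[t]];
    return total;
}
```

Under these assumptions, a caller would fill the counts table from a profiling run, call place_tiles once, and use local_accesses to compare the resulting placement against alternatives. A realistic allocator would also have to respect the size of each scratchpad.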