Wide-issue processors that issue tens of instructions per cycle put heavy stress on the memory system, including the data caches. For such architectures, the data cache must be heavily multi-ported, with extremely wide data paths. This paper studies a scalable way to achieve multi-porting with short data paths and lower hardware complexity at higher clock rates. Our approach uses a prediction mechanism to divide the memory reference stream into multiple independent sub-streams before the instructions enter the reservation stations. The partitioned memory-reference instructions are then fed into separate memory pipelines, each connected to a small data cache called an access region cache. In the ideal case, separating independent memory references allows the use of multiple caches, each with fewer ports, and thus increases data bandwidth. We describe and evaluate a wide-issue processor with distinct memory pipelines driven by a prediction mechanism. We measure the potential performance of the proposed design by comparing it with existing multi-porting solutions and with an ideal multi-ported data cache.
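The steering step described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the three region names, the address boundaries in `actual_region`, the PC-indexed last-value table in `RegionPredictor`, and the `steer` helper are all hypothetical. The sketch only shows the general idea of predicting a region for each memory reference before its address is known and routing it to a per-region queue, which stands in for a memory pipeline with its own small cache.

```python
# Hypothetical region names; the abstract does not define the regions.
REGIONS = ("stack", "heap", "global")


def actual_region(addr: int) -> str:
    """Classify an address using illustrative, made-up boundaries."""
    if addr >= 0x7000_0000:
        return "stack"
    if addr >= 0x1000_0000:
        return "heap"
    return "global"


class RegionPredictor:
    """Assumed design: a last-value predictor indexed by instruction PC."""

    def __init__(self, default: str = "global"):
        self.table = {}
        self.default = default

    def predict(self, pc: int) -> str:
        # Unseen PCs fall back to the default region.
        return self.table.get(pc, self.default)

    def update(self, pc: int, region: str) -> None:
        # Remember the region this instruction actually touched.
        self.table[pc] = region


def steer(trace, predictor):
    """Route each (pc, addr) reference to the queue of its predicted region.

    Returns the per-region queues and the number of mispredictions.
    A real design would also need a recovery path for misrouted references;
    this sketch only counts them.
    """
    queues = {region: [] for region in REGIONS}
    mispredicts = 0
    for pc, addr in trace:
        guess = predictor.predict(pc)
        truth = actual_region(addr)
        if guess != truth:
            mispredicts += 1
        queues[guess].append((pc, addr))
        predictor.update(pc, truth)
    return queues, mispredicts


# Small synthetic trace: two instructions seen twice, one seen once.
trace = [
    (0x400, 0x7FFF_0000),
    (0x404, 0x1000_2000),
    (0x400, 0x7FFF_0008),
    (0x404, 0x1000_2008),
    (0x408, 0x0000_0100),
]
queues, mispredicts = steer(trace, RegionPredictor())
print(mispredicts, {region: len(q) for region, q in queues.items()})
```

In this toy trace the first encounter of each of the first two instructions is misrouted, and later encounters are steered correctly once the table has been trained. Whether real workloads behave this way is exactly the kind of question the evaluation in the paper addresses.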
ASJC Scopus subject areas
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence