- Title
- ANALYZING INSTRUCTION BASED CACHE REPLACEMENT POLICIES.
- Creator
-
Xiang, Ping, Zhou, Huiyang, University of Central Florida
- Abstract / Description
-
The increasing speed gap between microprocessors and off-chip DRAM makes last-level caches (LLCs) a critical component for computer performance. Multi-core processors aggravate the problem, since multiple processor cores compete for the LLC. As a result, LLCs typically consume a significant amount of die area, and effective utilization of LLCs is mandatory for both performance and power efficiency. We present a novel replacement policy for LLCs. The fundamental observation is to view LLCs as a shared resource among multiple address streams, with each stream being generated by a static memory access instruction. The management of LLCs in both single-core and multi-core processors can then be modeled as a competition among multiple instructions. In our proposed scheme, we prioritize those instructions based on their numbers of LLC accesses and reuses, and only allow cache lines with high instruction priorities to replace those with low priorities. The hardware support for our proposed replacement policy is lightweight. Our experimental results, based on a set of SPEC 2006 benchmarks, show that it achieves significant performance improvement over the least-recently-used (LRU) replacement policy for benchmarks with high numbers of LLC misses. To handle LRU-friendly workloads, the set sampling technique is adopted to retain the benefits of the LRU replacement policy.
- Date Issued
- 2010
- Identifier
- CFE0003377, ucf:48481
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0003377
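Note: the replacement idea in the abstract above (rank static memory instructions by their LLC accesses and reuses, and let a line replace another only if its instruction has the higher priority) can be illustrated with a toy model. The following is a minimal sketch, not the thesis's design: the class name, the reuse-to-access ratio used as the priority, the single-set cache, and the bypass-on-lower-priority rule are all assumptions made for illustration.

```python
from collections import defaultdict


class InstructionPriorityCache:
    """Toy single-set cache. The priority formula and bypass rule are assumptions."""

    def __init__(self, ways):
        self.ways = ways
        self.lines = {}                    # block address -> PC of the instruction that filled it
        self.accesses = defaultdict(int)   # per-PC count of accesses to this cache
        self.reuses = defaultdict(int)     # per-PC count of hits on lines that PC filled

    def priority(self, pc):
        # Hypothetical metric: reuses per access for this instruction.
        n = self.accesses[pc]
        return self.reuses[pc] / n if n else 0.0

    def access(self, pc, addr):
        self.accesses[pc] += 1
        if addr in self.lines:
            # Credit the reuse to the instruction that brought the line in.
            self.reuses[self.lines[addr]] += 1
            return "hit"
        if len(self.lines) < self.ways:
            self.lines[addr] = pc
            return "fill"
        # Candidate victim: the resident line whose filling instruction ranks lowest.
        victim = min(self.lines, key=lambda a: self.priority(self.lines[a]))
        if self.priority(pc) >= self.priority(self.lines[victim]):
            del self.lines[victim]
            self.lines[addr] = pc
            return "replace"
        # The incoming instruction ranks lower than every resident line: do not insert.
        return "bypass"
```

In this sketch an instruction that streams through memory without reuse keeps a priority near zero, so its lines cannot displace lines filled by instructions with observed reuse. The set sampling fallback to LRU mentioned in the abstract is not modeled.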
- Title
- Energy-Aware Data Movement In Non-Volatile Memory Hierarchies.
- Creator
-
Khoshavi Najafabadi, Navid, DeMara, Ronald, Yuan, Jiann-Shiun, Song, Zixia, University of Central Florida
- Abstract / Description
-
While technology scaling enables increased density for memory cells, the intrinsic high leakage power of conventional CMOS technology and the demand for reduced energy consumption inspire the use of emerging technology alternatives such as eDRAM and Non-Volatile Memory (NVM), including STT-MRAM, PCM, and RRAM. The utilization of emerging technology in Last Level Cache (LLC) designs, which occupy a significant fraction of total die area in Chip Multi Processors (CMPs), introduces new dimensions of vulnerability, energy consumption, and performance delivery. To be specific, a part of this research focuses on the eDRAM Bit Upset Vulnerability Factor (BUVF) to assess the vulnerable portion of the eDRAM refresh cycle, where the critical charge varies depending on the write voltage, storage, and bit-line capacitance. This dissertation broadens the study of LLC vulnerability assessment by investigating the impact of Process Variations (PV) on narrow resistive sensing margins in high-density NVM arrays, including on-chip cache and primary memory. Large-latency and power-hungry Sense Amplifiers (SAs) have been adapted to combat PV in the past. Herein, a novel approach is proposed to leverage the PV in NVM arrays using a Self-Organized Sub-bank (SOS) design. SOS engages the preferred SA alternative based on the intrinsic as-built behavior of the resistive sensing timing margin to reduce latency and power consumption while maintaining acceptable access time.
On the other hand, this dissertation investigates a novel technique to prioritize the service to 1) Extensive Read Reused Accessed (ERRA) blocks of the LLC that are silently dropped from higher levels of cache, and 2) the portion of the working set that may exhibit a distant re-reference interval in L2. In particular, we develop a lightweight Multi-level Access History Profiler to efficiently identify ERRA blocks by aggregating the LLC block addresses tagged with identical Most Significant Bits into a single entry. Experimental results indicate that the proposed technique can reduce the L2 read miss ratio by 51.7% on average across PARSEC and SPEC2006 workloads.
In addition, this dissertation will broaden and apply advancements in theories of subspace recovery to pioneer computationally-aware in-situ operand reconstruction via the novel Logic In Interconnect (LI2) scheme. LI2 will be developed, validated, and refined both theoretically and experimentally to realize a radically different approach to post-Moore's Law computing by leveraging features of low-rank matrices, offering data reconstruction instead of fetching data from main memory to reduce the energy/latency cost per data movement. We propose an LI2 enhancement to attain high performance delivery in the post-Moore's Law era by equipping the contemporary micro-architecture design with a customized memory controller, which orchestrates the memory requests for fetching low-rank matrices to a customized Fine Grain Reconfigurable Accelerator (FGRA) for reconstruction while the other memory requests are serviced as before. The goal of LI2 is to conquer the high latency/energy required to traverse main memory arrays in the case of an LLC miss by using in-situ construction of the requested data when dealing with low-rank matrices. Thus, LI2 exchanges a high volume of data transfers for a novel lightweight reconstruction method under specific conditions, using a cross-layer hardware/algorithm approach.
- Date Issued
- 2017
- Identifier
- CFE0006754, ucf:51859
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006754
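Note: the abstract above describes a profiler that aggregates LLC block addresses sharing the same Most Significant Bits into a single entry. A minimal sketch of that aggregation step follows. It is an illustration under stated assumptions, not the dissertation's Multi-level Access History Profiler: the class name, the number of low-order bits dropped, and the read-count threshold are hypothetical.

```python
from collections import Counter


class MsbAccessProfiler:
    """Toy profiler: one counter per group of addresses sharing the same high bits."""

    def __init__(self, low_bits=12, read_threshold=4):
        self.low_bits = low_bits              # assumed number of low-order bits to ignore
        self.read_threshold = read_threshold  # assumed cutoff for "read-intensive"
        self.read_counts = Counter()

    def entry_for(self, block_addr):
        # Addresses with identical most significant bits map to the same entry.
        return block_addr >> self.low_bits

    def record_read(self, block_addr):
        self.read_counts[self.entry_for(block_addr)] += 1

    def is_read_intensive(self, block_addr):
        # Looking up a missing key in a Counter returns 0 without creating an entry.
        return self.read_counts[self.entry_for(block_addr)] >= self.read_threshold
```

Sharing one entry across neighboring blocks keeps the table small, at the cost of attributing one block's reads to every block in the same group.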
- Title
- Towards High-Efficiency Data Management In the Next-Generation Persistent Memory System.
- Creator
-
Chen, Xunchao, Wang, Jun, Fan, Deliang, Lin, Mingjie, Ewetz, Rickard, Zhang, Shaojie, University of Central Florida
- Abstract / Description
-
For the sake of higher cell density while achieving near-zero standby power, recent research progress in Magnetic Tunneling Junction (MTJ) devices has leveraged Multi-Level Cell (MLC) configurations of Spin-Transfer Torque Random Access Memory (STT-RAM). However, in order to mitigate the write disturbance in an MLC strategy, data stored in the soft bit must be restored immediately after the hard bit switching is completed. Furthermore, as a result of MTJ feature size scaling, the soft bit can be expected to become disturbed by the read sensing current, thus requiring an immediate restore operation to ensure data reliability. In this paper, we design and analyze a novel Adaptive Restore Scheme for Write Disturbance (ARS-WD) and Read Disturbance (ARS-RD), respectively. ARS-WD alleviates restoration overhead by intentionally overwriting soft bit lines which are less likely to be read. ARS-RD, on the other hand, aggregates the potential writes and restores the soft bit line at the time of its eviction from the higher-level cache. Both schemes are based on a lightweight forecasting approach for the future read behavior of the cache block. Our experimental results show a substantial reduction in soft bit line restore operations. Moreover, ARS promotes the advantages of MLC to provide a preferable L2 design alternative in terms of energy, area, and latency product compared to SLC STT-RAM alternatives. Whereas the popular Cell Split Mapping (CSM) for MLC STT-RAM leverages the inter-block nonuniform access frequency, the intra-block data access features remain untapped in the MLC design. Aiming to minimize the energy-hungry write requests to the Hard-Bit Line (HBL) and maximize the dynamic range in the advantageous Soft-Bit Line (SBL), a hybrid mapping strategy for MLC STT-RAM cache (Double-S) is advocated in the paper. Double-S couples the contemporary Cell-Split-Mapping with the novel Word-Split-Mapping (WSM). A sparse cache block detector and a read-depth-based data allocation/migration policy are proposed to release the full potential of Double-S.
- Date Issued
- 2017
- Identifier
- CFE0006865, ucf:51751
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006865