METADATA AND DATA MANAGEMENT IN HIGH PERFORMANCE FILE AND STORAGE SYSTEMS
- Date Issued:
- 2008
- Abstract/Description:
- With the advent of emerging "e-Science" applications, today's scientific research increasingly relies on petascale-and-beyond computing over large data sets of the same magnitude. While the computational power of supercomputers has recently entered the petascale era, the performance of their storage systems lags behind by many orders of magnitude. This places an imperative demand on revolutionizing the underlying I/O systems, in which the management of both metadata and data has significant performance implications. Prefetching/caching and data-locality-aware optimizations, as conventional and effective techniques for improving metadata and data I/O performance, still play crucial roles in current parallel and distributed file systems. In this study, we examine the limitations of existing prefetching/caching techniques and explore the untapped potential of data locality optimization techniques in the new era of petascale computing. For metadata I/O access, we propose a novel weighted-graph-based prefetching technique, built on both direct and indirect successor relationships, to reap performance benefits from prefetching specifically for clustered metadata servers, an arrangement envisioned as necessary for petabyte-scale distributed storage systems. For data I/O access, we design and implement Segment-structured On-disk data Grouping and Prefetching (SOGP), a combined prefetching and data placement technique that boosts local data read performance for parallel file systems, especially for applications with partially overlapped access patterns. A high-performance local I/O software package from the SOGP work, written for the Parallel Virtual File System in about 2000 lines of C, was released to Argonne National Laboratory in 2007 for potential integration into production.
Title: | METADATA AND DATA MANAGEMENT IN HIGH PERFORMANCE FILE AND STORAGE SYSTEMS. | |
---|---|---|
Name(s): |
Gu, Peng, Author; Wang, Jun, Committee Chair; University of Central Florida, Degree Grantor |
|
Type of Resource: | text | |
Date Issued: | 2008 | |
Publisher: | University of Central Florida | |
Language(s): | English | |
Identifier: | CFE0002251 (IID), ucf:47826 (fedora) | |
Note(s): |
2008-08-01; Ph.D.; Engineering and Computer Science, School of Electrical Engineering and Computer Science; Doctorate; This record was generated from author-submitted information. |
|
Subject(s): |
Parallel File system; storage; I/O performance |
|
Persistent Link to This Record: | http://purl.flvc.org/ucf/fd/CFE0002251 | |
Restrictions on Access: | campus 2011-08-01 | |
Host Institution: | UCF |