Current Search: Ewetz, Rickard (x)
View All Items
- Title
- A deep learning approach to diagnosing schizophrenia.
- Creator
-
Barry, Justin, Valliyil Thankachan, Sharma, Gurupur, Varadraj, Jha, Sumit Kumar, Ewetz, Rickard, University of Central Florida
- Abstract / Description
-
In this article, the investigators present a new method using a deep learning approach to diagnose schizophrenia. In the experiment presented, the investigators have used a secondary dataset provided by National Institutes of Health. The aforementioned experimentation involves analyzing this dataset for existence of schizophrenia using traditional machine learning approaches such as logistic regression, support vector machine, and random forest. This is followed by application of deep...
Show moreIn this article, the investigators present a new method using a deep learning approach to diagnose schizophrenia. In the experiment presented, the investigators have used a secondary dataset provided by National Institutes of Health. The aforementioned experimentation involves analyzing this dataset for existence of schizophrenia using traditional machine learning approaches such as logistic regression, support vector machine, and random forest. This is followed by application of deep learning techniques using three hidden layers in the model. The results obtained indicate that deep learning provides state-of-the-art accuracy in diagnosing schizophrenia. Based on these observations, there is a possibility that deep learning may provide a paradigm shift in diagnosing schizophrenia.
Show less - Date Issued
- 2019
- Identifier
- CFE0007429, ucf:52737
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007429
- Title
- Bridging the Gap between Application and Solid-State-Drives.
- Creator
-
Zhou, Jian, Wang, Jun, Lin, Mingjie, Fan, Deliang, Ewetz, Rickard, Qi, GuoJun, University of Central Florida
- Abstract / Description
-
Data storage is one of the important and often critical parts of the computing systemin terms of performance, cost, reliability, and energy.Numerous new memory technologies,such as NAND flash, phase change memory (PCM), magnetic RAM (STT-RAM) and Memristor,have emerged recently.Many of them have already entered the production system.Traditional storage optimization and caching algorithms are far from optimalbecause storage I/Os do not show simple locality.To provide optimal storage we need...
Show moreData storage is one of the important and often critical parts of the computing systemin terms of performance, cost, reliability, and energy.Numerous new memory technologies,such as NAND flash, phase change memory (PCM), magnetic RAM (STT-RAM) and Memristor,have emerged recently.Many of them have already entered the production system.Traditional storage optimization and caching algorithms are far from optimalbecause storage I/Os do not show simple locality.To provide optimal storage we need accurate predictions of I/O behavior.However, the workloads are increasingly dynamic and diverse,making the long and short time I/O prediction challenge.Because of the evolution of the storage technologiesand the increasing diversity of workloads,the storage software is becoming more and more complex.For example, Flash Translation Layer (FTL) is added for NAND-flash based Solid State Disks (NAND-SSDs).However, it introduces overhead such as address translation delay and garbage collection costs.There are many recent studies aim to address the overhead.Unfortunately, there is no one-size-fits-all solution due to the variety of workloads.Despite rapidly evolving in storage technologies,the increasing heterogeneity and diversity in machines and workloadscoupled with the continued data explosionexacerbate the gap between computing and storage speeds.In this dissertation, we improve the data storage performance from both top-down and bottom-up approach.First, we will investigate exposing the storage level parallelismso that applications can avoid I/O contentions and workloads skewwhen scheduling the jobs.Second, we will study how architecture aware task scheduling can improve the performance of the application when PCM based NVRAM are equipped.Third, we will develop an I/O correlation aware flash translation layer for NAND-flash based Solid State Disks.Fourth, we will build a DRAM-based correlation aware FTL emulator and study the performance in various filesystems.
Show less - Date Issued
- 2018
- Identifier
- CFE0007273, ucf:52188
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007273
- Title
- Simulation, Analysis, and Optimization of Heterogeneous CPU-GPU Systems.
- Creator
-
Giles, Christopher, Heinrich, Mark, Ewetz, Rickard, Lin, Mingjie, Pattanaik, Sumanta, Flitsiyan, Elena, University of Central Florida
- Abstract / Description
-
With the computing industry's recent adoption of the Heterogeneous System Architecture (HSA) standard, we have seen a rapid change in heterogeneous CPU-GPU processor designs. State-of-the-art heterogeneous CPU-GPU processors tightly integrate multicore CPUs and multi-compute unit GPUs together on a single die. This brings the MIMD processing capabilities of the CPU and the SIMD processing capabilities of the GPU together into a single cohesive package with new HSA features comprising better...
Show moreWith the computing industry's recent adoption of the Heterogeneous System Architecture (HSA) standard, we have seen a rapid change in heterogeneous CPU-GPU processor designs. State-of-the-art heterogeneous CPU-GPU processors tightly integrate multicore CPUs and multi-compute unit GPUs together on a single die. This brings the MIMD processing capabilities of the CPU and the SIMD processing capabilities of the GPU together into a single cohesive package with new HSA features comprising better programmability, coherency between the CPU and GPU, shared Last Level Cache (LLC), and shared virtual memory address spaces. These advancements can potentially bring marked gains in heterogeneous processor performance and have piqued the interest of researchers who wish to unlock these potential performance gains. Therefore, in this dissertation I explore the heterogeneous CPU-GPU processor and application design space with the goal of answering interesting research questions, such as, (1) what are the architectural design trade-offs in heterogeneous CPU-GPU processors and (2) how do we best maximize heterogeneous CPU-GPU application performance on a given system. To enable my exploration of the heterogeneous CPU-GPU design space, I introduce a novel discrete event-driven simulation library called KnightSim and a novel computer architectural simulator called M2S-CGM. M2S-CGM includes all of the simulation elements necessary to simulate coherent execution between a CPU and GPU with shared LLC and shared virtual memory address spaces. I then utilize M2S-CGM for the conduct of three architectural studies. First, I study the architectural effects of shared LLC and CPU-GPU coherence on the overall performance of non-collaborative GPU-only applications. Second, I profile and analyze a set of collaborative CPU-GPU applications to determine how to best optimize them for maximum collaborative performance. Third, I study the impact of varying four key architectural parameters on collaborative CPU-GPU performance by varying GPU compute unit coalesce size, GPU to memory controller bandwidth, GPU frequency, and system wide switching fabric latency.
Show less - Date Issued
- 2019
- Identifier
- CFE0007807, ucf:52346
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007807
- Title
- Rethinking Routing and Peering in the era of Vertical Integration of Network Functions.
- Creator
-
Dey, Prasun, Yuksel, Murat, Wang, Jun, Ewetz, Rickard, Zhang, Wei, Hasan, Samiul, University of Central Florida
- Abstract / Description
-
Content providers typically control the digital content consumption services and are getting the most revenue by implementing an (")all-you-can-eat(") model via subscription or hyper-targeted advertisements. Revamping the existing Internet architecture and design, a vertical integration where a content provider and access ISP will act as unibody in a sugarcane form seems to be the recent trend. As this vertical integration trend is emerging in the ISP market, it is questionable if existing...
Show moreContent providers typically control the digital content consumption services and are getting the most revenue by implementing an (")all-you-can-eat(") model via subscription or hyper-targeted advertisements. Revamping the existing Internet architecture and design, a vertical integration where a content provider and access ISP will act as unibody in a sugarcane form seems to be the recent trend. As this vertical integration trend is emerging in the ISP market, it is questionable if existing routing architecture will suffice in terms of sustainable economics, peering, and scalability. It is expected that the current routing will need careful modifications and smart innovations to ensure effective and reliable end-to-end packet delivery. This involves new feature developments for handling traffic with reduced latency to tackle routing scalability issues in a more secure way and to offer new services at cheaper costs. Considering the fact that prices of DRAM or TCAM in legacy routers are not necessarily decreasing at the desired pace, cloud computing can be a great solution to manage the increasing computation and memory complexity of routing functions in a centralized manner with optimized expenses. Focusing on the attributes associated with existing routing cost models and by exploring a hybrid approach to SDN, we also compare recent trends in cloud pricing (for both storage and service) to evaluate whether it would be economically beneficial to integrate cloud services with legacy routing for improved cost-efficiency. In terms of peering, using the US as a case study, we show the overlaps between access ISPs and content providers to explore the viability of a future in terms of peering between the new emerging content-dominated sugarcane ISPs and the healthiness of Internet economics. To this end, we introduce meta-peering, a term that encompasses automation efforts related to peering (-) from identifying a list of ISPs likely to peer, to injecting control-plane rules, to continuous monitoring and notifying any violation (-) one of the many outcroppings of vertical integration procedure which could be offered to the ISPs as a standalone service.
Show less - Date Issued
- 2019
- Identifier
- CFE0007797, ucf:52351
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007797
- Title
- Autonomous Discovery and Maintenance of Mobile Frees-Space-Optical Links.
- Creator
-
Khan, Mahmudur, Yuksel, Murat, Pourmohammadi Fallah, Yaser, Ewetz, Rickard, Turgut, Damla, Nam, Boo Hyun, University of Central Florida
- Abstract / Description
-
Free-Space-Optical (FSO) communication has the potential to play a significant role in future generation wireless networks. It is advantageous in terms of improved spectrum utilization, higher data transfer rate, and lower probability of interception from unwanted sources. FSO communication can provide optical-level wireless communication speeds and can also help solve the wireless capacity problem experienced by the traditional RF-based technologies. Despite these advantages, communications...
Show moreFree-Space-Optical (FSO) communication has the potential to play a significant role in future generation wireless networks. It is advantageous in terms of improved spectrum utilization, higher data transfer rate, and lower probability of interception from unwanted sources. FSO communication can provide optical-level wireless communication speeds and can also help solve the wireless capacity problem experienced by the traditional RF-based technologies. Despite these advantages, communications using FSO transceivers require establishment and maintenance of line-of-sight (LOS). We consider autonomous mobile nodes (Unmanned Ground Vehicles or Unmanned Aerial Vehicles), each with one FSO transceiver mounted on a movable head capable of scanning in the horizontal and vertical planes. We propose novel schemes that deal with the problems of automatic discovery, establishment, and maintenance of LOS alignment between these nodes with mechanical steering of the directional FSO transceivers in 2-D and 3-D scenarios. We perform extensive simulations to show the effectiveness of the proposed methods for both neighbor discovery and LOS maintenance. We also present a prototype implementation of such mobile nodes with FSO transceivers. The potency of the neighbor discovery and LOS alignment protocols is evaluated by analyzing the results obtained from both simulations and experiments conducted using the prototype. The results show that, by using such mechanically steerable directional transceivers and the proposed methods, it is possible to establish optical wireless links within practical discovery times and maintain the links in a mobile setting with minimal disruption.
Show less - Date Issued
- 2018
- Identifier
- CFE0007575, ucf:52573
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007575
- Title
- Automated Synthesis of Unconventional Computing Systems.
- Creator
-
Hassen, Amad Ul, Jha, Sumit Kumar, Sundaram, Kalpathy, Fan, Deliang, Ewetz, Rickard, Rahman, Talat, University of Central Florida
- Abstract / Description
-
Despite decades of advancements, modern computing systems which are based on the von Neumann architecture still carry its shortcomings. Moore's law, which had substantially masked the effects of the inherent memory-processor bottleneck of the von Neumann architecture, has slowed down due to transistor dimensions nearing atomic sizes. On the other hand, modern computational requirements, driven by machine learning, pattern recognition, artificial intelligence, data mining, and IoT, are growing...
Show moreDespite decades of advancements, modern computing systems which are based on the von Neumann architecture still carry its shortcomings. Moore's law, which had substantially masked the effects of the inherent memory-processor bottleneck of the von Neumann architecture, has slowed down due to transistor dimensions nearing atomic sizes. On the other hand, modern computational requirements, driven by machine learning, pattern recognition, artificial intelligence, data mining, and IoT, are growing at the fastest pace ever. By their inherent nature, these applications are particularly affected by communication-bottlenecks, because processing them requires a large number of simple operations involving data retrieval and storage. The need to address the problems associated with conventional computing systems at the fundamental level has given rise to several unconventional computing paradigms. In this dissertation, we have made advancements for automated syntheses of two types of unconventional computing paradigms: in-memory computing and stochastic computing. In-memory computing circumvents the problem of limited communication bandwidth by unifying processing and storage at the same physical locations. The advent of nanoelectronic devices in the last decade has made in-memory computing an energy-, area-, and cost-effective alternative to conventional computing. We have used Binary Decision Diagrams (BDDs) for in-memory computing on memristor crossbars. Specifically, we have used Free-BDDs, a special class of binary decision diagrams, for synthesizing crossbars for flow-based in-memory computing. Stochastic computing is a re-emerging discipline with several times smaller area/power requirements as compared to conventional computing systems. It is especially suited for fault-tolerant applications like image processing, artificial intelligence, pattern recognition, etc. We have proposed a decision procedures-based iterative algorithm to synthesize Linear Finite State Machines (LFSM) for stochastically computing non-linear functions such as polynomials, exponentials, and hyperbolic functions.
Show less - Date Issued
- 2019
- Identifier
- CFE0007648, ucf:52462
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007648
- Title
- Automated Synthesis of Memristor Crossbar Networks.
- Creator
-
Chakraborty, Dwaipayan, Jha, Sumit Kumar, Leavens, Gary, Ewetz, Rickard, Valliyil Thankachan, Sharma, Xu, Mengyu, University of Central Florida
- Abstract / Description
-
The advancement of semiconductor device technology over the past decades has enabled the design of increasingly complex electrical and computational machines. Electronic design automation (EDA) has played a significant role in the design and implementation of transistor-based machines. However, as transistors move closer toward their physical limits, the speed-up provided by Moore's law will grind to a halt. Once again, we find ourselves on the verge of a paradigm shift in the computational...
Show moreThe advancement of semiconductor device technology over the past decades has enabled the design of increasingly complex electrical and computational machines. Electronic design automation (EDA) has played a significant role in the design and implementation of transistor-based machines. However, as transistors move closer toward their physical limits, the speed-up provided by Moore's law will grind to a halt. Once again, we find ourselves on the verge of a paradigm shift in the computational sciences as newer devices pave the way for novel approaches to computing. One of such devices is the memristor -- a resistor with non-volatile memory.Memristors can be used as junctional switches in crossbar circuits, which comprise of intersecting sets of vertical and horizontal nanowires. The major contribution of this dissertation lies in automating the design of such crossbar circuits -- doing a new kind of EDA for a new kind of computational machinery. In general, this dissertation attempts to answer the following questions:a. How can we synthesize crossbars for computing large Boolean formulas, up to 128-bit?b. How can we synthesize more compact crossbars for small Boolean formulas, up to 8-bit?c. For a given loop-free C program doing integer arithmetic, is it possible to synthesize an equivalent crossbar circuit?We have presented novel solutions to each of the above problems. Our new, proposed solutions resolve a number of significant bottlenecks in existing research, via the usage of innovative logic representation and artificial intelligence techniques. For large Boolean formulas (up to 128-bit), we have utilized Reduced Ordered Binary Decision Diagrams (ROBDDs) to automatically synthesize linearly growing crossbar circuits that compute them. This cutting edge approach towards flow-based computing has yielded state-of-the-art results. It is worth noting that this approach is scalable to n-bit Boolean formulas. We have made significant original contributions by leveraging artificial intelligence for automatic synthesis of compact crossbar circuits. This inventive method has been expanded to encompass crossbar networks with 1D1M (1-diode-1-memristor) switches, as well. The resultant circuits satisfy the tight constraints of the Feynman Grand Prize challenge and are able to perform 8-bit binary addition. A leading edge development for end-to-end computation with flow-based crossbars has been implemented, which involves methodical translation of loop-free C programs into crossbar circuits via automated synthesis. The original contributions described in this dissertation reflect the substantial progress we have made in the area of electronic design automation for synthesis of memristor crossbar networks.
Show less - Date Issued
- 2019
- Identifier
- CFE0007609, ucf:52528
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007609
- Title
- Improvement of Data-Intensive Applications Running on Cloud Computing Clusters.
- Creator
-
Ibrahim, Ibrahim, Bassiouni, Mostafa, Lin, Mingjie, Zhou, Qun, Ewetz, Rickard, Garibay, Ivan, University of Central Florida
- Abstract / Description
-
MapReduce, designed by Google, is widely used as the most popular distributed programmingmodel in cloud environments. Hadoop, an open-source implementation of MapReduce, is a data management framework on large cluster of commodity machines to handle data-intensive applications. Many famous enterprises including Facebook, Twitter, and Adobehave been using Hadoop for their data-intensive processing needs. Task stragglers in MapReduce jobs dramatically impede job execution on massive datasets in...
Show moreMapReduce, designed by Google, is widely used as the most popular distributed programmingmodel in cloud environments. Hadoop, an open-source implementation of MapReduce, is a data management framework on large cluster of commodity machines to handle data-intensive applications. Many famous enterprises including Facebook, Twitter, and Adobehave been using Hadoop for their data-intensive processing needs. Task stragglers in MapReduce jobs dramatically impede job execution on massive datasets in cloud computing systems. This impedance is due to the uneven distribution of input data and computation load among cluster nodes, heterogeneous data nodes, data skew in reduce phase, resource contention situations, and network configurations. All these reasons may cause delay failure and the violation of job completion time. One of the key issues that can significantly affect the performance of cloud computing is the computation load balancing among cluster nodes. Replica placement in Hadoop distributed file system plays a significant role in data availability and the balanced utilization of clusters. In the current replica placement policy (RPP) of Hadoop distributed file system (HDFS), the replicas of data blocks cannot be evenly distributed across cluster's nodes. The current HDFS must rely on a load balancing utility for balancing the distribution of replicas, which results in extra overhead for time and resources. This dissertation addresses data load balancing problem and presents an innovative replica placement policy for HDFS. It can perfectly balance the data load among cluster's nodes. The heterogeneity of cluster nodes exacerbates the issue of computational load balancing; therefore, another replica placement algorithm has been proposed in this dissertation for heterogeneous cluster environments. The timing of identifying the straggler map task is very important for straggler mitigation in data-intensive cloud computing. To mitigate the straggler map task, Present progress and Feedback based Speculative Execution (PFSE) algorithm has been proposed in this dissertation. PFSE is a new straggler identification scheme to identify the straggler map tasks based on the feedback information received from completed tasks beside the progress of the current running task. Straggler reduce task aggravates the violation of MapReduce job completion time. Straggler reduce task is typically the result of bad data partitioning during the reduce phase. The Hash partitioner employed by Hadoop may cause intermediate data skew, which results in straggler reduce task. In this dissertation a new partitioning scheme, named Balanced Data Clusters Partitioner (BDCP), is proposed to mitigate straggler reduce tasks. BDCP is based on sampling of input data and feedback information about the current processing task. BDCP can assist in straggler mitigation during the reduce phase and minimize the job completion time in MapReduce jobs. The results of extensive experiments corroborate that the algorithms and policies proposed in this dissertation can improve the performance of data-intensive applications running on cloud platforms.
Show less - Date Issued
- 2019
- Identifier
- CFE0007818, ucf:52804
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007818
- Title
- Towards High-Efficiency Data Management In the Next-Generation Persistent Memory System.
- Creator
-
Chen, Xunchao, Wang, Jun, Fan, Deliang, Lin, Mingjie, Ewetz, Rickard, Zhang, Shaojie, University of Central Florida
- Abstract / Description
-
For the sake of higher cell density while achieving near-zero standby power, recent research progress in Magnetic Tunneling Junction (MTJ) devices has leveraged Multi-Level Cell (MLC) configurations of Spin-Transfer Torque Random Access Memory (STT-RAM). However, in order to mitigate the write disturbance in an MLC strategy, data stored in the soft bit must be restored back immediately after the hard bit switching is completed. Furthermore, as the result of MTJ feature size scaling, the soft...
Show moreFor the sake of higher cell density while achieving near-zero standby power, recent research progress in Magnetic Tunneling Junction (MTJ) devices has leveraged Multi-Level Cell (MLC) configurations of Spin-Transfer Torque Random Access Memory (STT-RAM). However, in order to mitigate the write disturbance in an MLC strategy, data stored in the soft bit must be restored back immediately after the hard bit switching is completed. Furthermore, as the result of MTJ feature size scaling, the soft bit can be expected to become disturbed by the read sensing current, thus requiring an immediate restore operation to ensure the data reliability. In this paper, we design and analyze a novel Adaptive Restore Scheme for Write Disturbance (ARS-WD) and Read Disturbance (ARS-RD), respectively. ARS-WD alleviates restoration overhead by intentionally overwriting soft bit lines which are less likely to be read. ARS-RD, on the other hand, aggregates the potential writes and restore the soft bit line at the time of its eviction from higher level cache. Both of these two schemes are based on a lightweight forecasting approach for the future read behavior of the cache block. Our experimental results show substantial reduction in soft bit line restore operations. Moreover, ARS promotes advantages of MLC to provide a preferable L2 design alternative in terms of energy, area and latency product compared to SLC STT-RAM alternatives. Whereas the popular Cell Split Mapping (CSM) for MLC STT-RAM leverages the inter-block nonuniform access frequency, the intra-block data access features remain untapped in the MLC design. Aiming to minimize the energy-hungry write request to Hard-Bit Line (HBL) and maximize the dynamic range in the advantageous Soft-Bit Line (SBL), an hybrid mapping strategy for MLC STT-RAM cache (Double-S) is advocated in the paper. Double-S couples the contemporary Cell-Split-Mapping with the novel Word-Split-Mapping (WSM). Sparse cache block detector and read depth based data allocation/ migration policy are proposed to release the full potential of Double-S.
Show less - Date Issued
- 2017
- Identifier
- CFE0006865, ucf:51751
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006865