Current Search: Wang, Chung-Ching
- Title
- Energy Efficient and Secure Wireless Sensor Networks Design.
- Creator
-
Attiah, Afraa, Zou, Changchun, Chatterjee, Mainak, Wang, Jun, Yuksel, Murat, Wang, Chung-Ching, University of Central Florida
- Abstract / Description
-
Wireless Sensor Networks (WSNs) are emerging technologies that can sense, process, communicate, and transmit information to a destination, and they are expected to have a significant impact on the efficiency of many applications in various fields. Resource constraints, such as limited battery power, are the greatest challenge in WSN design because they affect the lifetime and performance of the network. An energy-efficient, secure, and trustworthy system is vital when a WSN involves highly sensitive information. Thus, it is critical to design mechanisms that are energy efficient and secure while maintaining the desired level of quality of service. Inspired by these challenges, this dissertation is dedicated to exploiting optimization and game-theoretic approaches to handle several important issues in WSN communication, including energy efficiency, latency, congestion, dynamic traffic load, and security. We present several novel mechanisms to improve the security and energy efficiency of WSNs. Two new schemes are proposed for the network layer stack to achieve the following: (a) to enhance energy efficiency through optimized sleep intervals that also consider the underlying dynamic traffic load, and (b) to develop the routing protocol in order to handle wasted energy, congestion, and clustering. We also propose efficient routing and energy-efficient clustering algorithms based on optimization and game theory. Furthermore, we propose a dynamic game-theoretic framework (i.e., hyper defense) to analyze the interactions between attacker and defender as a non-cooperative security game that considers the resource limitation. All the proposed schemes are validated by extensive experimental analyses, obtained by running simulations depicting various situations in WSNs in order to represent real-world scenarios as realistically as possible. The results show that the proposed schemes achieve high performance on several measures, such as network lifetime, compared with state-of-the-art schemes.
- Date Issued
- 2018
- Identifier
- CFE0006971, ucf:51672
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006971
- Title
- Masquerading Techniques in IEEE 802.11 Wireless Local Area Networks.
- Creator
-
Nakhila, Omar, Zou, Changchun, Turgut, Damla, Bassiouni, Mostafa, Chatterjee, Mainak, Wang, Chung-Ching, University of Central Florida
- Abstract / Description
-
The airborne nature of wireless transmission offers a potential target for attackers to compromise an IEEE 802.11 Wireless Local Area Network (WLAN). In this dissertation, we explore current WLAN security threats and their corresponding defense solutions. In our study, we divide WLAN vulnerabilities into two aspects: client and administrator. The client-side vulnerability investigation is based on examining the Evil Twin Attack (ETA), while our administrator-side research targets Wi-Fi Protected Access II (WPA2). Three novel techniques are presented to detect ETA. The detection methods are based on (1) creating a secure connection to a remote server to detect the change of the gateway's public IP address when switching from one Access Point (AP) to another; (2) monitoring multiple Wi-Fi channels in a random order, looking for specific data packets sent by the remote server; and (3) merging the previous solutions into one universal ETA detection method using Virtual Wireless Clients (VWCs). On the other hand, we present a new vulnerability that allows an attacker to force the victim's smartphone to consume data through the cellular network by starting a data download on the victim's cell phone without the victim's permission. A new scheme has been developed to speed up the active dictionary attack on WPA2, based on two novel ideas. First, the scheme connects multiple VWCs to the AP at the same time, each VWC with its own spoofed MAC address. Second, each VWC can try many passphrases using a single wireless session. Furthermore, we present a new technique to avoid the bandwidth limitation imposed by Wi-Fi hotspots. The proposed method creates multiple VWCs to access the WLAN. Combining the individual bandwidth of each VWC increases the total bandwidth gained by the attacker. All proposed techniques have been implemented and evaluated in real-life scenarios.
- Date Issued
- 2018
- Identifier
- CFE0007063, ucf:51979
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007063
- Title
- Techniques for boosting the performance in Content-Based Image Retrieval Systems.
- Creator
-
Yu, Ning, Hua, Kien, Hughes, Charles, Dutton, Ronald, Wang, Chung-Ching, University of Central Florida
- Abstract / Description
-
Content-Based Image Retrieval (CBIR) has been an active research area for decades. In a CBIR system, one or more images are used as a query to search for similar images. Similarity is measured on low-level features such as color, shape, edge, and texture. First, each image is processed and its visual features are extracted, so that each image becomes a point in the feature space. Then, if two images are close to each other in the feature space, they are considered similar. That is, the k nearest neighbors are considered the most similar images to the query image. In this k-Nearest Neighbor (k-NN) model, semantically similar images are assumed to be clustered together in a single neighborhood in the high-dimensional feature space. Unfortunately, semantically similar images with different appearances are often clustered into distinct neighborhoods, which may be scattered across the feature space. Hence, confinement of the search results to a single neighborhood is the latent reason for the low recall rate of typical nearest-neighbor techniques. In this dissertation, a new image retrieval technique, the Query Decomposition (QD) model, is introduced. QD facilitates retrieval of semantically similar images from multiple neighborhoods in the feature space and hence bridges the semantic gap between the images' low-level features and their high-level semantic meaning. In the QD model, a query may be decomposed into multiple subqueries based on the user's relevance feedback to cover multiple image clusters that contain semantically similar images. The retrieval results are the k most similar images from multiple discontinuous relevant clusters. To apply the benefits of the QD study, a mobile client-side relevance feedback study was conducted. With the proliferation of handheld devices, the demand for multimedia information retrieval on mobile devices has attracted more attention. A relevance feedback information retrieval process usually includes several rounds of query refinement. Each round incurs an exchange of tens of images between the mobile device and the server. With limited wireless bandwidth, this process can incur substantial delay, making the system unfriendly to use. The Relevance Feedback Support (RFS) structure that was designed for the QD technique was adopted for Client-side Relevance Feedback (CRF). Since relevance feedback is done on the client side, system response is instantaneous, significantly enhancing system usability. Furthermore, since the server is not involved in relevance feedback processing, it is able to support thousands more users simultaneously. While the QD technique improves the accuracy of CBIR systems, another study in this dissertation, called in-memory relevance feedback, improves their efficiency. Current methods rely on searching the database, stored on disks, in each round of relevance feedback. This strategy incurs long delays, making relevance feedback less friendly to the user, especially for very large databases. Thus, scalability is a limitation of existing solutions. The proposed in-memory relevance feedback technique substantially reduces the delay associated with feedback processing and therefore improves system usability. A data-independent dimensionality-reduction technique is used to compress the metadata to build a small in-memory database that supports relevance feedback operations with minimal disk accesses. The performance of this approach is compared with conventional relevance feedback techniques in terms of computation efficiency and retrieval accuracy. The results indicate that the new technique substantially reduces response time for user feedback while maintaining the quality of the retrieval. In the previous studies, the QD technique relies on a pre-defined Relevance Feedback Support structure. The results and user experience indicated that this structure might confine the search range and affect the result. In this dissertation, a novel Multiple Direction Search framework for semi-automatic annotation propagation is studied. In this system, the user interacts with the system to provide example images and the corresponding annotations during the annotation propagation process. In each iteration, the example images are dynamically clustered and the corresponding annotations are propagated separately to each cluster: images in the local neighborhood are annotated. Furthermore, some of those images are returned to the user for further annotation. As the user marks more images, the annotation process goes in multiple directions in the feature space. The query movements can be treated as multiple-path navigation. Each path can be further split based on the user's input. In this manner, the system provides accurate annotation assistance to the user: images with the same semantic meaning but different visual characteristics can be handled effectively. In comprehensive experiments on the Corel and University of Washington image databases, the proposed technique shows accuracy and efficiency in annotating image databases.
- Date Issued
- 2011
- Identifier
- CFE0004182, ucf:49058
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004182
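The k-NN retrieval model described in the abstract above, and the idea of covering several neighborhoods with multiple subqueries, can be illustrated with a minimal sketch. This is not the dissertation's Query Decomposition implementation: the function names, the toy 2-D feature vectors, and the rule of ranking each image by its distance to the closest subquery are all assumptions made for illustration.

```python
import math


def k_nearest(query, features, k):
    """Return the ids of the k images whose feature vectors are closest to `query`."""
    ranked = sorted(features, key=lambda image_id: math.dist(query, features[image_id]))
    return ranked[:k]


def k_nearest_multi(subqueries, features, k):
    """Rank each image by its distance to the closest subquery.

    Assumed simplification: this lets results come from several separate
    neighborhoods instead of a single one around one query point.
    """
    def distance_to_closest_subquery(image_id):
        return min(math.dist(q, features[image_id]) for q in subqueries)

    return sorted(features, key=distance_to_closest_subquery)[:k]


# Hypothetical 2-D feature vectors keyed by image id.
features = {
    "a": (0.0, 0.0),
    "b": (1.0, 0.0),
    "c": (0.0, 2.0),
    "d": (5.0, 5.0),
}

single = k_nearest((0.1, 0.0), features, k=2)
multi = k_nearest_multi([(0.0, 0.0), (5.0, 5.0)], features, k=2)
```

With a single query point the results stay in one neighborhood, while the multi-subquery variant can return images from two distant clusters, which is the recall problem the abstract describes.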
- Title
- Networking and security solutions for VANET initial deployment stage.
- Creator
-
Aslam, Baber, Zou, Changchun, Turgut, Damla, Bassiouni, Mostafa, Wang, Chung-Ching, University of Central Florida
- Abstract / Description
-
A vehicular ad hoc network (VANET) is a special case of mobile networks, where vehicles equipped with computing/communicating devices (called "smart vehicles") are the mobile wireless nodes. However, the movement pattern of these mobile wireless nodes is no longer random, as in the case of mobile networks; rather, it is restricted to roads and streets. Vehicular networks have a hybrid architecture: a combination of both infrastructure and infrastructure-less architectures. Direct vehicle-to-vehicle (V2V) communication is infrastructure-less, or ad hoc, in nature. Here the vehicles traveling within communication range of each other form an ad hoc network. On the other hand, vehicle-to-infrastructure (V2I) communication has an infrastructure architecture in which vehicles connect to access points deployed along roads. These access points are known as road side units (RSUs), and vehicles communicate with other vehicles/wired nodes through these RSUs. To provide various services to vehicles, RSUs are generally connected to each other and to the Internet. Direct RSU-to-RSU communication is also referred to as I2I communication. The success of VANET depends on the existence of a pervasive roadside infrastructure and a sufficient number of smart vehicles. Most VANET applications and services are based on either one or both of these requirements. A fully matured VANET will have a pervasive roadside network and enough vehicle density to enable VANET applications. However, the initial deployment stage of VANET will be characterized by the lack of pervasive roadside infrastructure and low market penetration of smart vehicles. It will be economically infeasible to initially install a pervasive and fully networked roadside infrastructure, which could result in the failure of applications and services that depend on V2I or I2I communications. Further, low market penetration means there is an insufficient number of smart vehicles to enable V2V communication, which could result in the failure of services and applications that depend on V2V communications. Non-availability of pervasive connectivity to certification authorities and the dynamic locations of each vehicle will make it difficult and expensive to implement security solutions that are based on a central certificate management authority. Non-availability of pervasive connectivity will also affect the backend connectivity of vehicles to the Internet or the rest of the world. Due to economic considerations, the installation of roadside infrastructure will take a long time and will be incremental, thus resulting in a heterogeneous infrastructure with inconsistent capabilities. Similarly, smart vehicles will also have varying degrees of capability. This will result in the failure of applications and services that have very strict requirements on V2I or V2V communications. We have proposed several solutions to overcome the challenges described above that will be faced during the initial deployment stage of VANET. Specifically, we have proposed: 1) a VANET architecture that can provide services with a limited number of heterogeneous roadside units and smart vehicles with varying capabilities, 2) a backend connectivity solution that provides connectivity between the Internet and smart vehicles without requiring a pervasive roadside infrastructure or a large number of smart vehicles, 3) a security architecture that does not depend on a pervasive roadside infrastructure or a fully connected V2V network and fulfills all the security requirements, and 4) optimization solutions for the placement of a limited number of RSUs within a given area to provide the best possible service to smart vehicles. The optimal placement solutions cover both urban and highway environments.
- Date Issued
- 2012
- Identifier
- CFE0004186, ucf:48993
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004186
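The abstract above mentions placing a limited number of RSUs to serve an area, without stating the optimization method. As a rough illustration of that kind of problem only, the sketch below uses a generic greedy maximum-coverage heuristic. The heuristic, the candidate sites, and the road-segment ids are all assumptions and should not be read as the dissertation's placement algorithm.

```python
def greedy_placement(candidates, budget):
    """Pick up to `budget` sites, each time taking the site that covers
    the most road segments not yet covered (greedy maximum coverage).

    candidates: dict mapping a hypothetical site name to the set of
    road-segment ids an RSU at that site would cover.
    """
    chosen = []
    covered = set()
    for _ in range(budget):
        best = max(candidates, key=lambda site: len(candidates[site] - covered), default=None)
        # Stop early when there are no sites or no site adds new coverage.
        if best is None or not candidates[best] - covered:
            break
        chosen.append(best)
        covered |= candidates[best]
    return chosen, covered


# Hypothetical candidate sites and the segments each would cover.
sites = {
    "A": {1, 2, 3},
    "B": {3, 4},
    "C": {4, 5, 6, 7},
    "D": {1, 7},
}

chosen, covered = greedy_placement(sites, budget=2)
```

A greedy rule like this is simple and fast but is not guaranteed to find the best placement; an exact formulation would typically use integer programming.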
- Title
- Batch and Online Implicit Weighted Gaussian Processes for Robust Novelty Detection.
- Creator
-
Ramirez Padron, Ruben, Gonzalez, Avelino, Georgiopoulos, Michael, Stanley, Kenneth, Mederos, Boris, Wang, Chung-Ching, University of Central Florida
- Abstract / Description
-
This dissertation aims mainly at obtaining robust variants of Gaussian processes (GPs) that do not require using non-Gaussian likelihoods to compensate for outliers in the training data. Bayesian kernel methods, and in particular GPs, have been used to solve a variety of machine learning problems, equaling or exceeding the performance of other successful techniques. That is the case of a recently proposed approach to GP-based novelty detection that uses standard GPs (i.e., GPs employing Gaussian likelihoods). However, standard GPs are sensitive to outliers in training data, and this limitation carries over to GP-based novelty detection. This limitation has typically been addressed by using robust non-Gaussian likelihoods. However, non-Gaussian likelihoods lead to analytically intractable inferences, which require using approximation techniques that are typically complex and computationally expensive. Inspired by the use of weights in quasi-robust statistics, this work introduces a particular type of weight function, called here data weighers, in order to obtain robust GPs that do not require approximation techniques and retain the simplicity of standard GPs. This work proposes implicit weighted variants of batch GP, online GP, and sparse online GP (SOGP) that employ weighted Gaussian likelihoods. Mathematical expressions for calculating the posterior implicit weighted GPs are derived in this work. In our experiments, novelty detection based on our weighted batch GPs consistently and significantly outperformed standard batch GP-based novelty detection whenever data was contaminated with outliers. Additionally, our experiments show that novelty detection based on online GPs can perform similarly to batch GP-based novelty detection. Membership scores previously introduced by other authors are also compared in our experiments.
- Date Issued
- 2015
- Identifier
- CFE0005869, ucf:50858
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0005869
- Title
- Antecedents of Emotional Labor and Job Satisfaction in the Hospitality Industry.
- Creator
-
Shapoval, Valeriya, Pizam, Abraham, Murphy, Kevin, Kwun, David, Wang, Chung-Ching, Joseph, Dana, University of Central Florida
- Abstract / Description
-
It is a general policy in the hotel industry that all service should be provided in a friendly and professional manner. The first smile of a front desk clerk or a member of the wait staff can make a difference in customer satisfaction and loyalty. Service quality is becoming more important with the increase of competitiveness among hotels and hotel brands. The process of regulating positive emotions for an organization is called Emotional Labor (EL) (Grandey, 2000). While EL is essential for the hospitality industry, empirical research on it is very limited, and research on EL during stressful situations is almost nonexistent. To reduce the gap in prior research, this study looks into the dynamics of perceived organizational and customer (in)justice as a stress factor on an employee's EL and subsequent job satisfaction. To further understand the dynamics of the proposed model, variables such as gender and intensity of interaction were used as moderators. This study extended research done by Spencer and Rupp (2006, 2009) on employees' perceived customer injustice and its effects on employees' EL. This study drew on fairness, affective events, referent cognition, social exchange, and action theories to explain why individuals' EL is impacted by injustice from guests and the organization. Four types of organizational justice (procedural, distributive, interpersonal, and informational) were used in this research. The results of the study indicated that employees' EL (effort, dissonance) increases with increased effects of distributive (in)justice. EL dissonance had a significant negative effect on job satisfaction, and EL effort had a significant positive effect on job satisfaction. Finally, procedural (in)justice and informational (in)justice had greater effects on male employees than on their female counterparts. Since this study is the first to explore the effects of four facets of organizational (in)justice on employees' EL and job satisfaction, with gender as a moderator, it offers multiple theoretical and managerial implications for the evaluation of EL and its antecedents in the hospitality industry.
- Date Issued
- 2016
- Identifier
- CFE0006393, ucf:51505
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006393
- Title
- Hashing for Multimedia Similarity Modeling and Large-Scale Retrieval.
- Creator
-
Li, Kai, Hua, Kien, Qi, GuoJun, Hu, Haiyan, Wang, Chung-Ching, University of Central Florida
- Abstract / Description
-
In recent years, the amount of multimedia data such as images, texts, and videos has been growing rapidly on the Internet. Motivated by such trends, this thesis is dedicated to exploiting hashing-based solutions to reveal multimedia data correlations and support intra-media and inter-media similarity search among huge volumes of multimedia data. We start by investigating a hashing-based solution for audio-visual similarity modeling and apply it to the audio-visual sound source localization problem. We show that synchronized signals in audio and visual modalities demonstrate similar temporal changing patterns in certain feature spaces. We propose to use a permutation-based random hashing technique to capture the temporal order dynamics of audio and visual features by hashing them along the temporal axis into a common Hamming space. In this way, the audio-visual correlation problem is transformed into a similarity search problem in the Hamming space. Our hashing-based audio-visual similarity modeling has shown superior performance in the localization and segmentation of sounding objects in videos. The success of the permutation-based hashing method motivates us to generalize and formally define the supervised ranking-based hashing problem, and study its application to large-scale image retrieval. Specifically, we propose an effective supervised learning procedure to learn optimized ranking-based hash functions that can be used for large-scale similarity search. Compared with the randomized version, the optimized ranking-based hash codes are much more compact and discriminative. Moreover, the method can be easily extended to kernel space to discover more complex ranking structures that cannot be revealed in linear subspaces. Experiments on large image datasets demonstrate the effectiveness of the proposed method for image retrieval. We further study the ranking-based hashing method for the cross-media similarity search problem. Specifically, we propose two optimization methods to jointly learn two groups of linear subspaces, one for each media type, so that features' ranking orders in different linear subspaces maximally preserve the cross-media similarities. Additionally, we develop this ranking-based hashing method in the cross-media context into a flexible hashing framework with a more general solution. We have demonstrated through extensive experiments on several real-world datasets that the proposed cross-media hashing method can achieve superior cross-media retrieval performance against several state-of-the-art algorithms. Lastly, to make better use of the supervisory label information, as well as to further improve the efficiency and accuracy of supervised hashing, we propose a novel multimedia discrete hashing framework that optimizes an instance-wise loss objective, as compared to the pairwise losses, using an efficient discrete optimization method. In addition, the proposed method decouples the binary code learning and hash function learning into two separate stages, thus making the proposed method equally applicable to both single-media and cross-media search. Extensive experiments on both single-media and cross-media retrieval tasks demonstrate the effectiveness of the proposed method.
- Date Issued
- 2017
- Identifier
- CFE0006759, ucf:51840
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006759
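To make the idea of permutation-based, rank-preserving hashing in the abstract above more concrete, here is a minimal sketch in the style of winner-take-all hashing. It is an assumed stand-in, not the thesis's method: the function names, the `window` parameter, and the sample vector are hypothetical. Each code symbol records which of the first `window` permuted entries is largest, so the code depends only on the ordering of the values, and codes are compared by Hamming distance.

```python
import random


def make_permutations(dim, n_codes, seed=0):
    """Draw `n_codes` random permutations of the indices 0..dim-1."""
    rng = random.Random(seed)
    perms = []
    for _ in range(n_codes):
        perm = list(range(dim))
        rng.shuffle(perm)
        perms.append(perm)
    return perms


def rank_hash(vector, perms, window):
    """For each permutation, look at the first `window` permuted entries
    and record the position (0..window-1) of the largest one."""
    code = []
    for perm in perms:
        head = [vector[i] for i in perm[:window]]
        code.append(head.index(max(head)))
    return code


def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical feature sequence, e.g. one feature sampled at six time steps.
v = [0.2, 0.9, 0.4, 0.1, 0.7, 0.3]
scaled = [10 * x + 3 for x in v]  # same ordering, different magnitudes

perms = make_permutations(dim=len(v), n_codes=8, seed=1)
code_v = rank_hash(v, perms, window=3)
code_scaled = rank_hash(scaled, perms, window=3)
distance = hamming(code_v, code_scaled)
```

Because the two vectors share the same rank order, their codes are identical and the Hamming distance is zero, which is the property that lets signals from different modalities be compared by their ordering patterns rather than by raw magnitudes.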
- Title
- Opportunistic Spectrum Utilization by Cognitive Radio Networks: Challenges and Solutions.
- Creator
-
Amjad, Muhammad Faisal, Zou, Changchun, Bassiouni, Mostafa, Turgut, Damla, Wang, Chung-Ching, University of Central Florida
- Abstract / Description
-
A Cognitive Radio Network (CRN) is an emerging paradigm that makes use of Dynamic Spectrum Access (DSA) to communicate opportunistically in the unlicensed Industrial, Scientific and Medical bands or in frequency bands otherwise licensed to incumbent users such as TV broadcast. Interest in the development of CRNs stems from severe under-utilization of spectrum bands by the incumbent Primary Users (PUs) that have the license to use them, coupled with an ever-increasing demand for unlicensed spectrum for a variety of new mobile and wireless applications. The essence of Cognitive Radio (CR) operation is the cooperative and opportunistic utilization of licensed spectrum bands by the Secondary Users (SUs) that collectively form the CRN, without causing any interference to PUs' communications. CRN operation is characterized by factors such as network-wide quiet periods for cooperative spectrum sensing, opportunistic/dynamic spectrum access, and non-deterministic operation of PUs. These factors can have a devastating impact on the overall throughput and can significantly increase the control overheads. Therefore, to support the same level of QoS as traditional wireless access technologies, much closer interaction is required between layers of the protocol stack. Opportunistic spectrum utilization without causing interference to the PUs is only possible if the SUs periodically sense the spectrum for the presence of PUs' signals. To minimize the effects of hardware capabilities, terrain features, and PUs' transmission ranges, DSA is undertaken in a collaborative manner in which SUs periodically carry out spectrum sensing in their respective geographical locations. Collaborative spectrum sensing has numerous security loopholes and can be favorable to malicious nodes in the network that may exploit vulnerabilities associated with DSA, such as by launching a spectrum sensing data falsification (SSDF) attack. Some CRN standards, such as the IEEE 802.22 wireless regional area network, employ a two-stage quiet period mechanism based on a mandatory Fast Sensing stage and an optional Fine Sensing stage for DSA. This arrangement is meant to strike a balance between the conflicting goals of proper protection of incumbent PUs' signals and optimum QoS for SUs, so that only as much time is spent on spectrum sensing as needed. Malicious nodes in the CRN, however, can take advantage of the two-stage spectrum sensing mechanism to launch smart denial of service (DoS) jamming attacks on CRNs during the fast sensing stage. Coexistence protocols enable collocated CRNs to contend for and share the available spectrum. However, most coexistence protocols do not take into consideration the fact that channels of the available spectrum can be heterogeneous, in the sense that they can vary in characteristics and quality such as SNR or bandwidth. Without any mechanism to enforce fairness in accessing channels of varying quality, ensuring coexistence with minimal contention and efficient spectrum utilization for CRNs is likely to become a very difficult task. The cooperative and opportunistic nature of communication poses many challenges for CRNs' operation. In view of the challenges described above, this dissertation presents solutions, including cross-layer approaches, a reputation system, and optimization and game-theoretic approaches, to handle (1) degradation in TCP's throughput resulting from packet losses and disruptions in spectrum availability due to non-deterministic use of spectrum by the PUs, (2) the presence of malicious SUs in the CRN that may launch various attacks on CRNs, including SSDF and jamming, and (3) sharing of heterogeneous spectrum resources among collocated CRNs without a centralized mechanism to enforce cooperation among otherwise non-cooperative CRNs.
- Date Issued
- 2015
- Identifier
- CFE0005571, ucf:50249
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0005571
- Title
- Research on High-performance and Scalable Data Access in Parallel Big Data Computing.
- Creator
-
Yin, Jiangling, Wang, Jun, Jin, Yier, Lin, Mingjie, Qi, GuoJun, Wang, Chung-Ching, University of Central Florida
- Abstract / Description
-
To facilitate big data processing, many dedicated data-intensive storage systems such as Google File System(GFS), Hadoop Distributed File System(HDFS) and Quantcast File System(QFS) have been developed. Currently, the Hadoop Distributed File System(HDFS) [20] is the state-of-art and most popular open-source distributed file system for big data processing. It is widely deployed as the bedrock for many big data processing systems/frameworks, such as the script-based pig system, MPI-based...
Show moreTo facilitate big data processing, many dedicated data-intensive storage systems such as Google File System(GFS), Hadoop Distributed File System(HDFS) and Quantcast File System(QFS) have been developed. Currently, the Hadoop Distributed File System(HDFS) [20] is the state-of-art and most popular open-source distributed file system for big data processing. It is widely deployed as the bedrock for many big data processing systems/frameworks, such as the script-based pig system, MPI-based parallel programs, graph processing systems and scala/java-based Spark frameworks. These systems/applications employ parallel processes/executors to speed up data processing within scale-out clusters.Job or task schedulers in parallel big data applications such as mpiBLAST and ParaView can maximize the usage of computing resources such as memory and CPU by tracking resource consumption/availability for task assignment. However, since these schedulers do not take the distributed I/O resources and global data distribution into consideration, the data requests from parallel processes/executors in big data processing will unfortunately be served in an imbalanced fashion on the distributed storage servers. These imbalanced access patterns among storage nodes are caused because a). unlike conventional parallel file system using striping policies to evenly distribute data among storage nodes, data-intensive file systems such as HDFS store each data unit, referred to as chunk or block file, with several copies based on a relative random policy, which can result in an uneven data distribution among storage nodes; b). based on the data retrieval policy in HDFS, the more data a storage node contains, the higher the probability that the storage node could be selected to serve the data. Therefore, on the nodes serving multiple chunk files, the data requests from different processes/executors will compete for shared resources such as hard disk head and network bandwidth. 
Because of this, the makespan of the entire program could be significantly prolonged and the overall I/O performance will degrade. The first part of my dissertation seeks to address aspects of these problems by creating an I/O middleware system and designing matching-based algorithms to optimize data access in parallel big data processing. To address the problem of remote data movement, we develop an I/O middleware system, called SLAM, which allows MPI-based analysis and visualization programs to benefit from locality-aware reads, i.e., each MPI process can access its required data from a local or nearby storage node. This can greatly improve execution performance by reducing the amount of data movement over the network. Furthermore, to address the problem of imbalanced data access, we propose a method called Opass, which models the data read requests issued by parallel applications to cluster nodes as a graph data structure in which edge weights encode the demands on load capacity. We then employ matching-based algorithms to map processes to data so that data access is balanced. The final part of my dissertation focuses on optimizing sub-dataset analyses in parallel big data processing. Our proposed methods can benefit different analysis applications with various computational requirements, and experiments on different cluster testbeds show their applicability and scalability.
- Date Issued
- 2015
- Identifier
- CFE0006021, ucf:51008
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006021
- Title
- Evaluation of crash modification factors and functions including time trends at intersections.
- Creator
-
Wang, Jung-Han, Abdel-Aty, Mohamed, Radwan, Essam, Eluru, Naveen, Lee, JaeYoung, Wang, Chung-Ching, University of Central Florida
- Abstract / Description
-
Traffic demand has increased as the population has grown. The US population reached 313,914,040 in 2012 (US Census Bureau, 2015). Increased travel demand may affect roadway safety and the operational characteristics of roadways. Total crashes and injury crashes at intersections accounted for 40% and 44% of traffic crashes, respectively, on US roadways in 2007 according to the Intersection Safety Issue Brief (FHWA, 2009). Traffic researchers and engineers have developed a quantitative measure of the safety effectiveness of treatments in the form of crash modification factors (CMFs). Based on CMFs from multiple studies, the Highway Safety Manual (HSM) Part D (AASHTO, 2010) provides CMFs that can be used to determine the expected reduction or increase in crashes after treatments are installed. Even though CMFs have been introduced in the HSM, there are still limitations that need to be investigated. One important potential limitation is that the HSM provides various CMFs as fixed values, rather than CMFs under different configurations. In this dissertation, the CMFs were estimated using the observational before-after study to show that the CMFs vary across different traffic volume levels when signalizing intersections. Beyond the effect of traffic volume, previous studies showed that CMFs could vary over time after the treatment was implemented. Thus, in this dissertation, the trends of CMFs for signalization and for adding red light running cameras (RLCs) were evaluated. CMFs for these treatments were measured for each month and for 90-day moving windows using the time series ARMA model. The results for signalization show that the CMFs for rear-end crashes were lower in the early phase after signalization but gradually increased from the 9th month. 
It was also found that the safety effectiveness is significantly worse 18 months after installing RLCs. Although efforts have been made to seek reliable CMFs, the best estimate of CMFs is still widely debated. Since CMFs are non-zero estimates, the population of all CMFs does not follow a normal distribution, and even if it did, the true mean of CMFs at some intersections may be different from that at others. Therefore, a bootstrap method that makes no distributional assumptions was proposed to estimate CMFs. Through examining the distribution of CMFs estimated from bootstrapped resamples, a CMF precision rating method is suggested to evaluate the reliability of the estimated CMFs. The results show that the estimated CMF for angle+left-turn crashes after signalization has the highest precision, while estimates of the CMF for rear-end crashes are extremely unreliable. The CMFs for KABCO, KABC, and KAB crashes proved to be reliable for the majority of intersections, but the estimated effect of signalization may not be accurate at some sites. In addition, although the bootstrap method provides a quantitative measure of the reliability of CMFs, CMF transferability remains questionable. Since the development of CMFs requires safety performance functions (SPFs), could CMFs be developed using the SPFs from other states in the United States? This research applies the empirical Bayes method to develop CMFs using several SPFs from different jurisdictions, adjusted by calibration factors. After examination, it is found that applying SPFs from other jurisdictions is not desirable when developing CMFs. The process of estimating CMFs using before-after studies requires an understanding of multiple statistical principles. To simplify the process of CMF estimation and make CMF research accessible and reproducible, this dissertation includes an open-source statistics package built in R (R, 2013). 
With this package, authorities are able to estimate reliable CMFs following the procedure suggested by FHWA. In addition, this software package is equipped with a graphical interface that integrates the algorithm for calculating CMFs, so that users can perform CMF calculations with minimal programming prerequisites. The expected contributions of this study are to 1) propose methodologies for CMFs to assess the variation of CMFs with different characteristics among treated sites, 2) suggest new objective criteria to judge the reliability of safety estimation, 3) examine the transferability of SPFs when developing CMFs using before-after studies, and 4) develop statistical software to calculate CMFs. Finally, potential relevant applications that are beyond the scope of this research, but worth investigating in the future, are discussed in this dissertation.
- Date Issued
- 2016
- Identifier
- CFE0006413, ucf:51454
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006413
- Title
- Exploration and development of crash modification factors and functions for single and multiple treatments.
- Creator
-
Park, Juneyoung, Abdel-Aty, Mohamed, Radwan, Essam, Eluru, Naveen, Wang, Chung-Ching, Lee, JaeYoung, University of Central Florida
- Abstract / Description
-
Traffic safety is a major concern for the public, and it is an important component of the roadway management strategy. In order to improve highway safety, extensive efforts have been made by researchers, transportation engineers, and Federal, State, and local government officials. With these consistent efforts, both fatality and injury rates from road traffic crashes in the United States steadily declined over six years (2006-2011). However, according to the National Highway Traffic Safety Administration (NHTSA, 2013), 33,561 people died in motor vehicle traffic crashes in the United States in 2012, compared to 32,479 in 2011, the first increase in fatalities since 2005. Moreover, in 2012, an estimated 2.36 million people were injured in motor vehicle traffic crashes, compared to 2.22 million in 2011. Due to the demand for highway safety improvements through systematic analysis of specific roadway cross-section elements and treatments, the Highway Safety Manual (HSM) (AASHTO, 2010) was developed by the Transportation Research Board (TRB) to introduce a science-based technical approach for safety analysis. One of the main parts of the HSM, Part D, contains crash modification factors (CMFs) for various treatments on roadway segments and at intersections. A CMF is a factor that can estimate potential changes in crash frequency as a result of implementing a specific treatment (or countermeasure). CMFs in Part D have been developed using high-quality observational before-after studies that account for the regression-to-the-mean threat. Observational before-after studies are the most common methods for evaluating safety effectiveness and calculating CMFs of specific roadway treatments. 
Moreover, the cross-sectional method has commonly been used to derive CMFs since it is easier to collect the data compared to before-after methods. Although various CMFs have been calculated and introduced in the HSM, there are still critical limitations that need to be investigated. First, the HSM provides various CMFs for single treatments, but not CMFs for multiple treatments on roadway segments. The HSM suggests that CMFs be multiplied to estimate the combined safety effects of single treatments. However, the HSM cautions that the multiplication of the CMFs may over- or under-estimate the combined effects of multiple treatments. In this dissertation, several methodologies are proposed to estimate more reliable combined safety effects in both observational before-after studies and the cross-sectional method. Averaging the two best combining methods is suggested to account for the effects of over- or under-estimation. Moreover, it is recommended to develop adjustment factors and functions (i.e., weighting factors and functions) to estimate safety performance more accurately when assessing the safety effects of multiple treatments. Multivariate adaptive regression splines (MARS) modeling is proposed in this dissertation to avoid the over-estimation problem by considering interaction impacts between variables. Second, the variation of CMFs with different roadway characteristics among treated sites over time is ignored because the CMF is a fixed value that represents the overall safety effect of the treatment for all treated sites for specific time periods. Recently, a few studies developed crash modification functions (CMFunctions) to overcome this limitation. However, although previous studies assessed the effect of a specific single variable such as AADT on the CMFs, there is a lack of prior studies on the variation in the safety effects of treated sites with different multiple roadway characteristics over time. 
In this study, various multivariate linear and nonlinear modeling techniques are suggested for developing CMFunctions. Multiple linear regression modeling can be used to consider multiple different roadway characteristics. To reflect the nonlinearity of predictors, a regression model with a nonlinearizing link function needs to be developed. The Bayesian approach can also be adopted because of its ability to avoid the overfitting that occurs when the number of observations is limited and the number of variables is large. Moreover, two data mining techniques (i.e., gradient boosting and MARS) are suggested 1) to achieve better performance of CMFunctions with consideration of variable importance, and 2) to reflect both the nonlinear trend of predictors and interaction impacts between variables at the same time. Third, the nonlinearity of variables in the cross-sectional method is not discussed in the HSM. Generally, the cross-sectional method is also known as safety performance functions (SPFs), and a generalized linear model (GLM) is applied to estimate SPFs. However, the CMFs estimated from a GLM cannot account for the nonlinear effect of the treatment since the coefficients in the GLM are assumed to be fixed. In this dissertation, applications of the generalized nonlinear model (GNM) and MARS in the cross-sectional method are proposed. In GNMs, the nonlinear effects of independent variables in crash analysis can be captured by developing a nonlinearizing link function. Moreover, MARS accommodates the nonlinearity of independent variables and interaction effects for complex data structures. In this dissertation, the CMFs and CMFunctions are estimated for various single treatments and combinations of treatments for different roadway types (e.g., rural two-lane roadways, rural multi-lane roadways, urban arterials, freeways, etc.) 
as follows: 1) treatments for the roadway mainline: adding a through lane, and converting 4-lane undivided roadways to 3-lane roadways with a two-way left turn lane (TWLTL); 2) treatments for the roadway shoulder: installing shoulder rumble strips, widening the shoulder, adding bike lanes, changing bike lane width, and installing roadside barriers; 3) treatments related to roadside features: decreasing the density of driveways, decreasing the density of roadside poles, increasing the distance to roadside poles, and increasing the distance to trees. The expected contributions of this study are to 1) suggest approaches to estimate more reliable safety effects of multiple treatments, 2) propose methodologies to develop CMFunctions to assess the variation of CMFs with different characteristics among treated sites, and 3) recommend applications of GNM and MARS to simultaneously consider the interaction impact of more than one variable and the nonlinearity of predictors. Finally, potential relevant applications that are beyond the scope of this research but worth investigating in the future are discussed in this dissertation.
- Date Issued
- 2015
- Identifier
- CFE0005861, ucf:50914
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0005861
- Title
- Dynamic Hotspot Identification for Limited Access Facilities using Temporal Traffic Data.
- Creator
-
Al Amili, Samer, Abdel-Aty, Mohamed, Radwan, Essam, Eluru, Naveen, Lee, JaeYoung, Wang, Chung-Ching, University of Central Florida
- Abstract / Description
-
Crash frequency analysis is the most critical tool to investigate traffic safety problems. Therefore, an accurate crash analysis must be conducted. Since traffic continually fluctuates over time and this affects the potential for crash occurrence, shorter time periods and less aggregated traffic factors (shorter intervals than AADT) need to be used. In this dissertation, several methodologies were applied to improve the accuracy of crash prediction. The performance of using less aggregated traffic data in modeling crash frequency was explored for weekdays and weekends. Four time periods for weekdays and two time periods for weekends were considered, with four intervals (5, 15, 30, and 60 minutes). The comparison between AADT-based models and short-term period models showed that short-term period models perform better. When traffic intervals shorter than AADT were considered, two difficulties arose. First, the number of zero observations increased. Second, the same roadway characteristics were repeated across observations. To reduce the number of zero observations, only segments with one or more crashes were used in the modeling process. To eliminate the effect of the repetition in the data, a random effect was applied. The results recommend using only segments with one or more crashes, as they give a more valid prediction and less error. Zero-inflated negative binomial (ZINB) and hurdle negative binomial (HNB) models were examined in addition to the negative binomial model for both weekdays and weekends. Different implementations of random effects were applied: a random effect on the count part, a random effect on the zero part, or a pair of uncorrelated (or correlated) random effects for both parts of the model. Additionally, adaptive Gaussian quadrature, with five quadrature points, was used to increase accuracy. The results reveal that the model with random effects in both parts performed better than the other models, and that ZINB performed better than HNB.
- Date Issued
- 2018
- Identifier
- CFE0006966, ucf:51682
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006966