- Title
- Detecting Anomalies from Big Data System Logs.
- Creator
-
Lu, Siyang, Wang, Liqiang, Zhang, Shaojie, Zhang, Wei, Wu, Dazhong, University of Central Florida
- Abstract / Description
-
Big data systems (e.g., Hadoop and Spark) are now widely adopted in many domains, such as manufacturing, healthcare, education, and media, because they offer effective data solutions. A common problem in big data systems is the anomaly, i.e., a state that deviates from normal execution and either degrades computing performance or kills running programs. Detecting anomalies and analyzing their causes is becoming a necessity. An effective and economical approach is to analyze system logs. Big data systems produce numerous unstructured logs that contain valuable buried information. However, manually detecting anomalies from system logs is a tedious and daunting task.

This dissertation proposes four approaches that can accurately and automatically analyze anomalies from big data system logs without extra monitoring overhead. Moreover, to detect abnormal tasks in Spark logs and analyze root causes, we design a utility to conduct fault injection and collect logs from multiple compute nodes. (1) Our first method is a statistical approach that can locate abnormal tasks and calculate factor weights for analyzing the root causes. In the experiment, four potential root causes are considered: CPU, memory, network, and disk I/O. The experimental results show that the proposed approach is accurate in detecting abnormal tasks as well as finding the root causes. (2) To give a more reasonable probability result and avoid ad hoc calculation of factor weights, we propose a neural network approach that uses a General Regression Neural Network (GRNN) to identify root causes of abnormal tasks. The likelihood of each reported root cause is presented to users according to the factors weighted by the GRNN. (3) To further improve anomaly detection by avoiding feature extraction, we propose a novel approach that leverages Convolutional Neural Networks (CNN). Our proposed model can automatically learn event relationships in system logs and detect anomalies with high accuracy. Our deep neural network consists of logkey2vec embeddings, three 1D convolutional layers, a dropout layer, and max pooling. According to our experiment, our CNN-based approach achieves better accuracy than approaches using Long Short-Term Memory (LSTM) and Multilayer Perceptron (MLP) in detecting anomalies in Hadoop Distributed File System (HDFS) logs. (4) To analyze system logs more accurately, we extend our CNN-based approach with two attention schemes. The two proposed attention schemes focus on different features of the CNN's output. We evaluate our approaches with several benchmarks, and the attention-based CNN model shows the best performance among all state-of-the-art methods.
- Date Issued
- 2019
- Identifier
- CFE0007673, ucf:52499
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007673
- Title
- Color-Ratio Based Strawberry Plant Localization and Nutrition Deficiency Detection.
- Creator
-
Kong, Xiangling, Xu, Yunjun, Elgohary, Tarek, Fu, Qiushi, Wu, Dazhong, Wang, Liqiang, University of Central Florida
- Abstract / Description
-
In recent years, precision agriculture has become popular, as it is expected to partially meet the needs of an ever-growing population with limited resources. Plant localization and nutrient deficiency detection are two important tasks in precision agriculture. In this dissertation, these two tasks are studied using a new color-ratio (C-R) index technique. First, a low-cost and light-scene-invariant approach is proposed to detect green and yellow leaves based on the C-R indices. A plant localization approach is then developed using the relative pixel relationships of adjacent plants. Second, the Sobel operator and morphology techniques are applied to segment the target strawberry leaf from a field image. The characteristic color of a specific nutrient deficiency is detected by the C-R indices. The pattern of the detected color on the leaf is then examined to determine the specific nutrient deficiency. The proposed approaches are validated on a commercial strawberry farm.
- Date Issued
- 2019
- Identifier
- CFE0007666, ucf:52482
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007666