Current Search: anomaly (x)
-
-
Title
-
NETWORK INTRUSION DETECTION: MONITORING, SIMULATION ANDVISUALIZATION.
-
Creator
-
Zhou, Mian, Lang, Sheau-Dong, University of Central Florida
-
Abstract / Description
-
This dissertation presents our work on network intrusion detection and intrusion sim- ulation. The work in intrusion detection consists of two different network anomaly-based approaches. The work in intrusion simulation introduces a model using explicit traffic gen- eration for the packet level traffic simulation. The process of anomaly detection is to first build profiles for the normal network activity and then mark any events or activities that deviate from the normal profiles as...
Show moreThis dissertation presents our work on network intrusion detection and intrusion sim- ulation. The work in intrusion detection consists of two different network anomaly-based approaches. The work in intrusion simulation introduces a model using explicit traffic gen- eration for the packet level traffic simulation. The process of anomaly detection is to first build profiles for the normal network activity and then mark any events or activities that deviate from the normal profiles as suspicious. Based on the different schemes of creating the normal activity profiles, we introduce two approaches for intrusion detection. The first one is a frequency-based approach which creates a normal frequency profile based on the periodical patterns existed in the time-series formed by the traffic. It aims at those attacks that are conducted by running pre-written scripts, which automate the process of attempting connections to various ports or sending packets with fabricated payloads, etc. The second approach builds the normal profile based on variations of connection-based behavior of each single computer. The deviations resulted from each individual computer are carried out by a weight assignment scheme and further used to build a weighted link graph representing the overall traffic abnormalities. The functionality of this system is of a distributed personal IDS system that also provides a centralized traffic analysis by graphical visualization. It provides a finer control over the internal network by focusing on connection-based behavior of each single computer. For network intrusion simulation, we explore an alternative method for network traffic simulation using explicit traffic generation. In particular, we build a model to replay the standard DARPA traffic data or the traffic data captured from a real environment. The replayed traffic data is mixed with the attacks, such as DOS and Probe attack, which can create apparent abnormal traffic flow patterns. With the explicit traffic generation, every packet that has ever been sent by the victim and attacker is formed in the simulation model and travels around strictly following the criteria of time and path that extracted from the real scenario. Thus, the model provides a promising aid in the study of intrusion detection techniques.
Show less
-
Date Issued
-
2005
-
Identifier
-
CFE0000679, ucf:46484
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0000679
-
-
Title
-
SESSION-BASED INTRUSION DETECTION SYSTEM TO MAP ANOMALOUS NETWORK TRAFFIC.
-
Creator
-
Caulkins, Bruce, Wang, Morgan, University of Central Florida
-
Abstract / Description
-
Computer crime is a large problem (CSI, 2004; Kabay, 2001a; Kabay, 2001b). Security managers have a variety of tools at their disposal firewalls, Intrusion Detection Systems (IDSs), encryption, authentication, and other hardware and software solutions to combat computer crime. Many IDS variants exist which allow security managers and engineers to identify attack network packets primarily through the use of signature detection; i.e., the IDS recognizes attack packets due to their well...
Show moreComputer crime is a large problem (CSI, 2004; Kabay, 2001a; Kabay, 2001b). Security managers have a variety of tools at their disposal firewalls, Intrusion Detection Systems (IDSs), encryption, authentication, and other hardware and software solutions to combat computer crime. Many IDS variants exist which allow security managers and engineers to identify attack network packets primarily through the use of signature detection; i.e., the IDS recognizes attack packets due to their well-known "fingerprints" or signatures as those packets cross the network's gateway threshold. On the other hand, anomaly-based ID systems determine what is normal traffic within a network and reports abnormal traffic behavior. This paper will describe a methodology towards developing a more-robust Intrusion Detection System through the use of data-mining techniques and anomaly detection. These data-mining techniques will dynamically model what a normal network should look like and reduce the false positive and false negative alarm rates in the process. We will use classification-tree techniques to accurately predict probable attack sessions. Overall, our goal is to model network traffic into network sessions and identify those network sessions that have a high-probability of being an attack and can be labeled as a "suspect session." Subsequently, we will use these techniques inclusive of signature detection methods, as they will be used in concert with known signatures and patterns in order to present a better model for detection and protection of networks and systems.
Show less
-
Date Issued
-
2005
-
Identifier
-
CFE0000906, ucf:46762
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0000906
-
-
Title
-
detecting anomalies from big data system logs.
-
Creator
-
Lu, Siyang, Wang, Liqiang, Zhang, Shaojie, Zhang, Wei, Wu, Dazhong, University of Central Florida
-
Abstract / Description
-
Nowadays, big data systems (e.g., Hadoop and Spark) are being widely adopted by many domains for offering effective data solutions, such as manufacturing, healthcare, education, and media. A common problem about big data systems is called anomaly, e.g., a status deviated from normal execution, which decreases the performance of computation or kills running programs. It is becoming a necessity to detect anomalies and analyze their causes. An effective and economical approach is to analyze...
Show moreNowadays, big data systems (e.g., Hadoop and Spark) are being widely adopted by many domains for offering effective data solutions, such as manufacturing, healthcare, education, and media. A common problem about big data systems is called anomaly, e.g., a status deviated from normal execution, which decreases the performance of computation or kills running programs. It is becoming a necessity to detect anomalies and analyze their causes. An effective and economical approach is to analyze system logs. Big data systems produce numerous unstructured logs that contain buried valuable information. However manually detecting anomalies from system logs is a tedious and daunting task.This dissertation proposes four approaches that can accurately and automatically analyze anomalies from big data system logs without extra monitoring overhead. Moreover, to detect abnormal tasks in Spark logs and analyze root causes, we design a utility to conduct fault injection and collect logs from multiple compute nodes. (1) Our first method is a statistical-based approach that can locate those abnormal tasks and calculate the weights of factors for analyzing the root causes. In the experiment, four potential root causes are considered, i.e., CPU, memory, network, and disk I/O. The experimental results show that the proposed approach is accurate in detecting abnormal tasks as well as finding the root causes. (2) To give a more reasonable probability result and avoid ad-hoc factor weights calculating, we propose a neural network approach to analyze root causes of abnormal tasks. We leverage General Regression Neural Network (GRNN) to identify root causes for abnormal tasks. The likelihood of reported root causes is presented to users according to the weighted factors by GRNN. (3) To further improve anomaly detection by avoiding feature extraction, we propose a novel approach by leveraging Convolutional Neural Networks (CNN). Our proposed model can automatically learn event relationships in system logs and detect anomaly with high accuracy. Our deep neural network consists of logkey2vec embeddings, three 1D convolutional layers, a dropout layer, and max pooling. According to our experiment, our CNN-based approach has better accuracy compared to other approaches using Long Short-Term Memory (LSTM) and Multilayer Perceptron (MLP) on detecting anomaly in Hadoop DistributedFile System (HDFS) logs. (4) To analyze system logs more accurately, we extend our CNN-based approach with two attention schemes to detect anomalies in system logs. The proposed two attention schemes focus on different features from CNN's output. We evaluate our approaches with several benchmarks, and the attention-based CNN model shows the best performance among all state-of-the-art methods.
Show less
-
Date Issued
-
2019
-
Identifier
-
CFE0007673, ucf:52499
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0007673
-
-
Title
-
THEORETICAL TAILORING OF PERFORATED THIN SILVER FILMS FOR AFFINITY SURFACE PLASMON RESONANCE BIOSENSOR APPLICATIONS.
-
Creator
-
Gongora Jr., Renan, Zou, Shengli, University of Central Florida
-
Abstract / Description
-
Metallic films, in conjunction with biochemical-targeted probes, are expected to provide early diagnosis, targeted therapy and non-invasive monitoring for epidemiology applications. The resonance wavelength peaks, both plasmonic and Wood-Rayleigh Anomalies (WRAs), in the scattering spectra are affected by the metallic architecture. As of today, much research has been devoted to extinction efficiency in the plasmonic region. However, Wood Rayleigh Anomalies (WRAs) typically occur at...
Show moreMetallic films, in conjunction with biochemical-targeted probes, are expected to provide early diagnosis, targeted therapy and non-invasive monitoring for epidemiology applications. The resonance wavelength peaks, both plasmonic and Wood-Rayleigh Anomalies (WRAs), in the scattering spectra are affected by the metallic architecture. As of today, much research has been devoted to extinction efficiency in the plasmonic region. However, Wood Rayleigh Anomalies (WRAs) typically occur at wavelengths associated with the periodic distance of the structures. A significant number of papers have already focused on the plasmonic region of the visible spectrum, but a less explored area of research was presented here; the desired resonance wavelength region was 400-500nm, corresponding to the WRA for the silver film with perforated hole with a periodic distance of 400nm. Simulations obtained from the discrete dipole approximation (DDA) method, show sharp spectral bands (either high or low scattering efficiencies) in both wavelength regions of the visible spectrum simulated from Ag film with cylindrical hole arrays In addition, surprising results were obtained in the parallel scattering spectra,where the electric field is contained in the XY plane, when the angle between the metallic surface and the incident light was adjusted to 14 degrees; a bathochromic shift was observed for the WRA peak suggesting a hybrid resonance mode. Metallic films have the potential to be used in instrumental techniques for use as sensors, i.e. surface plasmon resonance affinity biosensors, but are not limited to such instrumental techniques. Although the research here was aimed towards affinity biosensors, other sensory designs can benefit from the optimized Ag film motifs. The intent of the study was to elucidate metal film motifs, when incorporated into instrumental analysis, allowing the quantification of genetic material in the visible region. Any research group that routinely benefits from quantification of various analytes in solution matrices will also benefit from this study, as there are a bewildering number of instrumental sensory methods and setups available.
Show less
-
Date Issued
-
2014
-
Identifier
-
CFH0004538, ucf:45155
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFH0004538
-
-
Title
-
SCALABLE AND EFFICIENT OUTLIER DETECTION IN LARGE DISTRIBUTED DATA SETS WITH MIXED-TYPE ATTRIBUTES.
-
Creator
-
Koufakou, Anna, Georgiopoulos, Michael, University of Central Florida
-
Abstract / Description
-
An important problem that appears often when analyzing data involves identifying irregular or abnormal data points called outliers. This problem broadly arises under two scenarios: when outliers are to be removed from the data before analysis, and when useful information or knowledge can be extracted by the outliers themselves. Outlier Detection in the context of the second scenario is a research field that has attracted significant attention in a broad range of useful applications. For...
Show moreAn important problem that appears often when analyzing data involves identifying irregular or abnormal data points called outliers. This problem broadly arises under two scenarios: when outliers are to be removed from the data before analysis, and when useful information or knowledge can be extracted by the outliers themselves. Outlier Detection in the context of the second scenario is a research field that has attracted significant attention in a broad range of useful applications. For example, in credit card transaction data, outliers might indicate potential fraud; in network traffic data, outliers might represent potential intrusion attempts. The basis of deciding if a data point is an outlier is often some measure or notion of dissimilarity between the data point under consideration and the rest. Traditional outlier detection methods assume numerical or ordinal data, and compute pair-wise distances between data points. However, the notion of distance or similarity for categorical data is more difficult to define. Moreover, the size of currently available data sets dictates the need for fast and scalable outlier detection methods, thus precluding distance computations. Additionally, these methods must be applicable to data which might be distributed among different locations. In this work, we propose novel strategies to efficiently deal with large distributed data containing mixed-type attributes. Specifically, we first propose a fast and scalable algorithm for categorical data (AVF), and its parallel version based on MapReduce (MR-AVF). We extend AVF and introduce a fast outlier detection algorithm for large distributed data with mixed-type attributes (ODMAD). Finally, we modify ODMAD in order to deal with very high-dimensional categorical data. Experiments with large real-world and synthetic data show that the proposed methods exhibit large performance gains and high scalability compared to the state-of-the-art, while achieving similar accuracy detection rates.
Show less
-
Date Issued
-
2009
-
Identifier
-
CFE0002734, ucf:48161
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0002734
-
-
Title
-
MODELING SCENES AND HUMAN ACTIVITIES IN VIDEOS.
-
Creator
-
Basharat, Arslan, Shah, Mubarak, University of Central Florida
-
Abstract / Description
-
In this dissertation, we address the problem of understanding human activities in videos by developing a two-pronged approach: coarse level modeling of scene activities and fine level modeling of individual activities. At the coarse level, where the resolution of the video is low, we rely on person tracks. At the fine level, richer features are available to identify different parts of the human body, therefore we rely on the body joint tracks. There are three main goals of this dissertation: ...
Show moreIn this dissertation, we address the problem of understanding human activities in videos by developing a two-pronged approach: coarse level modeling of scene activities and fine level modeling of individual activities. At the coarse level, where the resolution of the video is low, we rely on person tracks. At the fine level, richer features are available to identify different parts of the human body, therefore we rely on the body joint tracks. There are three main goals of this dissertation: (1) identify unusual activities at the coarse level, (2) recognize different activities at the fine level, and (3) predict the behavior for synthesizing and tracking activities at the fine level. The first goal is addressed by modeling activities at the coarse level through two novel and complementing approaches. The first approach learns the behavior of individuals by capturing the patterns of motion and size of objects in a compact model. Probability density function (pdf) at each pixel is modeled as a multivariate Gaussian Mixture Model (GMM), which is learnt using unsupervised expectation maximization (EM). In contrast, the second approach learns the interaction of object pairs concurrently present in the scene. This can be useful in detecting more complex activities than those modeled by the first approach. We use a 14-dimensional Kernel Density Estimation (KDE) that captures motion and size of concurrently tracked objects. The proposed models have been successfully used to automatically detect activities like unusual person drop-off and pickup, jaywalking, etc. The second and third goals of modeling human activities at the fine level are addressed by employing concepts from theory of chaos and non-linear dynamical systems. We show that the proposed model is useful for recognition and prediction of the underlying dynamics of human activities. We treat the trajectories of human body joints as the observed time series generated from an underlying dynamical system. The observed data is used to reconstruct a phase (or state) space of appropriate dimension by employing the delay-embedding technique. This transformation is performed without assuming an exact model of the underlying dynamics and provides a characteristic representation that will prove to be vital for recognition and prediction tasks. For recognition, properties of phase space are captured in terms of dynamical and metric invariants, which include the Lyapunov exponent, correlation integral, and correlation dimension. A composite feature vector containing these invariants represents the action and will be used for classification. For prediction, kernel regression is used in the phase space to compute predictions with a specified initial condition. This approach has the advantage of modeling dynamics without making any assumptions about the exact form (polynomial, radial basis, etc.) of the mapping function. We demonstrate the utility of these predictions for human activity synthesis and tracking.
Show less
-
Date Issued
-
2009
-
Identifier
-
CFE0002897, ucf:48042
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0002897
-
-
Title
-
PATTERNS OF MOTION: DISCOVERY AND GENERALIZED REPRESENTATION.
-
Creator
-
Saleemi, Imran, Shah, Mubarak, University of Central Florida
-
Abstract / Description
-
In this dissertation, we address the problem of discovery and representation of motion patterns in a variety of scenarios, commonly encountered in vision applications. The overarching goal is to devise a generic representation, that captures any kind of object motion observable in video sequences. Such motion is a significant source of information typically employed for diverse applications such as tracking, anomaly detection, and action and event recognition. We present statistical...
Show moreIn this dissertation, we address the problem of discovery and representation of motion patterns in a variety of scenarios, commonly encountered in vision applications. The overarching goal is to devise a generic representation, that captures any kind of object motion observable in video sequences. Such motion is a significant source of information typically employed for diverse applications such as tracking, anomaly detection, and action and event recognition. We present statistical frameworks for representation of motion characteristics of objects, learned from tracks or optical flow, for static as well as moving cameras, and propose algorithms for their application to a variety of problems. The proposed motion pattern models and learning methods are general enough to be employed in a variety of problems as we demonstrate experimentally. We first propose a novel method to model and learn the scene activity, observed by a static camera. The motion patterns of objects in the scene are modeled in the form of a multivariate non-parametric probability density function of spatiotemporal variables (object locations and transition times between them). Kernel Density Estimation (KDE) is used to learn this model in a completely unsupervised fashion. Learning is accomplished by observing the trajectories of objects by a static camera over extended periods of time. The model encodes the probabilistic nature of the behavior of moving objects in the scene and is useful for activity analysis applications, such as persistent tracking and anomalous motion detection. In addition, the model also captures salient scene features, such as, the areas of occlusion and most likely paths. Once the model is learned, we use a unified Markov Chain Monte-Carlo (MCMC) based framework for generating the most likely paths in the scene, improving foreground detection, persistent labelling of objects during tracking and deciding whether a given trajectory represents an anomaly to the observed motion patterns. Experiments with real world videos are reported which validate the proposed approach. The representation and estimation framework proposed above, however, has a few limitations. This algorithm proposes to use a single global statistical distribution to represent all kinds of motion observed in a particular scene. It therefore, does not find a separation between multiple semantically distinct motion patterns in the scene. Instead, the learned model is a joint distribution over all possible patterns followed by objects. To overcome this limitation, we then propose a superior method for the discovery and statistical representation of motion patterns in a scene. The advantages of this approach over the first one are two-fold: first, this model is applicable to scenes of dense crowded motion where tracking may not be feasible, and second, it distinguishes between motion patterns that are distinct at a semantic level of abstraction. We propose a mixture model representation of salient patterns of optical flow, and present an algorithm for learning these patterns from dense optical flow in a hierarchical, unsupervised fashion. Using low level cues of noisy optical flow, K-means is employed to initialize a Gaussian mixture model for temporally segmented clips of video. The components of this mixture are then filtered and instances of motion patterns are computed using a simple motion model, by linking components across space and time. Motion patterns are then initialized and membership of instances in different motion patterns is established by using KL divergence between mixture distributions of pattern instances. Finally, a pixel level representation of motion patterns is proposed by deriving conditional expectation of optical flow. Results of extensive experiments are presented for multiple surveillance sequences containing numerous patterns involving both pedestrian and vehicular traffic. The proposed method exploits optical flow as the low level feature and performs a hierarchical clustering to obtain motion patterns; and we observe that the use of optical flow is also an integral part of a variety of other vision applications, for example, as features based representation of human actions. We, therefore, propose a new representation for articulated human actions using the motion patterns. The representation is based on hierarchical clustering of observed optical flow in four dimensional, spatial and motion flow space. The automatically discovered motion patterns, are the primitive actions, representative of flow at salient regions on the human body, much like trajectories of body joints, which are notoriously difficult to obtain automatically. The proposed method works in a completely unsupervised fashion, and in sharp contrast to state of the art representations like bag of video words, provides a truly semantically meaningful representation. Each primitive action depicts the most atomic sub-action, like left arm moving upwards, or right leg moving downward and leftward, and is represented by a mixture of four dimensional Gaussian distributions. A sequence of primitive actions are discovered in the test video, and labelled by computing the KL divergence between mixtures. The entire video sequence containing the human action, is thus reduced to a simple string, which is matched against similar strings of training videos to classify the action. The string matching is performed by global alignment, using the well-known Needleman-Wunsch algorithm. Experiments reported on multiple human actions data sets, confirm the validity, simplicity, and semantically meaningful nature of the proposed representation. Results obtained are encouraging and comparable to the state of the art.
Show less
-
Date Issued
-
2011
-
Identifier
-
CFE0003646, ucf:48836
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0003646