Current Search: Principal Component Analysis
- Title
- Non-Destructive Analysis of Trace Textile Fiber Evidence via Room-Temperature Fluorescence Spectroscopy.
- Creator
-
Appalaneni, Krishnaveni, Campiglia, Andres, Belfield, Kevin, Sigman, Michael, Yestrebsky, Cherie, Schulte, Alfons, University of Central Florida
- Abstract / Description
-
Forensic fiber evidence plays an important role in many criminal investigations. Non-destructive techniques that can either discriminate between similar fibers or match a known to a questioned fiber, and still preserve the physical integrity of the fibers for further court examination, are highly valuable in forensic science. When fibers cannot be discriminated by non-destructive tests, the next reasonable step is to extract the questioned and known fibers for dye analysis with a more selective technique such as high-performance liquid chromatography (HPLC) and/or gas chromatography-mass spectrometry (GC-MS). Chromatographic techniques, however, focus primarily on the dyes used to color the fibers and do not investigate other potentially discriminating components present on the fiber. Differentiating among commercial dyes with very similar chromatographic behaviors and almost identical absorption spectra and/or fragmentation patterns is a challenging task. This dissertation explores a different aspect of fiber analysis as it focuses on the total fluorescence emission of fibers. In addition to the contribution of the textile dye (or dyes) to the fluorescence spectrum of the fiber, we investigate the contribution of intrinsic fluorescence impurities (i.e., impurities embedded into the fibers during fabrication of garments) as a reproducible source of fiber comparison. Differentiation of visually indistinguishable fibers is achieved by comparing excitation-emission matrices (EEMs) recorded from single textile fibers with the aid of a commercial spectrofluorimeter coupled to an epi-fluorescence microscope. Statistical data comparison was carried out via principal component analysis.
An application of this statistical approach is demonstrated using challenging dyes with similarities both in two-dimensional absorbance spectra and in three-dimensional EEM data. High accuracy of fiber identification was observed in all cases, and no false positive identifications were observed at the 99% confidence level.
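The EEM-plus-PCA workflow described above can be sketched on synthetic data. Everything below (the Gaussian-bump "EEMs", peak positions, sample counts, and noise level) is invented for illustration; only the overall flow reflects the abstract: flatten each single-fiber EEM into a row vector, mean-center, and project onto principal components to compare visually similar fibers.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_eem(center, n=6):
    """Simulate n noisy 10x10 EEMs as a Gaussian bump at a given
    (excitation, emission) peak position, flattened to row vectors."""
    ex, em = np.meshgrid(np.arange(10), np.arange(10), indexing="ij")
    base = np.exp(-((ex - center[0])**2 + (em - center[1])**2) / 8.0)
    return np.array([(base + 0.05 * rng.standard_normal(base.shape)).ravel()
                     for _ in range(n)])

# Two visually similar "fiber types" with slightly shifted emission maxima
X = np.vstack([simulate_eem((4, 4)), simulate_eem((4, 6))])

# PCA via SVD of the mean-centered data matrix
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T            # projections onto the first two PCs
explained = s**2 / np.sum(s**2)   # fraction of variance per component

# The two groups separate along PC1 even though the raw spectra overlap
print(scores[:6, 0].mean(), scores[6:, 0].mean())
```

The group means of the PC1 scores fall on opposite sides of zero, which is the kind of separation a subsequent statistical comparison would exploit.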
- Date Issued
- 2013
- Identifier
- CFE0004808, ucf:49740
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004808
- Title
- Selective Multivariate Applications in Forensic Science.
- Creator
-
Rinke, Caitlin, Sigman, Michael, Campiglia, Andres, Yestrebsky, Cherie, Kuebler, Stephen, Richardson, Martin, University of Central Florida
- Abstract / Description
-
A 2009 report published by the National Research Council addressed the need for improvements in the field of forensic science. The report emphasized the need for more rigorous scientific analysis within many forensic science disciplines and for established limitations and error rates determined from statistical analysis. This research focused on multivariate statistical techniques for the analysis of spectral data obtained for multiple forensic applications, with samples including automobile float glasses and paints, bones, metal transfers, ignitable liquids and fire debris, and organic compounds including explosives. The statistical techniques were used for two types of data analysis: classification and discrimination. Statistical methods including linear discriminant analysis and a novel soft classification method were used to classify forensic samples based on a compiled library. The novel soft classification method combined three statistical steps: Principal Component Analysis (PCA), Target Factor Analysis (TFA), and Bayesian Decision Theory (BDT) to provide classification based on posterior probabilities of class membership. The posterior probabilities provide a statistical probability of classification which can aid a forensic analyst in reaching a conclusion. The second analytical approach applied nonparametric methods to provide the means for discrimination between samples. Nonparametric methods are performed as hypothesis tests and do not assume a normal distribution of the analytical figures of merit. The nonparametric permutation test was applied to forensic applications to determine the similarity between two samples and provide discrimination rates. Both the classification method and the discrimination method were applied to data acquired from multiple instrumental methods.
The instrumental methods included Laser-Induced Breakdown Spectroscopy (LIBS), Fourier Transform Infrared Spectroscopy (FTIR), Raman spectroscopy, and Gas Chromatography-Mass Spectrometry (GC-MS). Some of these instrumental methods are currently applied to forensic applications, such as GC-MS for the analysis of ignitable liquid and fire debris samples, while others bring new instrumental methods to areas within forensic science that currently lack instrumental analysis techniques, such as LIBS for the analysis of metal transfers. The combination of the instrumental techniques and multivariate statistical techniques is investigated in new approaches to forensic applications in this research to assist in improving the field of forensic science.
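The nonparametric permutation test named above can be sketched in a few lines. The data, group sizes, and number of permutations below are invented for the example, and the test statistic (absolute difference of means) is just one common choice; the key point is that the null distribution is built by shuffling rather than by assuming normality.

```python
import numpy as np

rng = np.random.default_rng(1)

def permutation_pvalue(a, b, n_perm=2000):
    """Two-sample permutation test on the difference of means.

    The null distribution is formed by repeatedly reassigning the pooled
    observations to the two groups at random, so no distributional
    assumption is made about the measurements themselves."""
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(pooled[:len(a)].mean() - pooled[len(a):].mean())
        if diff >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)  # add-one correction avoids p = 0

# Indistinguishable samples vs. clearly shifted samples
same = permutation_pvalue(rng.normal(0, 1, 30), rng.normal(0, 1, 30))
diff = permutation_pvalue(rng.normal(0, 1, 30), rng.normal(1.5, 1, 30))
print(same, diff)
```

Comparing the two p-values shows the intended behavior: the shifted pair yields a p-value small enough to discriminate the samples, while the matched pair does not.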
- Date Issued
- 2012
- Identifier
- CFE0004628, ucf:49942
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004628
- Title
- CHARACTERIZATION OF NOVEL ANTIMALARIALS FROM COMPOUNDS INSPIRED BY NATURAL PRODUCTS USING PRINCIPAL COMPONENT ANALYSIS (PCA).
- Creator
-
Balde, Zarina Marie G, Chakrabarti, Debopam, University of Central Florida
- Abstract / Description
-
Malaria is caused by a protozoan parasite, Plasmodium falciparum, which is responsible for over 500,000 deaths per year worldwide. Although malaria medicines are working well in many parts of the world, antimalarial drug resistance has emerged as one of the greatest challenges facing malaria control today. Because malaria parasites are once again developing widespread resistance to antimalarial drugs, malaria may spread to new areas and re-emerge in areas where it had already been eradicated. Therefore, the discovery and characterization of novel antimalarials is extremely urgent. A previous drug screen in Dr. Chakrabarti's lab identified several natural products (NPs) with antiplasmodial activities. The focus of this study is to characterize the hit compounds using Principal Component Analysis (PCA) to determine their structural uniqueness compared to known antimalarial drugs. This study will compare multiple libraries of different compounds, such as known drugs, kinase inhibitors, macrocycles, and the top antimalarial hits discovered in our lab. Prioritizing the hit compounds by their chemical uniqueness will lessen the probability of future drug resistance. This is an important step in drug discovery, as PCA increases the interpretability of the datasets by creating new uncorrelated variables that successively maximize variance. Characterization of the natural-product-inspired compounds will enable us to discover potent, selective, and novel antiplasmodial scaffolds that are unique in three-dimensional chemical space and will provide critical information to serve as advanced starting points for the antimalarial drug discovery pipeline.
- Date Issued
- 2018
- Identifier
- CFH2000405, ucf:45893
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFH2000405
- Title
- ASSESSING CRASH OCCURRENCE ON URBAN FREEWAYS USING STATIC AND DYNAMIC FACTORS BY APPLYING A SYSTEM OF INTERRELATED EQUATIONS.
- Creator
-
Pemmanaboina, Rajashekar, Abdel-Aty, Mohamed, University of Central Florida
- Abstract / Description
-
Traffic crashes have been identified as one of the main causes of death in the US, making road safety a high-priority issue that needs urgent attention. Recognizing that more, and more effective, research has to be done in this area, this thesis aims mainly at developing different statistical models related to road safety. The thesis includes three main sections: 1) overall crash frequency analysis using negative binomial models, 2) seemingly unrelated negative binomial (SUNB) models for different categories of crashes, divided based on the type of crash or the condition in which they occur, and 3) safety models to determine the probability of crash occurrence, including a rainfall index estimated using a logistic regression model. The study corridor is a 36.25-mile stretch of Interstate 4 in Central Florida. For the first two sections, crash cases from 1999 through 2002 were considered. Conventionally, most crash frequency analyses model all crashes together, instead of dividing them by type of crash, peaking conditions, availability of light, severity, or pavement condition; researchers have also traditionally used AADT to represent traffic volumes in their models. Both are examples of macroscopic crash frequency modeling. To investigate microscopic models, and to identify the significant factors related to crash occurrence, a preliminary study (the first analysis) explored the use of microscopic traffic volumes related to crash occurrence by comparing AADT/VMT with five- to twenty-minute volumes immediately preceding the crash. The volumes just before the time of crash occurrence proved to be a better predictor of crash frequency than AADT. The results also showed that road curvature, median type, number of lanes, pavement surface type, and the presence of on/off-ramps are among the significant factors that contribute to crash occurrence.
In the second analysis, various possible crash categories were prepared to precisely identify the factors related to them, using various roadway, geometric, and microscopic traffic variables. Five categories were prepared, each based on a common platform such as the type of crash: 1) multiple- and single-vehicle crashes, 2) peak and off-peak crashes, 3) dry- and wet-pavement crashes, 4) daytime and dark-hour crashes, and 5) Property Damage Only (PDO) and injury crashes. Each of the models in each category was first estimated separately. To account for the correlation between the disturbance terms arising from omitted variables between any two models in a category, seemingly unrelated negative binomial (SUNB) regression was used, and the models in each category were then estimated simultaneously. SUNB estimation proved advantageous for two categories: Category 1 and Category 4. Road curvature and the presence of on-ramps/off-ramps were found to be important factors related to every crash category. AADT was also significant in all the models except the single-vehicle crash model. Median type and pavement surface type were among the other important factors causing crashes. The group of factors found in the model considering all crashes is a superset of the factors found in the individual crash categories. The third analysis dealt with the development of a logistic regression model to estimate the weather condition at a given time and location on I-4 in Central Florida, so that this information can be used in traffic safety analyses given the lack of weather-monitoring stations in the study area. To demonstrate the worth of the weather information obtained from the analysis, the same weather information was used in a safety model developed by Abdel-Aty et al., 2004. The inclusion of weather information was shown to improve the safety model, giving better prediction accuracy.
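A logistic regression of the kind used above for the rainfall index can be fit with plain gradient ascent on the log-likelihood. The single synthetic predictor, the generating coefficients, and the learning-rate settings below are all stand-ins for illustration, not the thesis's actual variables or data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: one continuous predictor of a binary rain/no-rain outcome
n = 500
x = rng.normal(0, 1, n)                     # hypothetical rainfall predictor
true_w, true_b = 2.0, -0.5                  # generating coefficients (invented)
p_true = 1.0 / (1.0 + np.exp(-(true_w * x + true_b)))
y = (rng.random(n) < p_true).astype(float)  # 1 = rain observed, 0 = no rain

# Full-batch gradient ascent on the mean log-likelihood
w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    grad_w = np.mean((y - p) * x)           # d/dw of the mean log-likelihood
    grad_b = np.mean(y - p)                 # d/db of the mean log-likelihood
    w += lr * grad_w
    b += lr * grad_b

print(w, b)  # estimates should land near the generating values
```

The fitted model then gives a probability of rain at any predictor value via the same sigmoid, which is how such an index can feed a downstream safety model.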
- Date Issued
- 2005
- Identifier
- CFE0000587, ucf:46468
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0000587
- Title
- Analysis of Remote Tripping Command Injection Attacks in Industrial Control Systems Through Statistical and Machine Learning Methods.
- Creator
-
Timm, Charles, Caulkins, Bruce, Wiegand, Rudolf, Lathrop, Scott, University of Central Florida
- Abstract / Description
-
In the past decade, cyber operations have been increasingly utilized to further the policy goals of state-sponsored actors and to shift the balance of politics and power on a global scale. One way this has been evidenced is through the exploitation of electric grids via cyber means. A remote tripping command injection attack is one type of attack that could have devastating effects on the North American power grid. To better understand these attacks and create detection axioms to both quickly identify and mitigate the effects of a remote tripping command injection attack, a dataset comprising 128 variables (primarily synchrophasor measurements) was analyzed via statistical methods and machine learning algorithms in RStudio and WEKA software, respectively. While statistical methods were not successful due to the non-linearity and complexity of the dataset, machine learning algorithms surpassed accuracy metrics established in previous research given a simplified dataset of the specified attack and normal operational data. This research allows future cybersecurity researchers to better understand remote tripping command injection attacks in comparison to normal operational conditions. Further, incorporating this analysis has the potential to improve detection and thus mitigate risk to the North American power grid in future work.
- Date Issued
- 2018
- Identifier
- CFE0007257, ucf:52193
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007257
- Title
- LASER INDUCED BREAKDOWN SPECTROSCOPY FOR DETECTION OF ORGANIC RESIDUES: IMPACT OF AMBIENT ATMOSPHERE AND LASER PARAMETERS.
- Creator
-
Brown, Christopher, Richardson, Martin, University of Central Florida
- Abstract / Description
-
Laser Induced Breakdown Spectroscopy (LIBS) is showing great potential as an atomic analytical technique. With its ability to rapidly analyze all forms of matter with little to no sample preparation, LIBS has many advantages over conventional atomic emission spectroscopy techniques. With the maturation of the technologies that make LIBS possible, there has been a growing movement to implement LIBS in portable analyzers for field applications. In particular, LIBS has long been considered the front-runner in the drive for stand-off detection of trace deposits of explosives. Thus there is a need for a better understanding of the processes responsible for the LIBS signature, and of their relationships to the different system parameters, to help improve LIBS as a sensing technology. This study explores the use of LIBS as a method to detect random trace amounts of specific organic materials deposited on organic or non-metallic surfaces, a requirement that forces the limitation of single-shot signal analysis. The study is both experimental and theoretical, with a sizeable component addressing data analysis using principal components analysis to reduce the dimensionality of the data and quadratic discriminant analysis to classify the data. In addition, the alternative approach of target factor analysis was employed to improve detection of organic residues on organic substrates. Finally, a new method of characterizing the laser-induced plasma of organics, which should lead to improved data collection and analysis, is introduced. The comparison between modeled and experimental measurements of plasma temperature and electron density is discussed in order to improve present models of low-temperature laser-induced plasmas.
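Quadratic discriminant analysis, the classifier named above (and in the fire debris record below), can be sketched in a few lines of numpy. The two-dimensional "PCA score" data here are synthetic and chosen so the class covariances differ, which is exactly the regime where QDA's per-class covariances matter; the sketch also assumes class labels are 0..k-1.

```python
import numpy as np

rng = np.random.default_rng(5)

def fit_qda(X, y):
    """Estimate per-class mean, covariance, and prior probability."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), np.cov(Xc.T), len(Xc) / len(X))
    return params

def predict_qda(params, X):
    """Assign each row to the class with the highest Gaussian log-posterior.
    Assumes classes are labeled 0..k-1 so sorted order matches indices."""
    scores = []
    for c, (mu, cov, prior) in sorted(params.items()):
        inv, (_, logdet) = np.linalg.inv(cov), np.linalg.slogdet(cov)
        d = X - mu
        ll = -0.5 * (np.einsum("ij,jk,ik->i", d, inv, d) + logdet) + np.log(prior)
        scores.append(ll)
    return np.argmax(np.array(scores), axis=0)

# Two synthetic classes with different covariance shapes
X0 = rng.multivariate_normal([0, 0], [[1.0, 0.0], [0.0, 0.1]], 200)
X1 = rng.multivariate_normal([2, 1], [[0.1, 0.0], [0.0, 1.0]], 200)
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

pred = predict_qda(fit_qda(X, y), X)
accuracy = (pred == y).mean()
print(accuracy)
```

Because each class gets its own covariance, the resulting decision boundary is quadratic rather than linear, which is what distinguishes QDA from LDA on data like this.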
- Date Issued
- 2011
- Identifier
- CFE0003708, ucf:48843
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0003708
- Title
- Weighted Low-Rank Approximation of Matrices: Some Analytical and Numerical Aspects.
- Creator
-
Dutta, Aritra, Li, Xin, Sun, Qiyu, Mohapatra, Ram, Nashed, M, Shah, Mubarak, University of Central Florida
- Abstract / Description
-
This dissertation addresses some analytical and numerical aspects of a problem of weighted low-rank approximation of matrices. We propose and solve two different versions of weighted low-rank approximation problems. We demonstrate, in addition, how these formulations can be efficiently used to solve some classic problems in computer vision. We also present the superior performance of our algorithms over the existing state-of-the-art unweighted and weighted low-rank approximation algorithms. Classical principal component analysis (PCA) is constrained to have equal weighting on the elements of the matrix, which might lead to a degraded design in some problems. To address this fundamental flaw in PCA, Golub, Hoffman, and Stewart proposed and solved a problem of constrained low-rank approximation of matrices: for a given matrix $A = (A_1\;A_2)$, find a low-rank matrix $X = (A_1\;X_2)$ such that ${\rm rank}(X)$ is less than $r$, a prescribed bound, and $\|A-X\|$ is small. Motivated by the above formulation, we propose a weighted low-rank approximation problem that generalizes the constrained low-rank approximation problem of Golub, Hoffman, and Stewart. We study a general framework obtained by pointwise multiplication with the weight matrix and consider the following problem: for a given matrix $A\in\mathbb{R}^{m\times n}$, solve
\begin{eqnarray*}
\min_{X}\|\left(A-X\right)\odot W\|_F^2\quad{\rm subject~to~}{\rm rank}(X)\le r,
\end{eqnarray*}
where $\odot$ denotes pointwise multiplication and $\|\cdot\|_F$ is the Frobenius norm of matrices. In the first part, we study a special version of the above general weighted low-rank approximation problem. Instead of using pointwise multiplication with the weight matrix, we use regular matrix multiplication and replace the rank constraint by its convex surrogate, the nuclear norm, and consider the following problem:
\begin{eqnarray*}
\hat{X} = \arg\min_{X}\left\{\frac{1}{2}\|(A-X)W\|_F^2 + \tau\|X\|_*\right\},
\end{eqnarray*}
where $\|\cdot\|_*$ denotes the nuclear norm of $X$. Considering its resemblance to the classic singular value thresholding problem, we call it the weighted singular value thresholding (WSVT) problem. As expected, the WSVT problem has no closed-form analytical solution in general, and a numerical procedure is needed to solve it. We introduce auxiliary variables and apply a simple and fast alternating direction method to solve WSVT numerically. Moreover, we present a convergence analysis of the algorithm and propose a mechanism for estimating the weight from the data. We demonstrate the performance of WSVT on two computer vision applications: background estimation from video sequences and facial shadow removal. In both cases, WSVT shows superior performance to all other models traditionally used. In the second part, we study the general framework of the proposed problem. For a special case of the weight, we study the limiting behavior of the solution to our problem, both analytically and numerically. In the limiting case of weights, as $(W_1)_{ij}\to\infty$ and $W_2=\mathbf{1}$ (a matrix of ones), we show that the solutions to our weighted problem converge, and the limit is the solution to the constrained low-rank approximation problem of Golub et al.
Additionally, by asymptotic analysis of the solution to our problem, we propose a rate of convergence. By doing this, we make explicit connections between a vast genre of weighted and unweighted low-rank approximation problems. In addition, we devise a novel and efficient numerical algorithm based on the alternating direction method for the special case of the weight and present a detailed convergence analysis. Our approach improves substantially over the existing weighted low-rank approximation algorithms proposed in the literature. We then explore the use of our algorithm on real-world problems in a variety of domains, such as computer vision and machine learning. Finally, for a special family of weights, we demonstrate an interesting property of the solution to the general weighted low-rank approximation problem, devise two accelerated algorithms by using this property, and present their effectiveness compared to the algorithm proposed in Chapter 4.
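The unweighted singular value thresholding problem that WSVT generalizes does have a closed-form solution: soft-threshold the singular values of $A$. A minimal sketch on synthetic low-rank-plus-noise data (the sizes, rank, and threshold below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

def svt(A, tau):
    """Return argmin_X 0.5*||A - X||_F^2 + tau*||X||_* by soft-thresholding
    the singular values of A (the proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return U @ np.diag(s_shrunk) @ Vt

# Rank-3 matrix plus small noise; thresholding recovers a low-rank estimate
L = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 15))
A = L + 0.01 * rng.standard_normal((20, 15))
X = svt(A, tau=1.0)

# The noise singular values fall below tau and are shrunk to exactly zero,
# so the rank of X is at most 3 even though A itself is full rank.
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(X))
```

Replacing the plain Frobenius fit with the weighted fit $\|(A-X)W\|_F^2$ destroys this closed form, which is why the dissertation resorts to an alternating direction method.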
- Date Issued
- 2016
- Identifier
- CFE0006833, ucf:51789
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006833
- Title
- Chemometric Applications to a Complex Classification Problem: Forensic Fire Debris Analysis.
- Creator
-
Waddell, Erin, Sigman, Michael, Belfield, Kevin, Campiglia, Andres, Yestrebsky, Cherie, Ni, Liqiang, University of Central Florida
- Abstract / Description
-
Fire debris analysis currently relies on visual pattern recognition of total ion chromatograms, extracted ion profiles, and target compound chromatograms to identify the presence of an ignitable liquid according to the ASTM International E1618-10 standard method. For large data sets, this methodology can be time-consuming, and it is a subjective method whose accuracy depends on the skill and experience of the analyst. This research aimed to develop an automated classification method for large data sets and investigated the use of the total ion spectrum (TIS). The TIS is calculated by taking an average mass spectrum across the entire chromatographic range and has been shown to contain sufficient information content for the identification of ignitable liquids. The TIS of ignitable liquids and of substrates, defined as common building materials and household furnishings, were compiled into model data sets. Cross-validation (CV) and fire debris samples, obtained from laboratory-scale and large-scale burns, were used to test the models. An automated classification method was developed using computational software, written in-house, that applies a multi-step classification scheme to detect ignitable liquid residues in fire debris samples and assign them to the classes defined in ASTM E1618-10. Classifications were made using linear discriminant analysis, quadratic discriminant analysis (QDA), and soft independent modeling of class analogy (SIMCA). Overall, the highest correct classification rates were achieved using QDA for the first step of the scheme and SIMCA for the remaining steps. In the first step of the classification scheme, correct classification rates of 95.3% and 89.2% were obtained for the CV test set and the fire debris samples, respectively. Correct classification rates of 100% were achieved for both data sets in the majority of the remaining steps, which used SIMCA for classification.
In this research, the first statistically valid error rates for fire debris analysis have been developed through cross-validation of large data sets. The error rates reduce the subjectivity associated with the current methods and provide a level of confidence in sample classification that does not currently exist in forensic fire debris analysis.
- Date Issued
- 2013
- Identifier
- CFE0004954, ucf:49586
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004954
- Title
- STATISTICAL ANALYSIS OF VISIBLE ABSORPTION SPECTRA AND MASS SPECTRA OBTAINED FROM DYED TEXTILE FIBERS.
- Creator
-
White, Katie, Sigman, Michael, University of Central Florida
- Abstract / Description
-
The National Academy of Sciences recently published a report which calls for improvements to the field of forensic science. The report criticized many forensic disciplines for failure to establish rigorously tested methods of comparison, and encouraged more research in these areas to establish limitations and assess error rates. This study applies chemometric and statistical methods to current and developing analytical techniques in fiber analysis. In addition to the analysis of commercially available dyed textile fibers, two pairs of dyes are selected for custom fabric dyeing based on the similarities of their absorbance spectra and dye molecular structures. Visible absorption spectra for all fiber samples are collected using microspectrophotometry (MSP), and mass spectra are collected using electrospray ionization (ESI) mass spectrometry. Statistical calculations are performed using commercial software packages and software written in-house. Levels of Type I and Type II error are examined for fiber discrimination based on hypothesis testing of visible absorbance spectral profiles using a nonparametric permutation method. This work also explores the evaluation of known and questioned fiber populations based on an assessment of the statistical p-value distributions from questioned-known fiber comparisons against those of known-fiber self-comparisons. Results from the hypothesis testing are compared with principal components analysis (PCA) and discriminant analysis (DA) of the visible absorption spectra, as well as PCA and DA of the ESI mass spectra. The sensitivity of a statistical approach is also discussed in terms of how instrumental parameters and sampling methods may influence error rates.
- Date Issued
- 2010
- Identifier
- CFE0003454, ucf:48396
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0003454
- Title
- Automatic Detection of Brain Functional Disorder Using Imaging Data.
- Creator
-
Dey, Soumyabrata, Shah, Mubarak, Jha, Sumit, Hu, Haiyan, Weeks, Arthur, Rao, Ravishankar, University of Central Florida
- Abstract / Description
-
Attention Deficit Hyperactivity Disorder (ADHD) has recently been getting a lot of attention, mainly for two reasons. First, it is one of the most commonly found childhood behavioral disorders: around 5-10% of children all over the world are diagnosed with ADHD. Second, the root cause of the problem is still unknown, and therefore no biological measure exists to diagnose ADHD. Instead, doctors must diagnose it based on clinical symptoms, such as inattention, impulsivity, and hyperactivity, which are all subjective. Functional Magnetic Resonance Imaging (fMRI) data has become a popular tool for understanding the functioning of the brain, such as identifying the brain regions responsible for different cognitive tasks or analyzing the statistical differences in brain functioning between diseased and control subjects. ADHD is also being studied using fMRI data. In this dissertation we aim to solve the problem of automatic diagnosis of ADHD subjects using their resting-state fMRI (rs-fMRI) data. As a core step of our approach, we model the functions of a brain as a connectivity network, which is expected to capture information about how synchronous different brain regions are in terms of their functional activities. The network is constructed by representing different brain regions as nodes, where any two nodes are connected by an edge if the correlation of the activity patterns of the two nodes is higher than some threshold. The brain regions, represented as the nodes of the network, can be selected at different granularities, e.g. single voxels or clusters of functionally homogeneous voxels. The topological differences between the constructed networks of the ADHD and control groups of subjects are then exploited in the classification approach. We have developed a simple method employing the Bag-of-Words (BoW) framework for the classification of ADHD subjects.
We represent each node of the network by a 4-D feature vector: the node degree and the 3-D location. The 4-D vectors of all the network nodes in the training data are then grouped into a number of clusters using K-means, where each cluster is termed a word. Finally, each subject is represented by a histogram (bag) of such words. A Support Vector Machine (SVM) classifier is used to detect ADHD subjects from their histogram representations. This method achieves 64% classification accuracy.

The above simple approach has several shortcomings. First, spatial information is lost while constructing the histogram, because it only counts the occurrences of words, ignoring their spatial positions. Second, features from the whole brain are used for classification, but some brain regions may not contain any useful information and may only increase the feature dimensionality and the noise of the system. Third, our study used only one network feature, the degree of a node, which measures the node's connectivity, while other, more complex network features may be useful for the proposed problem.

To address these shortcomings, we hypothesize that only a subset of the network's nodes possesses information important for the classification of ADHD subjects. To identify the important nodes we have developed a novel algorithm. The algorithm repeatedly generates random subsets of nodes, each time extracting the features of a subset to compute a feature vector and perform classification. The subsets are then ranked by classification accuracy, and the number of occurrences of each node in the top-ranked subsets is counted. Our algorithm selects the most frequently occurring nodes for the final classification. Furthermore, along with the node degree, we employ three more node features: network cycles, the varying distance degree, and the edge weight sum.
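The subset-ranking node selection can be sketched as below. The trial count, subset size, and top fraction are hypothetical placeholders (the abstract does not give the settings), and an SVM with cross-validation stands in for the classification step:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def select_nodes(features, labels, n_trials=100, subset_size=5,
                 top_frac=0.1, seed=0):
    """Rank nodes by how often they occur in high-accuracy random subsets.

    features: (n_subjects, n_nodes) matrix with one feature per node
    (e.g. node degree); labels: 0/1 diagnosis.  All parameter values
    are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_nodes = features.shape[1]
    trials = []
    for _ in range(n_trials):
        subset = rng.choice(n_nodes, size=subset_size, replace=False)
        acc = cross_val_score(SVC(), features[:, subset], labels, cv=3).mean()
        trials.append((acc, subset))
    trials.sort(key=lambda t: t[0], reverse=True)    # rank subsets by accuracy
    counts = np.zeros(n_nodes)
    for _, subset in trials[:max(1, int(top_frac * n_trials))]:
        counts[subset] += 1                          # occurrences in top subsets
    return np.argsort(counts)[::-1]                  # most frequent nodes first

# Synthetic check: only node 0 carries the class signal
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 20))
y = (X[:, 0] > 0).astype(int)
ranked = select_nodes(X, y)
```

On this toy data the informative node rises to the top of the ranking, because only subsets containing it reach high cross-validation accuracy.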
We concatenate the features of the selected nodes in a fixed order to preserve relative spatial information. Experimental validation suggests that using the features of the nodes selected by our algorithm indeed helps to improve classification accuracy. Our findings are also in concordance with the existing literature, as the brain regions identified by our algorithm have been independently reported by many other studies on ADHD. We achieved a classification accuracy of 69.59% with this approach. However, because this method represents each voxel as a node, the network contains several thousand nodes, and the network construction step becomes computationally very expensive. Another limitation is that the network features, computed per node, capture only the local structure of the network while ignoring its global structure.

Next, in order to capture the global structure of the networks, we use the Multi-Dimensional Scaling (MDS) technique to project all the subjects from an unknown network space to a low-dimensional space based on their inter-network distance measures. To compute the distance between two networks, we represent each node by a set of attributes: the node degree, the average power, the physical location, the neighbor node degrees, and the average powers of the neighbor nodes. The nodes of the two networks are then mapped so that, over all pairs of matched nodes, the sum of the attribute distances (the inter-network distance) is minimized. To reduce the network computation cost, we ensure that the maximum relevant information is preserved with minimum redundancy: the nodes of the network are constructed from clusters of highly active voxels, where the activity level of a voxel is measured by the average power of its fMRI time series.
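The MDS projection step can be illustrated as follows, using scikit-learn's metric MDS on a precomputed inter-network distance matrix; the toy distances below are hypothetical:

```python
import numpy as np
from sklearn.manifold import MDS

def embed_networks(dist, n_dims=2, seed=0):
    """Project subjects into a low-dimensional Euclidean space whose
    pairwise distances approximate the given inter-network distances."""
    mds = MDS(n_components=n_dims, dissimilarity="precomputed",
              random_state=seed)
    return mds.fit_transform(dist)   # (n_subjects, n_dims) coordinates

# Hypothetical inter-network distances for three subjects: subjects 0 and 1
# have similar networks, subject 2 is far from both.
dist = np.array([[0.0, 1.0, 5.0],
                 [1.0, 0.0, 4.0],
                 [5.0, 4.0, 0.0]])
emb = embed_networks(dist)
```

Once the subjects are embedded, a standard classifier can be trained on the low-dimensional coordinates instead of on the networks themselves.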
Our method shows promise, as we achieve impressive classification accuracies (73.55%) on the ADHD-200 data set. Our results also reveal that the detection rates are higher when classification is performed separately on the male and female groups of subjects.

So far, we have used only the fMRI data for the ADHD diagnosis problem. Finally, we investigated the following questions. Do structural brain images contain useful information related to the ADHD diagnosis problem? Can the classification accuracy of the automatic diagnosis system be improved by combining the information from structural and functional brain data? Toward that end, we developed a new method to combine the information from structural and functional brain images in a late-fusion framework. For the structural data, we input the gray matter (GM) brain images to a Convolutional Neural Network (CNN). The output of the CNN is one feature vector per subject, which is used to train an SVM classifier. For the functional data, we compute the average power of each voxel from its fMRI time series; the average power measures the activity level of the voxel. We found significant differences between the voxel power distribution patterns of the ADHD and control groups. A Local Binary Pattern (LBP) texture feature is applied to the voxel power map to capture these differences. We achieved 74.23% accuracy using GM features, 77.30% using LBP features, and 79.14% using the combined information.

In summary, this dissertation demonstrates that structural and functional brain imaging data are useful for the automatic detection of ADHD subjects, as we achieve impressive classification accuracies on the ADHD-200 data set. Our study also helps to identify the brain regions that are useful for ADHD subject classification. These findings can help in understanding the pathophysiology of the disorder.
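The late-fusion step can be sketched as follows. A weighted average of the two SVM decision scores is one simple combination rule; the abstract does not specify the exact fusion rule used, and the equal weights and synthetic features here are assumptions:

```python
import numpy as np
from sklearn.svm import SVC

def late_fusion_predict(clf_gm, clf_lbp, X_gm, X_lbp, w=0.5):
    """Fuse a structural (GM-feature) and a functional (LBP-feature)
    classifier by a weighted average of their decision scores."""
    fused = (w * clf_gm.decision_function(X_gm)
             + (1 - w) * clf_lbp.decision_function(X_lbp))
    return (fused > 0).astype(int)   # positive fused score -> class 1

# Toy demonstration with synthetic stand-ins for the two feature sets
rng = np.random.default_rng(2)
y = rng.integers(0, 2, 30)
X_gm = y[:, None] + 0.3 * rng.standard_normal((30, 4))   # "structural" features
X_lbp = y[:, None] + 0.3 * rng.standard_normal((30, 6))  # "functional" features
clf_gm = SVC(kernel="linear").fit(X_gm, y)
clf_lbp = SVC(kernel="linear").fit(X_lbp, y)
pred = late_fusion_predict(clf_gm, clf_lbp, X_gm, X_lbp)
```

Because the fusion happens at the score level, each modality can use whatever feature extractor suits it (a CNN for GM images, LBP for the voxel power map) without the two pipelines needing a shared feature space.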
Finally, we expect that our approaches will contribute toward the development of a biological measure for the diagnosis of ADHD.
- Date Issued
- 2014
- Identifier
- CFE0005786, ucf:50060
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0005786