Machine Learning Methods for Multiparameter Flow Cytometry Analysis and Visualization
Title: | Machine Learning Methods for Multiparameter Flow Cytometry Analysis and Visualization. |
---|---|---|
Name(s): | Sassano, Emily, Author; Jha, Sumit Kumar, Committee Chair; Pattanaik, Sumanta, Committee Member; Hughes, Charles, Committee Member; Moore, Sean, Committee Member; University of Central Florida, Degree Grantor |
Type of Resource: | text | |
Date Issued: | 2018 | |
Publisher: | University of Central Florida | |
Language(s): | English | |
Abstract/Description: | Flow cytometry is a widely used analytical technique in cell biology that uses specific wavelengths of light to profile heterogeneous populations of cells at the individual level. Current cytometers can analyze up to 20 parameters on over a million cells, but despite the complexity of these datasets, a typical workflow relies on subjective, labor-intensive manual sequential analysis. The research presented in this dissertation provides two machine learning methods to increase objectivity, efficiency, and discovery in flow cytometry data analysis. The first, a supervised learning method, uses previously analyzed data to evaluate new flow cytometry files containing similar parameters. The probability distribution of each dimension in a file is matched to the related dimension of a reference file through color indexing and histogram intersection methods. Once a similar reference file is selected, the previously classified cell populations are used to create a tailored support vector machine capable of classifying cell populations as an expert would. This method has produced results highly correlated with manual sequential analysis, providing an efficient alternative for analyzing a large number of samples. The second, a novel unsupervised method, is used to explore and visualize single-cell data in an objective manner. To accomplish this, a hypergraph sampling method was created to preserve rare events within the flow data before divisively clustering the sampled data using singular value decomposition. The unsampled data are added to the discovered set of clusters using a support vector machine classifier, and the final analysis is displayed as a minimum spanning tree. This tree is capable of distinguishing rare subsets of cells comprising less than 1% of the original data. | |
Identifier: | CFE0007243 (IID), ucf:52241 (fedora) | |
Note(s): | 2018-08-01; Ph.D.; Engineering and Computer Science, Computer Science; Doctoral; This record was generated from author-submitted information. |
Subject(s): | Flow cytometry -- SVM -- SVD -- automated gating | |
Persistent Link to This Record: | http://purl.flvc.org/ucf/fd/CFE0007243 | |
Restrictions on Access: | public 2018-08-15 | |
Host Institution: | UCF |
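
The abstract describes selecting a reference file by matching per-parameter probability distributions with histogram intersection. The record does not give the dissertation's implementation, so the following is only a minimal sketch of the general technique. The function names, the bin count, the fixed [0, 1] value range, and the choice to average scores across shared parameters are all assumptions made for illustration.

```python
def normalized_histogram(values, bins=8, lo=0.0, hi=1.0):
    """Histogram of `values` over [lo, hi] whose bins sum to 1 (assumed binning)."""
    if not values:
        raise ValueError("cannot build a histogram from no events")
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so out-of-range values and v == hi fall into an edge bin.
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def histogram_intersection(p, q):
    """Sum of bin-wise minima of two normalized histograms (1.0 = identical)."""
    return sum(min(a, b) for a, b in zip(p, q))


def file_similarity(sample, reference, bins=8):
    """Mean intersection over the parameters both files share (assumed scoring)."""
    shared = sorted(set(sample) & set(reference))
    if not shared:
        raise ValueError("files share no parameters")
    scores = [
        histogram_intersection(
            normalized_histogram(sample[name], bins),
            normalized_histogram(reference[name], bins),
        )
        for name in shared
    ]
    return sum(scores) / len(scores)


def best_reference(sample, references, bins=8):
    """Name of the reference file most similar to `sample`."""
    return max(references, key=lambda name: file_similarity(sample, references[name], bins))


# Hypothetical toy data: each file maps a parameter name to scaled event values.
new_file = {"FSC": [0.1, 0.2, 0.8, 0.9], "CD4": [0.1, 0.1, 0.9, 0.9]}
analyzed = {
    "ref_a": {"FSC": [0.15, 0.25, 0.85, 0.95], "CD4": [0.05, 0.1, 0.95, 0.9]},
    "ref_b": {"FSC": [0.5, 0.5, 0.5, 0.5], "CD4": [0.4, 0.5, 0.5, 0.6]},
}
print(best_reference(new_file, analyzed))
```

In a workflow like the one the abstract outlines, the expert-assigned populations of the selected reference file would then serve as training labels for a support vector machine; that step is omitted here.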