Current Search: Pensky, Marianna
- Title
- APPLICATION OF STATISTICAL METHODS IN RISK AND RELIABILITY.
- Creator
-
Heard, Astrid, Pensky, Marianna, University of Central Florida
- Abstract / Description
-
The dissertation considers construction of confidence intervals for a cumulative distribution function F(z) and its inverse at some fixed points z and u on the basis of an i.i.d. sample where the sample size is relatively small. The sample is modeled as having the flexible Generalized Gamma distribution with all three parameters being unknown. This approach can be viewed as an alternative to nonparametric techniques which do not specify distribution of X and lead to less efficient procedures. The confidence intervals are constructed by objective Bayesian methods and use the Jeffreys noninformative prior. Performance of the resulting confidence intervals is studied via Monte Carlo simulations and compared to the performance of nonparametric confidence intervals based on binomial proportion. In addition, techniques for change point detection are analyzed and further evaluated via Monte Carlo simulations. The effect of a change point on the interval estimators is studied both analytically and via Monte Carlo simulations.
- Date Issued
- 2005
- Identifier
- CFE0000736, ucf:46565
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0000736
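As a point of reference for the nonparametric comparison mentioned in the abstract above, here is a minimal sketch of a binomial-proportion (Wald-type) confidence interval for F(z) computed from a small i.i.d. sample; the gamma distribution, sample size, and point z are illustrative assumptions, not taken from the dissertation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Small i.i.d. sample; a plain gamma distribution stands in for the Generalized
# Gamma family discussed in the abstract (illustrative choice, not the author's setup).
n, z = 25, 2.0
x = rng.gamma(shape=2.0, scale=1.0, size=n)

# Nonparametric estimate of F(z): the proportion of observations <= z
# is a binomial proportion, so a Wald-type interval applies.
p_hat = np.mean(x <= z)
se = np.sqrt(p_hat * (1.0 - p_hat) / n)
zq = stats.norm.ppf(0.975)
wald_ci = (max(0.0, p_hat - zq * se), min(1.0, p_hat + zq * se))

print(f"empirical F(z) = {p_hat:.3f}, 95% Wald CI = ({wald_ci[0]:.3f}, {wald_ci[1]:.3f})")
print(f"true F(z)      = {stats.gamma.cdf(z, a=2.0, scale=1.0):.3f}")
```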
- Title
- DECISION THEORY CLASSIFICATION OF HIGH-DIMENSIONAL VECTORS BASED ON SMALL SAMPLES.
- Creator
-
Bradshaw, David, Pensky, Marianna, University of Central Florida
- Abstract / Description
-
In this paper, we review existing classification techniques and suggest an entirely new procedure for the classification of high-dimensional vectors on the basis of a few training samples. The proposed method is based on the Bayesian paradigm and provides posterior probabilities that a new vector belongs to each of the classes; therefore, it adapts naturally to any number of classes. Our classification technique is based on a small vector which is related to the projection of the observation onto the space spanned by the training samples. This is achieved by employing matrix-variate distributions in classification, which is an entirely new idea. In addition, our method mimics time-tested classification techniques based on the assumption of normally distributed samples. By assuming that the samples have a matrix-variate normal distribution, we are able to replace classification on the basis of a large covariance matrix with classification on the basis of a smaller matrix that describes the relationship of sample vectors to each other.
- Date Issued
- 2005
- Identifier
- CFE0000753, ucf:46593
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0000753
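The abstract above reduces classification to a small vector tied to the projection of an observation onto the span of the training samples. A hedged sketch of that projection step only (the matrix-variate Bayesian model itself is not reproduced); the dimensions and synthetic data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

p, n = 500, 8                  # dimension >> number of training samples
X = rng.normal(size=(p, n))    # columns are training samples
y_new = X @ rng.normal(size=n) + 0.1 * rng.normal(size=p)  # new observation

# Coordinates of the least-squares projection of y_new onto span(columns of X).
# This small n-vector, rather than the p-vector itself, is the kind of reduced
# representation the abstract refers to (sketch, not the dissertation's exact statistic).
coef, *_ = np.linalg.lstsq(X, y_new, rcond=None)
projection = X @ coef
residual = np.linalg.norm(y_new - projection)

print("reduced representation (length n):", np.round(coef, 2))
print(f"norm of the part of y_new outside the training span: {residual:.3f}")
```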
- Title
- EXPLORING CONFIDENCE INTERVALS IN THE CASE OF BINOMIAL AND HYPERGEOMETRIC DISTRIBUTIONS.
- Creator
-
Mojica, Irene, Pensky, Marianna, University of Central Florida
- Abstract / Description
-
The objective of this thesis is to examine one of the most fundamental and yet important methodologies used in statistical practice, interval estimation of the probability of success in a binomial distribution. The textbook confidence interval for this problem is known as the Wald interval as it comes from the Wald large sample test for the binomial case. It is generally acknowledged that the actual coverage probability of the standard interval is poor for values of p near 0 or 1. Moreover, recently it has been documented that the coverage properties of the standard interval can be inconsistent even if p is not near the boundaries. For this reason, one would like to study the variety of methods for construction of confidence intervals for unknown probability p in the binomial case. The present thesis accomplishes the task by presenting several methods for constructing confidence intervals for unknown binomial probability p. It is well known that the hypergeometric distribution is related to the binomial distribution. In particular, if the size of the population, N, is large and the number of items of interest k is such that k/N tends to p as N grows, then the hypergeometric distribution can be approximated by the binomial distribution. Therefore, in this case, one can use the confidence intervals constructed for p in the case of the binomial distribution as a basis for construction of the confidence intervals for the unknown value k = pN. The goal of this thesis is to study this approximation and to point out several confidence intervals which are designed specifically for the hypergeometric distribution. In particular, this thesis considers several confidence intervals which are based on estimation of a binomial proportion as well as Bayesian credible sets based on various priors.
- Date Issued
- 2011
- Identifier
- CFE0003919, ucf:48740
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0003919
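To illustrate the poor boundary coverage of the Wald interval discussed above, a short Monte Carlo check comparing Wald and Wilson (score) intervals; the values of n, p and the number of replications are arbitrary choices, not from the thesis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, p_true, reps = 30, 0.05, 20000
zq = stats.norm.ppf(0.975)

successes = rng.binomial(n, p_true, size=reps)
p_hat = successes / n

# Wald interval: p_hat +/- z * sqrt(p_hat (1 - p_hat) / n)
half = zq * np.sqrt(p_hat * (1 - p_hat) / n)
wald_cover = np.mean((p_hat - half <= p_true) & (p_true <= p_hat + half))

# Wilson (score) interval
center = (p_hat + zq**2 / (2 * n)) / (1 + zq**2 / n)
half_w = zq * np.sqrt(p_hat * (1 - p_hat) / n + zq**2 / (4 * n**2)) / (1 + zq**2 / n)
wilson_cover = np.mean((center - half_w <= p_true) & (p_true <= center + half_w))

print(f"nominal 95%: Wald coverage ~ {wald_cover:.3f}, Wilson coverage ~ {wilson_cover:.3f}")
```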
- Title
- Sampling and Subspace Methods for Learning Sparse Group Structures in Computer Vision.
- Creator
-
Jaberi, Maryam, Foroosh, Hassan, Pensky, Marianna, Gong, Boqing, Qi, GuoJun, University of Central Florida
- Abstract / Description
-
The unprecedented growth of data in volume and dimension has led to an increased number of computationally-demanding and data-driven decision-making methods in many disciplines, such as computer vision, genomics, finance, etc. Research on big data aims to understand and describe trends in massive volumes of high-dimensional data. High volume and dimension are the determining factors in both computational and time complexity of algorithms. The challenge grows when the data are formed of the union of group-structures of different dimensions embedded in a high-dimensional ambient space. To address the problem of high volume, we propose a sampling method referred to as the Sparse Withdrawal of Inliers in a First Trial (SWIFT), which determines the smallest sample size in one grab so that all group-structures are adequately represented and discovered with high probability. The key features of SWIFT are: (i) sparsity, which is independent of the population size; (ii) no prior knowledge of the distribution of data, or the number of underlying group-structures; and (iii) robustness in the presence of an overwhelming number of outliers. We report a comprehensive study of the proposed sampling method in terms of accuracy, functionality, and effectiveness in reducing the computational cost in various applications of computer vision. In the second part of this dissertation, we study dimensionality reduction for multi-structural data. We propose a probabilistic subspace clustering method that unifies soft- and hard-clustering in a single framework. This is achieved by introducing a delayed association of uncertain points to subspaces of lower dimensions based on a confidence measure. Delayed association yields higher accuracy in clustering subspaces that have ambiguities, i.e., due to intersections and high levels of outliers/noise, and hence leads to more accurate self-representation of underlying subspaces. Altogether, this dissertation addresses the key theoretical and practical issues of size and dimension in big data analysis.
- Date Issued
- 2018
- Identifier
- CFE0007017, ucf:52039
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007017
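The SWIFT bound itself is not reproduced here; as a rough stand-in, this sketch uses a simple union bound to find the smallest one-grab sample size for which every group-structure is represented with high probability, under assumed (hypothetical) group fractions.

```python
import numpy as np

def min_sample_size(group_fractions, delta=0.01):
    """Smallest m such that, sampling m points i.i.d., every group is hit at
    least once with probability >= 1 - delta (union bound; illustrative only,
    not the SWIFT bound from the dissertation)."""
    fracs = np.asarray(group_fractions, dtype=float)
    m = 1
    while np.sum((1.0 - fracs) ** m) > delta:
        m += 1
    return m

# Assumed scenario: three small structures buried in 90% outliers.
fractions = [0.05, 0.03, 0.02]
m = min_sample_size(fractions, delta=0.01)
print(f"sample size needed so each structure is represented w.p. >= 0.99: {m}")
```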
- Title
- Computational imaging systems for high-speed, adaptive sensing applications.
- Creator
-
Sun, Yangyang, Pang, Sean, Li, Guifang, Schulzgen, Axel, Pensky, Marianna, University of Central Florida
- Abstract / Description
-
Driven by the advances in signal processing and the ubiquitous availability of high-speed, low-cost computing resources over the past decade, computational imaging has seen growing interest. Improvements in spatial, temporal, and spectral resolution have been made with novel designs of imaging systems and optimization methods. However, computational imaging has two limitations. 1) Computational imaging requires full knowledge and representation of the imaging system, called the forward model, to reconstruct the object of interest. This limits its applications in systems with a parameterized unknown forward model, such as range imaging systems. 2) The regularization in the optimization process incorporates strong assumptions which may not accurately reflect the a priori distribution of the object. To overcome these limitations, we propose 1) novel optimization frameworks for applying computational imaging to active and passive range imaging systems, achieving a 5-10 fold improvement in temporal resolution in various range imaging systems; and 2) a data-driven method for estimating the distribution of high-dimensional objects and a framework of adaptive sensing for maximum information gain. The adaptive strategy with our proposed method consistently outperforms the Gaussian process-based method. The work would potentially benefit high-speed 3D imaging applications such as autonomous driving and adaptive sensing applications such as low-dose adaptive computed tomography (CT).
- Date Issued
- 2019
- Identifier
- CFE0007867, ucf:52784
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007867
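The abstract above describes reconstruction through a known forward model plus regularization. A generic, minimal sketch of that idea, Tikhonov-regularized least squares for y = Ax + noise, is given below; it is not the dissertation's range-imaging framework, and all dimensions and the penalty weight are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Known forward model A, unknown object x, noisy measurements y = A x + noise.
m, n = 60, 100
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[::10] = 1.0
y = A @ x_true + 0.01 * rng.normal(size=m)

# Tikhonov (ridge) reconstruction: argmin ||A x - y||^2 + lam * ||x||^2,
# solved in closed form; lam encodes the prior assumption on the object.
lam = 0.1
x_hat = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

print(f"relative reconstruction error: "
      f"{np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true):.3f}")
```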
- Title
- Optimization Algorithms for Deep Learning Based Medical Image Segmentations.
- Creator
-
Mortazi, Aliasghar, Bagci, Ulas, Shah, Mubarak, Mahalanobis, Abhijit, Pensky, Marianna, University of Central Florida
- Abstract / Description
-
Medical image segmentation is one of the fundamental processes for understanding and assessing the functionality of different organs and tissues, as well as quantifying diseases and helping treatment planning. With the ever increasing number of medical scans, automated, accurate, and efficient medical image segmentation is an unmet need for improving healthcare. Recently, deep learning has emerged as one of the most powerful methods for almost all image analysis tasks, such as segmentation, detection, and classification, including in medical imaging. In this regard, this dissertation introduces new algorithms to perform medical image segmentation for different (a) imaging modalities, (b) numbers of objects, (c) dimensionalities of images, and (d) varying labeling conditions. First, we study the dimensionality problem by introducing a new 2.5D segmentation engine that can be used in single- and multi-object settings. We propose new fusion strategies and loss functions for deep neural networks to generate improved delineations. Later, we expand the proposed idea into 3D and 4D medical images and develop a "budget (computational) friendly" architecture search algorithm to make this process self-contained and fully automated without sacrificing accuracy. Instead of manual architecture design, which is often based on plugging components in and out and on expert experience, the new algorithm provides an automated search for a successful segmentation architecture within a short period of time. Finally, we study further optimization algorithms for the label noise issue and improve the overall segmentation problem by incorporating prior information about label noise and object shape. We conclude the thesis work by studying different network and hyperparameter optimization settings that are fine-tuned for varying conditions for medical images. Applications are chosen from cardiac scans (images), and the efficacy of the proposed algorithms is demonstrated on several publicly available data sets and independently validated by blind evaluations.
- Date Issued
- 2019
- Identifier
- CFE0007841, ucf:52825
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007841
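The dissertation proposes new fusion strategies and loss functions; as a hedged baseline only, here is the widely used soft Dice loss written in plain NumPy (a common segmentation loss, not the loss proposed in the work).

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for a predicted probability map and a binary mask:
    1 - 2|P∩T| / (|P| + |T|); lower is better. Generic baseline, not the
    dissertation's proposed loss."""
    pred = pred.ravel().astype(float)
    target = target.ravel().astype(float)
    intersection = np.sum(pred * target)
    return 1.0 - (2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps)

# Toy 2D example: a predicted probability map vs. a ground-truth mask.
rng = np.random.default_rng(4)
mask = np.zeros((32, 32)); mask[8:24, 8:24] = 1.0
pred = np.clip(mask + 0.2 * rng.normal(size=mask.shape), 0.0, 1.0)
print(f"soft Dice loss: {soft_dice_loss(pred, mask):.3f}")
```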
- Title
- Learning Algorithms for Fat Quantification and Tumor Characterization.
- Creator
-
Hussein, Sarfaraz, Bagci, Ulas, Shah, Mubarak, Heinrich, Mark, Pensky, Marianna, University of Central Florida
- Abstract / Description
-
Obesity is one of the most prevalent health conditions. About 30% of the world's and over 70% of the United States' adult populations are either overweight or obese, causing an increased risk for cardiovascular diseases, diabetes, and certain types of cancer. Among all cancers, lung cancer is the leading cause of death, whereas pancreatic cancer has the poorest prognosis among all major cancers. Early diagnosis of these cancers can save lives. This dissertation contributes towards the development of computer-aided diagnosis tools in order to aid clinicians in establishing the quantitative relationship between obesity and cancers. With respect to obesity and metabolism, in the first part of the dissertation, we specifically focus on the segmentation and quantification of white and brown adipose tissue. For cancer diagnosis, we perform analysis on two important cases: lung cancer and Intraductal Papillary Mucinous Neoplasm (IPMN), a precursor to pancreatic cancer. This dissertation proposes an automatic body region detection method trained with only a single example. Then a new fat quantification approach is proposed which is based on geometric and appearance characteristics. For the segmentation of brown fat, a PET-guided CT co-segmentation method is presented. With different variants of Convolutional Neural Networks (CNN), supervised learning strategies are proposed for the automatic diagnosis of lung nodules and IPMN. In order to address the unavailability of a large number of labeled examples required for training, unsupervised learning approaches for cancer diagnosis without explicit labeling are proposed. We evaluate our proposed approaches (both supervised and unsupervised) on two different tumor diagnosis challenges: lung and pancreas with 1018 CT and 171 MRI scans respectively. The proposed segmentation, quantification and diagnosis approaches explore the important adiposity-cancer association and help pave the way towards improved diagnostic decision making in routine clinical practice.
- Date Issued
- 2018
- Identifier
- CFE0007196, ucf:52288
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007196
- Title
- Solution of linear ill-posed problems using overcomplete dictionaries.
- Creator
-
Gupta, Pawan, Pensky, Marianna, Swanson, Jason, Zhang, Teng, Foroosh, Hassan, University of Central Florida
- Abstract / Description
-
In this dissertation, we consider an application of overcomplete dictionaries to the solution of general ill-posed linear inverse problems. In the context of regression problems, there has been an enormous amount of effort to recover an unknown function using such dictionaries. While some research on the subject has been already carried out, there are still many gaps to address. In particular, one of the most popular methods, lasso, and its variants, is based on minimizing the empirical likelihood and, unfortunately, requires stringent assumptions on the dictionary, the so-called compatibility conditions. Though compatibility conditions are hard to satisfy, it is well known that this can be accomplished by using random dictionaries. In the first part of the dissertation, we show how one can apply random dictionaries to the solution of ill-posed linear inverse problems with Gaussian noise. We put a theoretical foundation under the suggested methodology and study its performance via simulations and a real-data example. In the second part of the dissertation, we investigate the application of lasso to linear ill-posed problems with non-Gaussian noise. We have developed a theoretical background for the application of lasso to such problems and studied its performance via simulations.
- Date Issued
- 2019
- Identifier
- CFE0007811, ucf:52345
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007811
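A hedged sketch of the general setup discussed above: a discretized smoothing (ill-posed) operator, a random overcomplete dictionary, and lasso applied to the composite design. It uses scikit-learn's Lasso; the operator, dimensions, and penalty level are illustrative assumptions rather than the dissertation's construction.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)

n, p = 100, 400                      # p > n: overcomplete random dictionary
K = np.tril(np.ones((n, n))) / n     # discretized integration: a smoothing (ill-posed) operator
D = rng.normal(size=(n, p)) / np.sqrt(n)

theta_true = np.zeros(p); theta_true[[7, 101, 250]] = [2.0, -1.5, 1.0]
f_true = D @ theta_true              # unknown function expanded in the dictionary
y = K @ f_true + 0.01 * rng.normal(size=n)

# Lasso on the composite design K D seeks a sparse dictionary representation.
model = Lasso(alpha=0.001, max_iter=50000).fit(K @ D, y)
f_hat = D @ model.coef_

print(f"nonzero coefficients selected: {np.sum(model.coef_ != 0)}")
print(f"relative error on f: {np.linalg.norm(f_hat - f_true) / np.linalg.norm(f_true):.3f}")
```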
- Title
- Autoregressive Models.
- Creator
-
Wade, William, Richardson, Gary, Pensky, Marianna, Li, Xin, University of Central Florida
- Abstract / Description
-
Consider a sequence of random variables which obeys a first order autoregressive model with unknown parameter alpha. Under suitable assumptions on the error structure of the model, the limiting distribution of the normalized least squares estimator of alpha is discussed. The choice of the normalizing constant depends on whether alpha is less than one, equals one, or is greater than one in absolute value. In particular, the limiting distribution is normal provided that the absolute value of alpha is less than one, but is a function of Brownian motion whenever the absolute value of alpha equals one. Some general remarks are made whenever the sequence of random variables is a first order moving average process.
- Date Issued
- 2012
- Identifier
- CFE0004276, ucf:49546
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004276
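A brief simulation of the least squares estimator of alpha in the stationary case |alpha| < 1, where (as the abstract states) the sqrt(n)-normalized estimator is asymptotically normal; the classical limit variance 1 - alpha^2 for i.i.d. errors is used as the reference, and all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)

def ar1_lse(x):
    """Least squares estimator of alpha in X_t = alpha * X_{t-1} + e_t."""
    return np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)

alpha, n, reps = 0.6, 500, 2000
stats_norm = []
for _ in range(reps):
    e = rng.normal(size=n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = alpha * x[t - 1] + e[t]
    # In the stationary case |alpha| < 1, sqrt(n) * (alpha_hat - alpha) is
    # approximately N(0, 1 - alpha^2) for i.i.d. errors (classical result).
    stats_norm.append(np.sqrt(n) * (ar1_lse(x) - alpha))

print(f"sample variance of sqrt(n)(alpha_hat - alpha): {np.var(stats_norm):.3f}")
print(f"theoretical limit 1 - alpha^2:                 {1 - alpha**2:.3f}")
```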
- Title
- Improved Interpolation in SPH in Cases of Less Smooth Flow.
- Creator
-
Brun, Oddny, Wiegand, Rudolf, Pensky, Marianna, University of Central Florida
- Abstract / Description
-
We introduced a method presented in Information Field Theory (IFT) [Abramovich et al., 2007] to improve interpolation in Smoothed Particle Hydrodynamics (SPH) in cases of less smooth flow. The method makes use of wavelet theory combined with B-splines for interpolation. The idea is to identify any jumps a function may have and then reconstruct the smoother segments between the jumps. The results of our work demonstrated superior capability, when compared to a particularly challenging SPH application, to better conserve jumps and more accurately interpolate the smoother segments of the function. The results of our work also demonstrated increased computational efficiency with limited loss in accuracy, as the number of multiplications and the execution time were reduced. Similar benefits were observed for functions with spikes analyzed by the same method. Lesser, but similar, effects were also demonstrated for real-life data sets of a less smooth nature. SPH is widely used in modeling and simulation of flow of matter. SPH presents advantages compared to grid-based methods both in terms of computational efficiency and accuracy, in particular when dealing with less smooth flow. The results we achieved through our research are an improvement to the model in cases of less smooth flow, in particular flow with jumps and spikes. Up until now such improvements have been sought through modifications to the models' physical equations and/or kernel functions and have only partially been able to address the issue. This research, as it introduced wavelet theory and IFT to a field of science that, to our knowledge, is not currently utilizing these methods, laid the groundwork for future research ideas to benefit SPH. Among those ideas are further development of criteria for wavelet selection, use of smoothing splines for SPH interpolation, and incorporation of Bayesian field theory. Improving the method's accuracy, stability, and efficiency under more challenging conditions, such as flow with jumps and spikes, will benefit applications in a wide area of science. Just in medicine alone, such improvements will further increase real-time diagnostics, treatment, and training opportunities, because jumps and spikes are often the characteristics of significant physiological and anatomic conditions such as pulsatile blood flow, peristaltic intestine contractions, and the appearance of organs' edges in imaging.
- Date Issued
- 2016
- Identifier
- CFE0006446, ucf:51451
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006446
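The core idea described above, detect jumps and interpolate the smoother segments separately, can be sketched with ordinary B-splines; this toy example uses scipy's make_interp_spline and a crude largest-difference jump detector, not the wavelet/IFT machinery of the thesis.

```python
import numpy as np
from scipy.interpolate import make_interp_spline

rng = np.random.default_rng(13)

# Piecewise-smooth function with a jump at x = 0.5 (a simple stand-in for
# "less smooth flow"); the dissertation's wavelet/IFT machinery is not reproduced.
x = np.linspace(0.0, 1.0, 41)
y = np.sin(2 * np.pi * x) + (x > 0.5) * 2.0 + 0.01 * rng.normal(size=x.size)

# Crude jump detection: the largest first difference marks the discontinuity.
jump = np.argmax(np.abs(np.diff(y))) + 1

# Fit cubic B-splines separately on each smooth segment instead of across the jump.
left = make_interp_spline(x[:jump], y[:jump], k=3)
right = make_interp_spline(x[jump:], y[jump:], k=3)

xq = 0.75
print(f"interpolated value at x={xq}: {right(xq):.3f} "
      f"(true smooth part: {np.sin(2 * np.pi * xq) + 2.0:.3f})")
```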
- Title
- Super Resolution of Wavelet-Encoded Images and Videos.
- Creator
-
Atalay, Vildan, Foroosh, Hassan, Bagci, Ulas, Hughes, Charles, Pensky, Marianna, University of Central Florida
- Abstract / Description
-
In this dissertation, we address the multiframe super resolution reconstruction problem for wavelet-encoded images and videos. The goal of multiframe super resolution is to obtain one or more high resolution images by fusing a sequence of degraded or aliased low resolution images of the same scene. Since the low resolution images may be unaligned, a registration step is required before super resolution reconstruction. Therefore, we first explore in-band (i.e. in the wavelet-domain) image registration; then, investigate super resolution. Our motivation for analyzing the image registration and super resolution problems in the wavelet domain is the growing trend in wavelet-encoded imaging, and wavelet-encoding for image/video compression. Due to drawbacks of widely used discrete cosine transform in image and video compression, a considerable amount of literature is devoted to wavelet-based methods. However, since wavelets are shift-variant, existing methods cannot utilize wavelet subbands efficiently. In order to overcome this drawback, we establish and explore the direct relationship between the subbands under a translational shift, for image registration and super resolution. We then employ our devised in-band methodology, in a motion compensated video compression framework, to demonstrate the effective usage of wavelet subbands. Super resolution can also be used as a post-processing step in video compression in order to decrease the size of the video files to be compressed, with downsampling added as a pre-processing step. Therefore, we present a video compression scheme that utilizes super resolution to reconstruct the high frequency information lost during downsampling. In addition, super resolution is a crucial post-processing step for satellite imagery, due to the fact that it is hard to update imaging devices after a satellite is launched. Thus, we also demonstrate the usage of our devised methods in enhancing resolution of pansharpened multispectral images.
- Date Issued
- 2017
- Identifier
- CFE0006854, ucf:51744
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006854
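A small demonstration of the shift-variance of the discrete wavelet transform that motivates the in-band approach above, using PyWavelets; the wavelet ('db2'), signal, and boundary mode are arbitrary choices.

```python
import numpy as np
import pywt  # PyWavelets

rng = np.random.default_rng(17)

# Illustration of the shift-variance the abstract refers to: the DWT of a
# circularly shifted signal is not simply a shifted copy of the original DWT.
x = rng.normal(size=64)
x_shift = np.roll(x, 1)

cA, cD = pywt.dwt(x, 'db2', mode='periodization')
cA_s, cD_s = pywt.dwt(x_shift, 'db2', mode='periodization')

# Compare the shifted signal's detail coefficients with every circular shift
# of the original detail coefficients: none matches exactly.
mismatches = [np.linalg.norm(cD_s - np.roll(cD, k)) for k in range(cD.size)]
print(f"smallest mismatch over all integer shifts of cD: {min(mismatches):.3f}")
print("(> 0, i.e. the wavelet transform is shift-variant)")
```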
- Title
- Nonparametric and Empirical Bayes Estimation Methods.
- Creator
-
Benhaddou, Rida, Pensky, Marianna, Han, Deguang, Swanson, Jason, Ni, Liqiang, University of Central Florida
- Abstract / Description
-
In the present dissertation, we investigate two different nonparametric models: the empirical Bayes model and the functional deconvolution model. In the case of the nonparametric empirical Bayes estimation, we carried out a complete minimax study. In particular, we derive minimax lower bounds for the risk of the nonparametric empirical Bayes estimator for a general conditional distribution. This result has never been obtained previously. In order to attain optimal convergence rates, we use a wavelet series based empirical Bayes estimator constructed in Pensky and Alotaibi (2005). We propose an adaptive version of this estimator using Lepski's method and show that the estimator attains optimal convergence rates. The theory is supplemented by numerous examples. Our study of the functional deconvolution model expands results of Pensky and Sapatinas (2009, 2010, 2011) to the case of estimating an $(r+1)$-dimensional function or dependent errors. In both cases, we derive minimax lower bounds for the integrated square risk over a wide set of Besov balls and construct adaptive wavelet estimators that attain those optimal convergence rates. In particular, in the case of estimating a periodic $(r+1)$-dimensional function, we show that by choosing Besov balls of mixed smoothness, we can avoid the "curse of dimensionality" and, hence, obtain higher than usual convergence rates when $r$ is large. The study of deconvolution of a multivariate function is motivated by seismic inversion which can be reduced to solution of noisy two-dimensional convolution equations that allow one to draw inference on underground layer structures along the chosen profiles. The common practice in seismology is to recover layer structures separately for each profile and then to combine the derived estimates into a two-dimensional function. By studying the two-dimensional version of the model, we demonstrate that this strategy usually leads to estimators which are less accurate than the ones obtained as two-dimensional functional deconvolutions. Finally, we consider a multichannel deconvolution model with long-range dependent Gaussian errors. We do not limit our consideration to a specific type of long-range dependence, rather we assume that the eigenvalues of the covariance matrix of the errors are bounded above and below. We show that convergence rates of the estimators depend on a balance between the smoothness parameters of the response function, the smoothness of the blurring function, the long memory parameters of the errors, and how the total number of observations is distributed among the channels.
- Date Issued
- 2013
- Identifier
- CFE0004814, ucf:49737
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004814
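As a much simpler stand-in for the wavelet-based functional deconvolution estimators studied above, the sketch below performs periodic Fourier deconvolution with a spectral cutoff on a synthetic convolution equation; the kernel, cutoff, and noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(14)

n = 256
t = np.arange(n) / n
f = np.sin(2 * np.pi * t) + 0.5 * np.cos(6 * np.pi * t)       # unknown function
g = np.exp(-np.linspace(0, 5, n)); g /= g.sum()                # known blurring kernel

# Periodic convolution observed with noise: y = f * g + noise.
y = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g))) + 0.01 * rng.normal(size=n)

# Naive Fourier deconvolution regularized by a spectral cutoff: keep only
# frequencies where the kernel is not too small (a simple stand-in for the
# wavelet estimators studied in the dissertation).
F_y, F_g = np.fft.fft(y), np.fft.fft(g)
keep = np.abs(F_g) > 0.05
F_hat = np.where(keep, F_y / np.where(keep, F_g, 1.0), 0.0)
f_hat = np.real(np.fft.ifft(F_hat))

print(f"relative L2 error of the deconvolution: "
      f"{np.linalg.norm(f_hat - f) / np.linalg.norm(f):.3f}")
```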
- Title
- Accelerated Life Model with Various Types of Censored Data.
- Creator
-
Pridemore, Kathryn, Pensky, Marianna, Mikusinski, Piotr, Swanson, Jason, Nickerson, David, University of Central Florida
- Abstract / Description
-
The Accelerated Life Model is one of the most commonly used tools in the analysis of survival data which are frequently encountered in medical research and reliability studies. In these types of studies we often deal with complicated data sets for which we cannot observe the complete data set in practical situations due to censoring. Such difficulties are particularly apparent by the fact that there is little work in statistical literature on the Accelerated Life Model for complicated types of censored data sets, such as doubly censored data, interval censored data, and partly interval censored data. In this work, we use the Weighted Empirical Likelihood approach (Ren, 2001) to construct tests, confidence intervals, and goodness-of-fit tests for the Accelerated Life Model in a unified way for various types of censored data. We also provide algorithms for implementation and present relevant simulation results. I began working on this problem with Dr. Jian-Jian Ren. Upon Dr. Ren's departure from the University of Central Florida I completed this dissertation under the supervision of Dr. Marianna Pensky.
- Date Issued
- 2013
- Identifier
- CFE0004913, ucf:49613
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004913
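A toy illustration of the accelerated life (log-linear AFT) model and of why censoring cannot simply be ignored; the naive least squares fits below are for contrast only and are not the weighted empirical likelihood procedure developed in the dissertation.

```python
import numpy as np

rng = np.random.default_rng(15)

# Accelerated life (log-linear AFT) model: log T = b0 + b1 * x + sigma * W.
n, b0, b1, sigma = 2000, 1.0, -0.8, 0.5
x = rng.uniform(0, 2, size=n)
logT = b0 + b1 * x + sigma * rng.normal(size=n)
C = np.log(rng.exponential(scale=8.0, size=n))       # right-censoring times (log scale)
obs = np.minimum(logT, C)
event = logT <= C

# Naive least squares that ignores censoring is biased toward shorter lifetimes;
# this is the kind of complication the weighted empirical likelihood approach
# in the dissertation is designed to handle (sketch only, not that method).
X = np.column_stack([np.ones(n), x])
naive = np.linalg.lstsq(X, obs, rcond=None)[0]
uncensored_only = np.linalg.lstsq(X[event], obs[event], rcond=None)[0]

print(f"true (b0, b1)            : ({b0:.2f}, {b1:.2f})")
print(f"naive fit on min(T, C)   : ({naive[0]:.2f}, {naive[1]:.2f})")
print(f"fit on uncensored subset : ({uncensored_only[0]:.2f}, {uncensored_only[1]:.2f})")
print(f"censoring rate: {1 - event.mean():.2f}")
```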
- Title
- Functional Data Analysis and its application to cancer data.
- Creator
-
Martinenko, Evgeny, Pensky, Marianna, Tamasan, Alexandru, Swanson, Jason, Richardson, Gary, University of Central Florida
- Abstract / Description
-
The objective of the current work is to develop novel procedures for the analysis of functional data and apply them to the investigation of gender disparity in survival of lung cancer patients. In particular, we use the time-dependent Cox proportional hazards model where the clinical information is incorporated via time-independent covariates, and the current age is modeled using its expansion over wavelet basis functions. We developed computer algorithms and applied them to a data set derived from the Florida Cancer Data depository (all personal information which allows patients to be identified was eliminated). We also studied the problem of estimation of a continuous matrix-variate function of low rank. We have constructed an estimator of such a function using its basis expansion and subsequent solution of an optimization problem with the Schatten-norm penalty. We derive an oracle inequality for the constructed estimator, study its properties via simulations, and apply the procedure to the analysis of Dynamic Contrast medical imaging data.
- Date Issued
- 2014
- Identifier
- CFE0005377, ucf:50447
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0005377
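The Schatten-norm-penalized low-rank estimation mentioned above has a well-known building block: the nuclear-norm proximal step, i.e. singular value soft-thresholding. A minimal sketch on a noisy low-rank matrix follows; the threshold and sizes are arbitrary, and this is not the dissertation's full estimator.

```python
import numpy as np

def svd_soft_threshold(Y, lam):
    """Proximal operator of the nuclear (Schatten-1) norm:
    argmin_X 0.5*||X - Y||_F^2 + lam*||X||_* via singular value soft-thresholding."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_thr = np.maximum(s - lam, 0.0)
    return U @ np.diag(s_thr) @ Vt, s_thr

rng = np.random.default_rng(7)
m, n, r = 40, 30, 3
M = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))   # true low-rank matrix
Y = M + 0.5 * rng.normal(size=(m, n))                   # noisy observation

X_hat, s_thr = svd_soft_threshold(Y, lam=8.0)
print(f"estimated rank: {np.sum(s_thr > 0)}")
print(f"relative error: {np.linalg.norm(X_hat - M) / np.linalg.norm(M):.3f}")
```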
- Title
- Creation and Application of Routines for Determining Physical Properties of Asteroids and Exoplanets from Low Signal-To-Noise Data Sets.
- Creator
-
Lust, Nathaniel, Britt, Daniel, Fernandez, Yan, Pensky, Marianna, Harris, Alan, University of Central Florida
- Abstract / Description
-
Astronomy is a data-heavy field driven by observations of remote sources reflecting or emitting light. These signals are transient in nature, which makes it very important to fully utilize every observation. This, however, is often difficult due to the faintness of these observations, which are often only slightly above the level of observational noise. We present new or adapted methodologies for dealing with these low signal-to-noise scenarios, along with practical examples including determining exoplanet physical properties, periodicities in asteroids, and the rotational and orbital properties of the multiple asteroid system 2577 Litva.
- Date Issued
- 2014
- Identifier
- CFE0005523, ucf:50307
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0005523
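One standard tool for pulling periodicities out of sparse, noisy, unevenly sampled light curves is the Lomb-Scargle periodogram; the sketch below applies scipy.signal.lombscargle to synthetic data and is only a generic illustration, not the routines developed in this dissertation.

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(8)

# Unevenly sampled, noisy light curve with a 5.2-hour rotation period (synthetic).
period_true = 5.2
t = np.sort(rng.uniform(0.0, 60.0, size=120))            # observation times (hours)
flux = 0.3 * np.sin(2 * np.pi * t / period_true) + rng.normal(scale=0.3, size=t.size)

# Lomb-Scargle periodogram over a grid of trial periods (angular frequencies).
periods = np.linspace(2.0, 12.0, 4000)
omega = 2 * np.pi / periods
power = lombscargle(t, flux - flux.mean(), omega)

print(f"true period: {period_true:.2f} h, "
      f"periodogram peak: {periods[np.argmax(power)]:.2f} h")
```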
- Title
- Robust, Scalable, and Provable Approaches to High Dimensional Unsupervised Learning.
- Creator
-
Rahmani, Mostafa, Atia, George, Vosoughi, Azadeh, Mikhael, Wasfy, Nashed, M, Pensky, Marianna, University of Central Florida
- Abstract / Description
-
This doctoral thesis focuses on three popular unsupervised learning problems: subspace clustering, robust PCA, and column sampling. For the subspace clustering problem, a new transformative idea is presented. The proposed approach, termed Innovation Pursuit, is a new geometrical solution to the subspace clustering problem whereby subspaces are identified based on their relative novelties. A detailed mathematical analysis is provided establishing sufficient conditions for the proposed method to correctly cluster the data points. The numerical simulations with both real and synthetic data demonstrate that Innovation Pursuit notably outperforms the state-of-the-art subspace clustering algorithms. For the robust PCA problem, we focus on both the outlier detection and the matrix decomposition problems. For the outlier detection problem, we present a new algorithm, termed Coherence Pursuit, in addition to two scalable randomized frameworks for the implementation of outlier detection algorithms. The Coherence Pursuit method is the first provable and non-iterative robust PCA method which is provably robust to both unstructured and structured outliers. Coherence Pursuit is remarkably simple and it notably outperforms the existing methods in dealing with structured outliers. In the proposed randomized designs, we leverage the low dimensional structure of the low rank component to apply the robust PCA algorithm to a random sketch of the data as opposed to the full scale data. Importantly, it is analytically shown that the presented randomized designs can make the computation or sample complexity of the low rank matrix recovery algorithm independent of the size of the data. At the end, we focus on the column sampling problem. A new sampling tool, dubbed Spatial Random Sampling, is presented which performs the random sampling in the spatial domain. The most compelling feature of Spatial Random Sampling is that it is the first unsupervised column sampling method which preserves the spatial distribution of the data.
- Date Issued
- 2018
- Identifier
- CFE0007083, ucf:52010
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007083
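A toy version of coherence-based outlier scoring in the spirit of the Coherence Pursuit idea described above: points lying in a common low-dimensional subspace are mutually coherent, outliers are not. This is an illustrative simplification, not the paper's exact algorithm or its guarantees.

```python
import numpy as np

rng = np.random.default_rng(9)

# 90 inliers in a random 3-dimensional subspace of R^50, plus 10 random outliers.
d, r, n_in, n_out = 50, 3, 90, 10
U = np.linalg.qr(rng.normal(size=(d, r)))[0]
inliers = U @ rng.normal(size=(r, n_in))
outliers = rng.normal(size=(d, n_out))
X = np.hstack([inliers, outliers])

# Normalize columns, then score each point by the squared norm of its
# correlations with all other points (coherence-style score; illustrative).
Xn = X / np.linalg.norm(X, axis=0, keepdims=True)
G = Xn.T @ Xn
np.fill_diagonal(G, 0.0)
scores = np.sum(G ** 2, axis=1)

flagged = np.argsort(scores)[:n_out]       # lowest coherence -> likely outliers
print("indices flagged as outliers:", np.sort(flagged))
print("true outlier indices:       ", np.arange(n_in, n_in + n_out))
```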
- Title
- Estimation and clustering in statistical ill-posed linear inverse problems.
- Creator
-
Rajapakshage, Rasika, Pensky, Marianna, Swanson, Jason, Zhang, Teng, Bagci, Ulas, Foroosh, Hassan, University of Central Florida
- Abstract / Description
-
The main focus of the dissertation is estimation and clustering in statistical ill-posed linear inverse problems. The dissertation deals with a problem of simultaneously estimating a collection of solutions of ill-posed linear inverse problems from their noisy images under an operator that does not have a bounded inverse, when the solutions are related in a certain way. The dissertation defense consists of three parts. In the first part, the collection consists of measurements of temporal functions at various spatial locations. In particular, we study the problem of estimating a three-dimensional function based on observations of its noisy Laplace convolution. In the second part, we recover classes of similar curves when the class memberships are unknown. Problems of this kind appear in many areas of application where clustering is carried out at the pre-processing step and then the inverse problem is solved for each of the cluster averages separately. As a result, the errors of the procedures are usually examined for the estimation step only. In both parts, we construct the estimators, study their minimax optimality and evaluate their performance via a limited simulation study. In the third part, we propose a new computational platform to better understand the patterns of R-fMRI by taking into account the challenge of inevitable signal fluctuations and interpret the success of dynamic functional connectivity approaches. Towards this, we revisit an auto-regressive and vector auto-regressive signal modeling approach for estimating temporal changes of the signal in brain regions. We then generate inverse covariance matrices from the generated windows and use a non-parametric statistical approach to select significant features. Finally, we use Lasso to perform classification of the data. The effectiveness of the proposed method is evidenced in the classification of R-fMRI scans.
- Date Issued
- 2019
- Identifier
- CFE0007710, ucf:52450
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007710
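A hedged sketch of the windowed connectivity pipeline outlined above: sliding windows over a synthetic multi-region signal, a sparse inverse covariance per window via scikit-learn's GraphicalLasso, and the off-diagonal entries collected as candidate features. The data, window length, and penalty are assumptions.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(16)

# Synthetic multi-region signal: 6 "regions", 200 time points, with regions 0 and 1
# correlated (a crude stand-in for an R-fMRI scan; not real data).
T, p = 200, 6
Z = rng.normal(size=(T, p))
Z[:, 1] = 0.7 * Z[:, 0] + 0.3 * Z[:, 1]

# Sliding windows -> sparse inverse covariance (precision) per window; the
# upper-triangular entries can then serve as features for classification,
# in the spirit of the pipeline sketched in the abstract.
window, stride = 50, 25
features = []
for start in range(0, T - window + 1, stride):
    W = Z[start:start + window]
    prec = GraphicalLasso(alpha=0.1).fit(W).precision_
    features.append(prec[np.triu_indices(p, k=1)])
features = np.array(features)

print("feature matrix shape (windows x edge features):", features.shape)
print("mean |precision| on the coupled edge (0,1):",
      np.round(np.mean(np.abs(features[:, 0])), 3))
```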
- Title
- Bayesian Model Selection for Classification with Possibly Large Number of Groups.
- Creator
-
Davis, Justin, Pensky, Marianna, Swanson, Jason, Richardson, Gary, Crampton, William, Ni, Liqiang, University of Central Florida
- Abstract / Description
-
The purpose of the present dissertation is to study model selection techniques which are specifically designed for classification of high-dimensional data with a large number of classes. To the best of our knowledge, this problem has never been studied in depth previously. We assume that the number of components p is much larger than the number of samples n, and that only few of those p components are useful for subsequent classification. In what follows, we introduce two Bayesian models which use two different approaches to the problem: one which discards components which have "almost constant" values (Model 1) and another which retains the components for which between-group variations are larger than within-group variation (Model 2). We show that particular cases of the above two models recover familiar variance or ANOVA-based component selection. When one has only two classes and features are a priori independent, Model 2 reduces to the Feature Annealed Independence Rule (FAIR) introduced by Fan and Fan (2008) and can be viewed as a natural generalization to the case of L > 2 classes. A nontrivial result of the dissertation is that the precision of feature selection using Model 2 improves when the number of classes grows. Subsequently, we examine the rate of misclassification with and without feature selection on the basis of Model 2.
- Date Issued
- 2011
- Identifier
- CFE0004097, ucf:49091
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004097
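The variance/ANOVA-based selection that the abstract says is recovered as a special case can be sketched as a per-feature F-type score (between-group over within-group variation); the synthetic data below are illustrative, and this is not Model 1 or Model 2 themselves.

```python
import numpy as np

rng = np.random.default_rng(10)

p, L, n_per = 1000, 4, 10                      # p features, L classes, n_per samples each
labels = np.repeat(np.arange(L), n_per)
X = rng.normal(size=(L * n_per, p))
informative = np.arange(20)                    # first 20 features carry class signal
X[:, informative] += labels[:, None] * 1.5

# Per-feature F-type score: between-group variation / within-group variation.
grand = X.mean(axis=0)
between = np.zeros(p)
within = np.zeros(p)
for g in range(L):
    Xg = X[labels == g]
    between += Xg.shape[0] * (Xg.mean(axis=0) - grand) ** 2
    within += ((Xg - Xg.mean(axis=0)) ** 2).sum(axis=0)
score = (between / (L - 1)) / (within / (L * n_per - L))

top = np.argsort(score)[::-1][:20]
print("fraction of true informative features in the top 20:",
      np.mean(np.isin(top, informative)))
```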
- Title
- Data-driven Predictive Analytics For Distributed Smart Grid Control: Optimization of Energy Storage, Voltage and Demand Response.
- Creator
-
Valizadehhaghi, Hamed, Qu, Zhihua, Behal, Aman, Atia, George, Turgut, Damla, Pensky, Marianna, University of Central Florida
- Abstract / Description
-
The smart grid is expected to support an interconnected network of self-contained microgrids. Nonetheless, the distributed integration of renewable generation and demand response adds complexity to the control and optimization of smart grid. Forecasts are essential due to the existence of stochastic variations and uncertainty. Forecasting data are spatio-temporal which means that the data correspond to regular intervals, say every hour, and the analysis has to take account of spatial dependence among the distributed generators or locations. Hence, smart grid operations must take account of, and in fact benefit from the temporal dependence as well as the spatial dependence. This is particularly important considering the buffering effect of energy storage devices such as batteries, heating/cooling systems and electric vehicles. The data infrastructure of smart grid is the key to address these challenges, however, how to utilize stochastic modeling and forecasting tools for optimal and reliable planning, operation and control of smart grid remains an open issue. Utilities are seeking to become more proactive in decision-making, adjusting their strategies based on realistic predictive views into the future, thus allowing them to side-step problems and capitalize on the smart grid technologies, such as energy storage, that are now being deployed at scale. Predictive analytics, capable of managing intermittent loads, renewables, rapidly changing weather patterns and other grid conditions, represent the ultimate goal for smart grid capabilities. Within this framework, this dissertation develops high-performance analytics, such as predictive analytics, and ways of employing analytics to improve distributed and cooperative optimization software which proves to be the most significant value-add in the smart grid age, as new network management technologies prove reliable and fundamental. Proposed optimization and control approaches for active and reactive power control are robust to variations and offer a certain level of optimality by combining real-time control with hours-ahead network operation schemes. The main objective is managing spatial and temporal availability of the energy resources in different look-ahead time horizons. Stochastic distributed optimization is realized by integrating a distributed sub-gradient method with conditional ensemble predictions of the energy storage capacity and distributed generation. Hence, the obtained solutions can reflect on the system requirements for the upcoming times along with the instantaneous cooperation between distributed resources. As an important issue for smart grid, the conditional ensembles are studied for capturing wind, photovoltaic, and vehicle-to-grid availability variations. The following objectives are pursued:
- Spatio-temporal adaptive modeling of data including electricity demand, electric vehicles and renewable energy (wind and solar power)
- Predictive data analytics and forecasting
- Distributed control
- Integration of energy storage systems
Full distributional characterization and spatio-temporal modeling of data ensembles are utilized in order to retain the conditional and temporal interdependence between projection data and available capacity. Then, by imposing measures of the most likely ensembles, the distributed control method is carried out for cooperative optimization of the renewable generation and energy storage within the smart grid.
- Date Issued
- 2016
- Identifier
- CFE0006408, ucf:51481
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006408
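A toy distributed (sub)gradient consensus loop of the kind referenced above: agents mix their estimates through a doubly stochastic matrix and take local gradient steps with a diminishing step size. The quadratic local costs, ring topology, and step schedule are assumptions; the dissertation's forecast-conditioned stochastic scheme is not reproduced.

```python
import numpy as np

# Each of 4 agents holds a local cost f_i(x) = 0.5 * (x - a_i)^2; the network
# minimizes sum_i f_i(x), whose minimizer is the mean of the a_i.
a = np.array([1.0, 3.0, 4.0, 8.0])
n_agents = a.size

# Symmetric, doubly stochastic mixing matrix for a 4-node ring network.
W = np.array([[0.5, 0.25, 0.0, 0.25],
              [0.25, 0.5, 0.25, 0.0],
              [0.0, 0.25, 0.5, 0.25],
              [0.25, 0.0, 0.25, 0.5]])

x = np.zeros(n_agents)               # each agent's current estimate
for k in range(1, 2001):
    grads = x - a                    # local gradients at local estimates
    step = 0.5 / np.sqrt(k)          # diminishing step size
    x = W @ x - step * grads         # mix with neighbors, then local gradient step

print("agents' estimates:", np.round(x, 3))
print("true minimizer   :", a.mean())
```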
- Title
- Compressive Sensing and Recovery of Structured Sparse Signals.
- Creator
-
Shahrasbi, Behzad, Rahnavard, Nazanin, Vosoughi, Azadeh, Wei, Lei, Atia, George, Pensky, Marianna, University of Central Florida
- Abstract / Description
-
In recent years, numerous disciplines including telecommunications, medical imaging, computational biology, and neuroscience have benefited from increasing applications of high dimensional datasets. This calls for efficient ways of data capturing and data processing. Compressive sensing (CS), which is introduced as an efficient sampling (data capturing) method, is addressing this need. It is well-known that signals which belong to an ambient high-dimensional space have much smaller dimensionality in an appropriate domain. CS taps into this principle and dramatically reduces the number of samples that is required to be captured to avoid any distortion in the information content of the data. This reduction in the required number of samples enables many new applications that were previously infeasible using classical sampling techniques. Most CS-based approaches take advantage of the inherent low-dimensionality in many datasets. They try to determine a sparse representation of the data, in an appropriately chosen basis, using only a few significant elements. These approaches make no extra assumptions regarding possible relationships among the significant elements of that basis. In this dissertation, different ways of incorporating the knowledge about such relationships are integrated into the data sampling and the processing schemes. We first consider the recovery of temporally correlated sparse signals and show that, by using the time correlation model, the recovery performance can be significantly improved. Next, we modify the sampling process of sparse signals to incorporate the signal structure in a more efficient way. In the image processing application, we show that exploiting the structure information in both signal sampling and signal recovery improves the efficiency of the algorithm. In addition, we show that region-of-interest information can be included in the CS sampling and recovery steps to provide a much better quality for the region-of-interest area compared to the rest of the image or video. In spectrum sensing applications, CS can dramatically improve the sensing efficiency by facilitating the coordination among spectrum sensors. A cluster-based spectrum sensing scheme with coordination among spectrum sensors is proposed for geographically dispersed cognitive radio networks. Further, CS has been exploited in this problem for simultaneous sensing and localization. Having access to this information dramatically facilitates the implementation of advanced communication technologies as required by 5G communication networks.
- Date Issued
- 2015
- Identifier
- CFE0006392, ucf:51509
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006392
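A minimal compressive sensing example matching the generic setup described above: a k-sparse signal, a random Gaussian measurement matrix with far fewer rows than columns, and recovery by orthogonal matching pursuit from scikit-learn; no temporal or structural correlation model is included.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(12)

n, m, k = 256, 80, 8                      # signal length, measurements, sparsity
x = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x[support] = rng.normal(size=k)

A = rng.normal(size=(m, n)) / np.sqrt(m)  # random Gaussian sensing matrix
y = A @ x                                 # m << n compressive measurements

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False).fit(A, y)
x_hat = omp.coef_

print(f"support recovered exactly: {set(np.flatnonzero(x_hat)) == set(support)}")
print(f"relative error: {np.linalg.norm(x_hat - x) / np.linalg.norm(x):.2e}")
```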