- Title
- Bayesian Model Selection for Classification with Possibly Large Number of Groups.
- Creator
-
Davis, Justin, Pensky, Marianna, Swanson, Jason, Richardson, Gary, Crampton, William, Ni, Liqiang, University of Central Florida
- Abstract / Description
-
The purpose of the present dissertation is to study model selection techniques which are specifically designed for classification of high-dimensional data with a large number of classes. To the best of our knowledge, this problem has never been studied in depth previously. We assume that the number of components p is much larger than the number of samples n, and that only a few of those p components are useful for subsequent classification. In what follows, we introduce two Bayesian models which use two different approaches to the problem: one which discards components which have "almost constant" values (Model 1) and another which retains the components for which between-group variation is larger than within-group variation (Model 2). We show that particular cases of the above two models recover familiar variance or ANOVA-based component selection. When one has only two classes and features are a priori independent, Model 2 reduces to the Feature Annealed Independence Rule (FAIR) introduced by Fan and Fan (2008) and can be viewed as a natural generalization to the case of L > 2 classes. A nontrivial result of the dissertation is that the precision of feature selection using Model 2 improves when the number of classes grows. Subsequently, we examine the rate of misclassification with and without feature selection on the basis of Model 2.
- Date Issued
- 2011
- Identifier
- CFE0004097, ucf:49091
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004097
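The abstract above notes that particular cases of its models recover familiar ANOVA-based component selection. As a rough illustration of that classical idea only (not the dissertation's Bayesian models), the sketch below ranks features by the one-way ANOVA F statistic, the ratio of between-group to within-group variation. The function names `anova_f` and `top_features` and the toy data are assumptions made for this example.

```python
from collections import defaultdict


def anova_f(values, labels):
    """One-way ANOVA F statistic for one feature (assumes >= 2 groups, n > groups)."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    n = len(values)
    num_groups = len(groups)
    grand_mean = sum(values) / n

    # Between-group sum of squares: spread of the group means around the grand mean.
    between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2 for vs in groups.values()
    )
    # Within-group sum of squares: spread of the samples around their own group mean.
    within = sum(
        (v - sum(vs) / len(vs)) ** 2 for vs in groups.values() for v in vs
    )

    ms_between = between / (num_groups - 1)
    ms_within = within / (n - num_groups)
    if ms_within == 0:
        return float("inf") if ms_between > 0 else 0.0
    return ms_between / ms_within


def top_features(rows, labels, k):
    """Return the indices of the k features with the largest F statistic."""
    num_features = len(rows[0])
    scores = [
        anova_f([row[j] for row in rows], labels) for j in range(num_features)
    ]
    ranked = sorted(range(num_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data (hypothetical): feature 0 separates the three classes, feature 1 does not.
rows = [[0.0, 1.0], [0.1, 2.0], [5.0, 1.5], [5.1, 1.4], [10.0, 2.1], [10.2, 0.9]]
labels = [0, 0, 1, 1, 2, 2]
print(top_features(rows, labels, k=1))
```

In the p much larger than n setting the abstract describes, a per-feature score like this is cheap to compute, which is one reason independence-style screening rules are a common baseline.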
- Title
- Model Selection via Racing.
- Creator
-
Zhang, Tiantian, Georgiopoulos, Michael, Anagnostopoulos, Georgios, Wu, Annie, Hu, Haiyan, Nickerson, David, University of Central Florida
- Abstract / Description
-
Model Selection (MS) is an important aspect of machine learning, as necessitated by the No Free Lunch theorem. Briefly speaking, the task of MS is to identify a subset of models that are optimal in terms of pre-selected optimization criteria. There are many practical applications of MS, such as model parameter tuning, personalized recommendations, A/B testing, etc. Lately, some MS research has focused on trading off exactness of the optimization against a reduction in the computational burden entailed. Recent attempts along this line include metaheuristic optimization, local search-based approaches, sequential model-based methods, portfolio algorithm approaches, and multi-armed bandits. Racing Algorithms (RAs) are an active research area in MS; they lower the computational cost in exchange for a reduced, but acceptable, likelihood that the models returned are indeed optimal among the given ensemble of models. All existing RAs in the literature are designed as Single-Objective Racing Algorithms (SORAs) for Single-Objective Model Selection (SOMS), where a single optimization criterion is considered for measuring the goodness of models. Moreover, they are offline algorithms in which MS occurs before model deployment and the selected models are optimal in terms of their overall average performances on a validation set of problem instances. This work aims to investigate racing approaches along two distinct directions: Extreme Model Selection (EMS) and Multi-Objective Model Selection (MOMS). In EMS, given a problem instance and a limited computational budget shared among all the candidate models, one is interested in maximizing the final solution quality. In such a setting, MS occurs during model comparison in terms of maximum performance and involves no model validation. EMS is a natural framework for many applications. However, EMS problems remain unaddressed by current racing approaches.
In this work, the first RA for EMS, named Max-Race, is developed, so that it optimizes the extreme solution quality by automatically allocating the computational resources among an ensemble of problem solvers for a given problem instance. In Max-Race, significant difference between the extreme performances of any pair of models is statistically inferred via a parametric hypothesis test under the Generalized Pareto Distribution (GPD) assumption. Experimental results have confirmed that Max-Race is capable of identifying the best extreme model with high accuracy and low computational cost. Furthermore, in machine learning, as well as in many real-world applications, a variety of MS problems are multi-objective in nature. MS which simultaneously considers multiple optimization criteria is referred to as MOMS. Under this scheme, a set of Pareto optimal models is sought that reflect a variety of compromises between optimization objectives. So far, MOMS problems have received little attention in the relevant literature. Therefore, this work also develops the first Multi-Objective Racing Algorithm (MORA) for a fixed-budget setting, namely S-Race. S-Race addresses MOMS in the proper sense of Pareto optimality. Its key decision mechanism is the non-parametric sign test, which is employed for inferring pairwise dominance relationships. Moreover, S-Race is able to strictly control the overall probability of falsely eliminating any non-dominated models at a user-specified significance level. Additionally, SPRINT-Race, the first MORA for a fixed-confidence setting, is also developed. In SPRINT-Race, pairwise dominance and non-dominance relationships are established via the Sequential Probability Ratio Test with an Indifference zone. Moreover, the overall probability of falsely eliminating any non-dominated models or mistakenly retaining any dominated models is controlled at a prescribed significance level. 
Extensive experimental analysis has demonstrated the efficiency and advantages of both S-Race and SPRINT-Race in MOMS.
- Date Issued
- 2016
- Identifier
- CFE0006203, ucf:51094
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006203
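The abstract above says that S-Race infers pairwise dominance relationships with a non-parametric sign test. The sketch below shows one plausible reading of that single step, assuming lower values are better in every objective and that each model is scored on the same problem instances. It does not reproduce S-Race's control of the overall elimination error across all model pairs, and the names `dominates`, `sign_test_p_value`, and `is_dominated` are hypothetical.

```python
from math import comb


def dominates(a, b):
    """True if vector a Pareto-dominates b (assumes lower is better everywhere)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def sign_test_p_value(wins, losses):
    """One-sided p-value P(X >= wins) for X ~ Binomial(wins + losses, 0.5)."""
    n = wins + losses
    if n == 0:
        return 1.0  # no informative instances, so no evidence either way
    return sum(comb(n, i) for i in range(wins, n + 1)) / 2 ** n


def is_dominated(perf_a, perf_b, alpha=0.05):
    """Decide whether model B looks dominated by model A at level alpha.

    perf_a and perf_b hold one objective vector per problem instance.
    Instances where neither vector dominates the other are ignored, as
    ties are in the classical sign test.
    """
    wins = sum(dominates(a, b) for a, b in zip(perf_a, perf_b))
    losses = sum(dominates(b, a) for a, b in zip(perf_a, perf_b))
    return sign_test_p_value(wins, losses) < alpha


# Toy scores (hypothetical): model A beats model B on all six instances.
perf_a = [(1.0, 1.0)] * 6
perf_b = [(2.0, 2.0)] * 6
print(is_dominated(perf_a, perf_b))  # A dominates B on every instance
print(is_dominated(perf_b, perf_a))  # B never dominates A
```

In a racing loop, a model flagged as dominated would be dropped before the next batch of problem instances, so the remaining budget goes to the surviving candidates.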
- Title
- Business in the Estuary, Party in the Sea: Migration Patterns of Striped Mullet (Mugil cephalus) Within the Indian River Lagoon Complex.
- Creator
-
Myers, Olivia, Cook, Geoffrey, Mansfield, Kate, Reyier, Eric, University of Central Florida
- Abstract / Description
-
Commercial and recreational environmental enterprises in the Indian River Lagoon (IRL), Florida supply nearly 10,000 jobs and produce $1.6 billion a year in revenue. These waters contain iconic species of sportfish, including red drum, snook, and sea trout, as well as their lower trophic level prey such as snapper and mullet. Striped mullet (Mugil cephalus) are both commercially valuable and an indicator species for overall ecosystem health. From September to December, mullet in the IRL undergo an annual migration from their inshore foraging habitats to oceanic spawning sites. However, their actual migratory pathways remain unknown. To address this knowledge gap, I utilized passive acoustic telemetry to assess the migration patterns of M. cephalus within the IRL complex, particularly focusing on movement pathways from inshore aggregation sites to oceanic inlets to spawn. Coupling environmental metrics with movement data, I evaluated catalysts for migration as well as travel routes through the estuary. Network analyses identified potential conservation areas of interest and sites needing management intervention. Impoundments around the Merritt Island National Wildlife Refuge appear to serve as an important refuge area for striped mullet while the Banana and Indian Rivers act as corridors during their inshore migratory movements. The environmental metrics of depth, temperature, dissolved oxygen, pH, barometric pressure, and photoperiod were the best predictors for the number of detections and residency time produced by two case studies of striped mullet activity. An emphasis on spatial fisheries management along with vigilant environmental monitoring will ensure the status of this species, to the benefit of both natural and human systems in the Indian River Lagoon. The knowledge generated as a result of this project may also provide a framework for sustainably managing other migratory baitfish.
- Date Issued
- 2019
- Identifier
- CFE0007895, ucf:52768
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007895
- Title
- INITIAL VALIDATION OF NOVEL PERFORMANCE-BASED MEASURES: MENTAL ROTATION AND PSYCHOMOTOR ABILITY.
- Creator
-
Fatolitis, Philip, Jentsch, Florian, University of Central Florida
- Abstract / Description
-
Given the high-risk nature of military flight operations and the significant resources required to train U.S. Naval Aviation personnel, continual improvement is required in the selection process. In addition to general commissioning requirements and aeromedical standards, the U.S. Navy utilizes the Aviation Selection Test Battery (ASTB) to select commissioned aviation students. Although the ASTB has been a good predictor of aviation student performance in training, it was proposed that incremental improvement could be gained with the introduction of novel, computer-administered performance-based measures: Block Rotation (BRT) and a Navy-developed Compensatory Tracking task. This work constituted an initial validation of the BRT, an interactive virtual analog of Shepard and Metzler's (1971) Mental Rotation task that was developed with the intention of quantifying mental rotation and psychomotor ability. For Compensatory Tracking, this work sought to determine if data gathered concord with results in extant literature, confirming the validity of the task. Data from the BRT were examined to determine task reliability and to formulate relevant quantitative/predictive human performance models. Results showed that BRT performance is a valid spatial ability predictor whose output can be modeled, and that Compensatory Tracking task data concord with the psychometric properties of tracking tasks that have been previously presented in the literature.
- Date Issued
- 2008
- Identifier
- CFE0002413, ucf:47764
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0002413
- Title
- Development of a chemical kinetic model for the combustion of a synthesis gas from a fluidized-bed sewage sludge gasifier in a thermal oxidizer.
- Creator
-
Martinez, Luis, Cooper, David, Randall, Andrew, Vasu Sumathi, Subith, University of Central Florida
- Abstract / Description
-
The need for sustainability has been on the rise. Municipalities are finding ways of reducing waste, but also finding ways to reduce energy costs. Waste-to-energy is a sustainable method that may reduce bio-solids volume while also producing energy. In this research study bio-solids enter a bubbling bed gasifier and within the gasifier a synthesis gas is produced. This synthesis gas exits through the top of the gasifier and enters a thermal oxidizer for combustion. The thermal oxidizer has an innovative method of oxidizing the synthesis gas. The thermal oxidizer has two air injection sites and the possibility for aqueous ammonia injection for further NOx reduction. Most thermal oxidizers already include an oxidizer such as air in the fuel before it enters the thermal oxidizer, which makes this research and operation different from many other thermal oxidizers and waste-to-energy plants. The reduction in waste means smaller volume loads sent to a landfill. This process significantly reduces the amount of bio-solids sent to a landfill. The energy produced from the synthesis gas is beneficial for any municipality, as it may be used to run the waste-to-energy facility. The purpose of this study is to determine methods in which operators may configure future plants to reduce NOx emissions. NOx, mixed with volatile organic compounds (VOC) and sunlight, produces ozone (O3), a deadly gas at high concentrations. This study developed a model to determine the best methods to reduce NOx emissions. Results indicate that a fuel-rich then fuel-lean injection scheme results in lower NOx emissions. This is because at fuel-rich conditions not all of the ammonia in the first air ring is converted to NOx; rather, a portion of the ammonia is converted to NOx and N2. The second air ring then operates at fuel-lean conditions, which further oxidizes the remaining ammonia to NOx, but also converts a fraction of it to N2.
If NOx standards become more stringent, then aqueous ammonia injection is a recommended method for NOx reduction; this method is also known as selective non-catalytic reduction (SNCR). The findings in this study will allow operators to make better judgments in the way that they operate a thermal oxidizer with a two-air-injection scheme. The goal of the operator and the organization is to meet air quality standards, and this study aims at finding ways to reduce emissions, specifically NOx.
- Date Issued
- 2014
- Identifier
- CFE0005528, ucf:50301
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0005528
- Title
- LANDCOVER CHANGE AND POPULATION DYNAMICS OF FLORIDA SCRUB-JAYS AND FLORIDA GRASSHOPPER SPARROWS.
- Creator
-
Breininger, David, Noss, Reed, University of Central Florida
- Abstract / Description
-
I confronted empirical habitat data (1994-2004) and population data (1988-2005) with ecological theory on habitat dynamics, recruitment, survival, and dispersal to develop predictive relationships between landcover variation and population dynamics. I focus on Florida Scrub-Jays, although one chapter presents a model for the potential influence of habitat restoration on viability of the Florida Grasshopper Sparrow. Both species are unique to Florida landscapes that are dominated by shrubs and grasses and maintained by frequent fires. Both species are declining, even in protected areas, despite their protected status. I mapped habitat for both species using grid polygon cells to quantify population potential and habitat quality. A grid cell was the average territory size and the landcover unit in which habitat-specific recruitment and survival occurred. I measured habitat-specific recruitment and survival of Florida Scrub-Jays from 1988-2008. Data analyses included multistate analysis, which was developed for capture-recapture data but is useful for analyzing many ecological processes, such as habitat change. I relied on publications by other investigators for empirical Florida Grasshopper Sparrow data. The amount of potential habitat was greatly underestimated by landcover mapping not specific to Florida Scrub-Jays. Overlaying east central Florida with grid polygons was an efficient method to map potential habitat and monitor habitat quality directly related to recruitment, survival, and management needs. Most habitats for both species were degraded by anthropogenic reductions in fire frequency. Degradation occurred across large areas. Florida Scrub-Jay recruitment and survival were most influenced by shrub height states. Multistate modeling of shrub heights showed that state transitions were influenced by vegetation composition, edges, and habitat management. Measured population declines of 4% per year corroborated habitat-specific modeling predictions.
Habitat quality improved over the study period but not enough to recover precariously small populations. The degree of landcover fragmentation influenced mean Florida Scrub-Jay dispersal distances but not the number of occupied territories between natal and breeding territories. There was little exchange between populations, which were usually farther apart than mean dispersal distances. Florida Scrub-Jays bred or delayed breeding depending on age, sex, and breeding opportunities. I also show an urgent need for Florida Grasshopper Sparrow habitat restoration, given that the endangered bird has declined to only two sizeable populations and there is a high likelihood of continued large decline. A major effect of habitat fragmentation identified in this dissertation that should apply to many organisms in disturbance-prone systems is that fragmentation disrupts natural processes, reducing habitat quality across large areas. Humans have managed wildland fire for > 40,000 years, so it should be possible to manage habitat for many endangered species that make Florida's biodiversity unique. This dissertation provides methods to quantify landscape units into potential source and sink territories and provides a basis for applying adaptive management to reach population and conservation goals.
- Date Issued
- 2009
- Identifier
- CFE0002537, ucf:47638
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0002537
- Title
- DATA MINING METHODS FOR MALWARE DETECTION.
- Creator
-
Siddiqui, Muazzam, Wang, Morgan, University of Central Florida
- Abstract / Description
-
This research investigates the use of data mining methods for malware (malicious programs) detection and proposes a framework as an alternative to the traditional signature detection methods. The traditional approaches using signatures to detect malicious programs fail for new and unknown malware, where signatures are not available. We present a data mining framework to detect malicious programs. We collected, analyzed and processed several thousand malicious and clean programs to find out the best features and build models that can classify a given program into a malware or a clean class. Our research is closely related to information retrieval and classification techniques and borrows a number of ideas from the field. We used a vector space model to represent the programs in our collection. Our data mining framework includes two separate and distinct classes of experiments. The first are the supervised learning experiments that used a dataset consisting of several thousand malicious and clean program samples to train, validate and test an array of classifiers. In the second class of experiments, we proposed using sequential association analysis for feature selection and automatic signature extraction. With our experiments, we were able to achieve as high as 98.4% detection rate and as low as 1.9% false positive rate on novel malware.
- Date Issued
- 2008
- Identifier
- CFE0002303, ucf:47870
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0002303
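The abstract above reports its results as a detection rate and a false positive rate. For readers unfamiliar with the two metrics, the sketch below computes them from binary labels using the standard definitions: detection rate as TP / (TP + FN) and false positive rate as FP / (FP + TN). The function name `detection_metrics` and the toy labels are assumptions made for this example and are not data from the thesis.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection_rate, false_positive_rate) for binary labels.

    Labels use 1 for malware and 0 for clean programs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malware flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malware missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean passed

    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


# Toy labels (hypothetical): four malware samples followed by four clean ones.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
print(detection_metrics(y_true, y_pred))
```

The two rates are reported together because a classifier can raise its detection rate trivially by flagging everything, which also drives the false positive rate up.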
- Title
- Evolution and distribution of phenotypic diversity in the venom of Mojave Rattlesnakes (Crotalus scutulatus).
- Creator
-
Strickland, Jason, Savage, Anna, Parkinson, Christopher, Hoffman, Eric, Rokyta, Darin, University of Central Florida
- Abstract / Description
-
Intraspecific phenotype diversity allows for local adaptation and the ability for species to respond to changing environmental conditions, enhancing survivability. Phenotypic variation could be stochastic, genetically based, and/or the result of different environmental conditions. Mojave Rattlesnakes, Crotalus scutulatus, are known to have high intraspecific venom variation, but the geographic extent of the variation and factors influencing venom evolution are poorly understood. Three primary venom types have been described in this species based on the presence (Type A) or absence (Type B) of a neurotoxic phospholipase A2 called Mojave toxin and an inverse relationship with the presence of snake venom metalloproteinases (SVMPs). Individuals that contain both Mojave toxin and SVMPs, although rare, constitute the third type, designated Type A + B. I sought to describe the proteomic and transcriptomic venom diversity of C. scutulatus across its range and test whether diversity was correlated with genetic or environmental differences. This study includes the highest geographic sampling of Mojave Rattlesnakes and includes the most venom-gland transcriptomes known for one species. Of the four mitochondrial lineages known, only one was monophyletic for venom type. Environmental variables poorly correlated with the phenotypes. Variability in toxin and toxin family composition of venom transcriptomes was largely due to differences in transcript expression. Four of 19 toxin families identified in C. scutulatus account for the majority of differences in toxin number and expression variation. I was able to determine that the toxins primarily responsible for venom types are inherited in a Mendelian fashion and that toxin expression is additive when comparing heterozygotes and homozygotes. Using the genetics to define venom type is more informative, and the Type A + B phenotype is not unique, but rather heterozygous for the PLA2 and/or SVMP alleles.
Intraspecific venom variation in C. scutulatus highlights the need for fine scale ecological and natural history information to understand how phenotypic diversity is generated and maintained geographically through time.
- Date Issued
- 2018
- Identifier
- CFE0007252, ucf:52198
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007252
- Title
- Habitat selection in transformed landscapes and the role of novel ecosystems for native species persistence.
- Creator
-
Sanchez Clavijo, Lina, Quintana-Ascencio, Pedro, Noss, Reed, Weishampel, John, Rodewald, Amanda, University of Central Florida
- Abstract / Description
-
To understand native species persistence in transformed landscapes we must evaluate how individual behaviors interact with landscape structure through ecological processes such as habitat selection. Rapid, widespread landscape transformation may lead to a mismatch between habitat preference and quality, a phenomenon known as ecological traps that can have negative outcomes for populations. I applied this framework to the study of birds inhabiting landscapes dominated by forest remnants and shade coffee plantations, a tropical agroforestry system that retains important portions of native biodiversity. I used two different approaches to answer the question: What is the role of habitat selection in the adaptation of native species to transformed landscapes? First, I present the results of a simulation model used to evaluate the effects of landscape structure on population dynamics of a hypothetical species under two mechanisms of habitat selection. Then I present the analyses of seven years of capture-mark-recapture and resight data collected to compare habitat preference and quality between shade coffee and forest for twelve resident bird species in the Sierra Nevada de Santa Marta (Colombia). I provide evidence for the importance of including the landscape context in the evaluation of ecological traps and for using long-term demographic data when evaluating the potential of novel ecosystems and intermediately-modified habitats for biodiversity conservation. Beyond suggestions to improve bird conservation in shade coffee, my findings contribute to theory about ecological traps and can be applied to understand population processes in a wide variety of heterogeneous landscapes.
- Date Issued
- 2016
- Identifier
- CFE0006494, ucf:51392
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006494
- Title
- Characterization of Anisotropic Mechanical Performance of As-Built Additively Manufactured Metals.
- Creator
-
Siddiqui, Sanna, Gordon, Ali, Raghavan, Seetha, Bai, Yuanli, Sohn, Yongho, University of Central Florida
- Abstract / Description
-
Additive manufacturing (AM) technologies use a 3D Computer Aided Design (CAD) model to develop a component through a deposition and fusion layer process, allowing for rapid design and geometric flexibility of metal components, for use in the aerospace, energy and biomedical industries. Challenges exist with additive manufacturing that limit its replacement of conventional manufacturing techniques, most especially the lack of a comprehensive understanding of the anisotropic behavior of these materials and how it is reflected in observed tensile, torsional and fatigue mechanical responses. As such, there is a need to understand how the build orientation of as-built additively manufactured metals affects mechanical performance (e.g. monotonic and cyclic behavior, cyclically hardening/softening behavior, plasticity effects on fatigue life etc.); and to use constitutive modeling to both support experimental findings and provide approximations of expected behavior (e.g. failure surfaces, monotonic and cyclic response, correlations between tensile and fatigue properties) for orientations and experiments not tested, due to the high cost associated with AM. A comprehensive framework has been developed to characterize the anisotropic behavior of as-built additively manufactured metals (i.e. Stainless Steel GP1 (SS GP1), similar in chemical composition to Stainless Steel 17-4PH), through a series of mechanical testing, microscopic evaluation and constitutive modeling, which were used to identify a reduced specimen size for characterizing these materials. An analysis of the torsional response of additively manufactured Inconel 718 has been performed to assess the impact of build orientation and as-built conditions on the shearing behavior of this material. Experimental results from DMLS SS GP1 and AM Inconel 718 from literature were used to constitutively model the material responses of these additively manufactured metals.
Overall, this framework has been designed to serve as a standard, from which build orientation selection can be used to meet specific desired industry requirements.
- Date Issued
- 2018
- Identifier
- CFE0007097, ucf:52883
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007097