Current Search: random forest
- Title
- Remote Sensing of Coastal Wetlands: Long term vegetation stress assessment and data enhancement technique.
- Creator
-
Tahsin, Subrina, Medeiros, Stephen, Singh, Arvind, Mayo, Talea, University of Central Florida
- Abstract / Description
-
Apalachicola Bay in the Florida panhandle is home to a rich variety of saltwater and freshwater wetlands but unfortunately is also subject to a wide range of hydrologic extreme events. Extreme hydrologic events such as hurricanes and droughts continuously threaten the area. The impact of hurricanes and droughts on both freshwater and saltwater wetlands was investigated over the period from 2000 to 2015 in Apalachicola Bay using spatio-temporal changes in the Landsat-based NDVI. Results indicate that saltwater wetlands were more resilient than freshwater wetlands. Results also suggest that in response to hurricanes, the coastal wetlands took almost a year to recover, while recovery following a drought period was observed after only a month. This analysis was successful and provided excellent insights into coastal wetland health. Such long-term studies depend heavily on optical sensors, which are subject to data loss due to cloud coverage. Therefore, a novel method is proposed and demonstrated to recover the information contaminated by clouds. Cloud contamination is a hindrance to long-term environmental assessment using information derived from satellite imagery that retrieves data from the visible and infrared spectral ranges. The Normalized Difference Vegetation Index (NDVI) is a widely used index to monitor vegetation and land use change. NDVI can be retrieved from publicly available data repositories of optical sensors such as Landsat, the Moderate Resolution Imaging Spectroradiometer (MODIS), and several commercial satellites. Landsat has an ongoing high-resolution NDVI record starting from 1984. Unfortunately, the NDVI time series suffers from cloud contamination. Although computational methods for data interpolation, ranging from simple to complex, have been applied to recover cloudy data, all of these techniques are subject to many limitations.
In this paper, a novel Optical Cloud Pixel Recovery (OCPR) method is proposed to repair cloudy pixels from the time-space-spectrum continuum with the aid of a machine learning tool, namely random forest (RF), trained and tested using multi-parameter hydrologic data. The RF-based OCPR model was compared with a simple linear regression (LR) based OCPR model to understand the potential of the model. A case study in Apalachicola Bay is presented to evaluate the performance of OCPR in repairing cloudy NDVI reflectance for two specific dates. The RF-based OCPR method achieves a root mean squared error of 0.0475 sr⁻¹ between predicted and observed NDVI reflectance values. The LR-based OCPR method achieves a root mean squared error of 0.1257 sr⁻¹. The findings suggest that the RF-based OCPR method is effective at repairing cloudy values and providing continuous and quantitatively reliable imagery for further analysis in environmental applications.
- Date Issued
- 2016
- Identifier
- CFE0006546, ucf:51331
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006546
- Title
- A deep learning approach to diagnosing schizophrenia.
- Creator
-
Barry, Justin, Valliyil Thankachan, Sharma, Gurupur, Varadraj, Jha, Sumit Kumar, Ewetz, Rickard, University of Central Florida
- Abstract / Description
-
In this article, the investigators present a new method using a deep learning approach to diagnose schizophrenia. In the experiment presented, the investigators used a secondary dataset provided by the National Institutes of Health. The experimentation involves analyzing this dataset for the existence of schizophrenia using traditional machine learning approaches such as logistic regression, support vector machine, and random forest. This is followed by the application of deep learning techniques using three hidden layers in the model. The results obtained indicate that deep learning provides state-of-the-art accuracy in diagnosing schizophrenia. Based on these observations, there is a possibility that deep learning may provide a paradigm shift in diagnosing schizophrenia.
- Date Issued
- 2019
- Identifier
- CFE0007429, ucf:52737
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007429
- Title
- USING LANDSCAPE GENETICS TO ASSESS POPULATION CONNECTIVITY IN A HABITAT GENERALIST.
- Creator
-
Hether, Tyler, Hoffman, Eric, University of Central Florida
- Abstract / Description
-
Understanding the nature of genetic variation in natural populations is an underlying theme of population genetics. In recent years population genetics has benefited from the incorporation of landscape and environmental data into pre-existing models of isolation by distance (IBD) to elucidate features influencing spatial genetic variation. Many of these landscape genetics studies have focused on populations separated by discrete barriers (e.g., mountain ridges) or species with specific habitat requirements (i.e., habitat specialists). One difficulty in using a landscape genetics approach for taxa with less stringent habitat requirements (i.e., generalists) is the lack of obvious barriers to gene flow and of preference for specific habitats. My study attempts to fill this information gap to understand mechanisms underlying population subdivision in generalists, using the squirrel treefrog (Hyla squirella) and a system for classifying 'terrestrial ecological systems' (i.e., habitat types). I evaluate this dataset with microsatellite markers and a recently introduced method based on ensemble learning (Random Forest) to identify whether spatial distance, habitat types, or both have influenced genetic connectivity among 20 H. squirella populations. Next, I hierarchically subset the populations included in the analysis based on (1) genetic assignment tests and (2) Mantel correlograms to determine the relative role of spatial distance in shaping landscape genetic patterns. Assignment tests show evidence of two genetic clusters that separate populations in Florida's panhandle (Western cluster) from those in peninsular Florida and southern Georgia (Eastern cluster). Mantel correlograms suggest a patch size of approximately 150 km. Landscape genetic analyses at all three spatial scales yielded improved model fit relative to isolation by distance when including habitat types.
A hierarchical effect was identified whereby spatial distance (km) was the strongest predictor of patterns of genetic differentiation above the scale of the genetic patch. Below the genetic patch, spatial distance was still an explanatory variable but was only approximately 30% as relevant as mesic flatwoods or upland oak hammocks. Thus, it appears that habitat types largely influence patterns of population genetic connectivity at local scales, but the signal of IBD becomes the dominant driver of regional connectivity. My results highlight some habitats as highly relevant to increased genetic connectivity at all spatial scales (e.g., upland oak hammocks), while others show no association (e.g., silviculture) or scale-specific associations (e.g., pastures only at global scales). Given these results, it appears that treating habitat as a binary metric (suitable/non-suitable) may be overly simplistic for generalist species, in which gene flow probably occurs across a spectrum of habitat suitability. The overall pattern of spatial genetic and landscape genetic structure identified here provides insight into the evolutionary history and patterns of population connectivity for H. squirella and improves our understanding of the role of matrix composition for habitat generalists.
- Date Issued
- 2010
- Identifier
- CFE0003204, ucf:48580
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0003204
- Title
- ANALYSES OF CRASH OCCURRENCE AND INJURY SEVERITIES ON MULTI LANE HIGHWAYS USING MACHINE LEARNING ALGORITHMS.
- Creator
-
Das, Abhishek, Abdel-Aty, Mohamed A., University of Central Florida
- Abstract / Description
-
Reduction of crash occurrence on the various roadway locations (mid-block segments; signalized intersections; un-signalized intersections) and the mitigation of injury severity in the event of a crash are the major concerns of transportation safety engineers. Multi lane arterial roadways (excluding freeways and expressways) account for forty-three percent of fatal crashes in the state of Florida. Significant contributing causes fall under the broad categories of aggressive driver behavior; adverse weather and environmental conditions; and roadway geometric and traffic factors. The objective of this research was the implementation of innovative, state-of-the-art analytical methods to identify the contributing factors for crashes and injury severity. Advances in computational methods enable the use of modern statistical and machine learning algorithms. Even though most of the contributing factors are known a priori, advanced methods unearth changing trends. Heuristic evolutionary processes such as genetic programming; sophisticated data mining methods like conditional inference trees; and mathematical treatments in the form of sensitivity analyses constitute the major contributions of this research. The application of traditional statistical methods such as simultaneous ordered probit models and the identification and resolution of crash data problems are also key aspects of this study. In order to eliminate the use of the unrealistic uniform intersection influence radius of 250 ft, heuristic rules were developed for assigning crashes to roadway segments, signalized intersections, and access points using parameters such as 'site location', 'traffic control', and node information. The use of Conditional Inference Forest instead of Classification and Regression Tree to identify variables of significance for injury severity analysis removed the bias towards the selection of continuous variables or variables with a large number of categories.
For the injury severity analysis of crashes on highways, the corridors were clustered into four optimum groups. The optimum number of clusters was found using the Partitioning Around Medoids algorithm. Concepts of evolutionary biology like crossover and mutation were implemented to develop models for classification and regression analyses based on the highest hit rate and minimum error rate, respectively. A low crossover rate and a higher mutation rate reduce the chances of genetic drift and bring novelty to the model development process. Annual daily traffic; friction coefficient of pavements; on-street parking; curbed medians; surface and shoulder widths; and alcohol/drug usage are some of the significant factors that played a role in both crash occurrence and injury severities. Relative sensitivity analyses were used to identify the effect of continuous variables on the variation of crash counts. This study improved the understanding of the significant factors that could play an important role in designing better safety countermeasures on multi lane highways, and hence enhance their safety by reducing the frequency of crashes and severity of injuries. Educating young people about the abuse of alcohol and drugs, specifically at high schools and colleges, could potentially lead to lower driver aggression. Removal of on-street parking from high speed arterials unilaterally could likely result in a drop in the number of crashes. Widening of shoulders could give drivers greater maneuvering space. Improving pavement conditions for a better friction coefficient will lead to improved crash recovery. Addition of lanes to alleviate problems arising out of increased ADT and restriction of trucks to the slower right lanes on the highways would not only reduce crash occurrences but also result in lower injury severity levels.
- Date Issued
- 2009
- Identifier
- CFE0002928, ucf:48007
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0002928
- Title
- Applying Machine Learning Techniques to Analyze the Pedestrian and Bicycle Crashes at the Macroscopic Level.
- Creator
-
Rahman, Md Sharikur, Abdel-Aty, Mohamed, Eluru, Naveen, Hasan, Samiul, Yan, Xin, University of Central Florida
- Abstract / Description
-
This thesis presents different data mining/machine learning techniques to analyze vulnerable road users' (i.e., pedestrian and bicycle) crashes by developing crash prediction models at the macro-level. In this study, we developed a data mining approach (i.e., decision tree regression (DTR) models) for both pedestrian and bicycle crash counts. To the author's knowledge, this is the first application of DTR models in the growing traffic safety literature at the macro-level. The empirical analysis is based on Statewide Traffic Analysis Zones (STAZ) level crash count data for both pedestrians and bicycles from the state of Florida for the years 2010 to 2012. The model results highlight the most significant predictor variables for pedestrian and bicycle crash counts in terms of three broad categories: traffic, roadway, and sociodemographic characteristics. Furthermore, spatial predictor variables of neighboring STAZs were utilized along with the targeted STAZ variables in order to improve the prediction accuracy of both DTR models. The DTR model that considers spatial predictor variables (spatial DTR model) was compared with the model that does not (aspatial DTR model), and the comparison clearly showed that the spatial DTR model is superior to the aspatial DTR model in terms of prediction accuracy. Finally, this study contributed to the safety literature by applying three ensemble techniques (Bagging, Random Forest, and Boosting) in order to improve the prediction accuracy of the weak learner (DTR models) for macro-level crash counts. The model estimation results revealed that all the ensemble techniques performed better than the DTR model and that the gradient boosting technique outperformed the other competing ensemble techniques in macro-level crash prediction.
- Date Issued
- 2018
- Identifier
- CFE0007358, ucf:52103
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007358
- Title
- Learning to Grasp Unknown Objects using Weighted Random Forest Algorithm from Selective Image and Point Cloud Feature.
- Creator
-
Iqbal, Md Shahriar, Behal, Aman, Boloni, Ladislau, Haralambous, Michael, University of Central Florida
- Abstract / Description
-
This work demonstrates an approach to determine the best grasping location on an unknown object using a Weighted Random Forest algorithm. It uses the RGB-D values of an object as input to find a suitable rectangular grasping region as the output. To accomplish this task, it uses a subspace of the most important features from a very high-dimensional, extensive feature space that contains both image and point cloud features. Using the most important features in the grasping algorithm has enabled the system to be computationally very fast while preserving maximum information gain. In this approach, the Random Forest operates with optimum parameters (e.g., number of trees, number of features at each node, information gain criteria), which ensures optimization in learning, with the highest possible accuracy in minimum time in an advanced practical setting. The Weighted Random Forest, chosen over Support Vector Machine (SVM), Decision Tree, and AdaBoost for implementation of the grasping system, outperforms these machine learning algorithms in both training and testing accuracy and in other performance estimates. The grasping system, utilizing learning from a score function, detects the rectangular grasping region by selecting the top rectangle with the largest score. The system is implemented and tested on a Baxter Research Robot with a parallel plate gripper.
- Date Issued
- 2014
- Identifier
- CFE0005509, ucf:50358
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0005509