
EXPLORING TECHNIQUES FOR MEASUREMENT AND IMPROVEMENT OF DATA QUALITY WITH APPLICATION TO DETERMINATION OF THE LAST KNOWN POSITION (LKP) IN SEARCH AND RESCUE (SAR) DATA


Date Issued:
2011
Abstract/Description:
A tremendous volume of data is being generated in today's world. As organizations around the globe recognize their data as a valuable asset for gaining a competitive edge in a fast-paced, dynamic business world, more and more attention is being paid to data quality. Advances in data mining, predictive modeling, text mining, web mining, business intelligence, health care analytics, and related fields all depend on clean, accurate data. It comes as no surprise that dirty data cannot be mined effectively. This research is an exploratory study of data sets from different domains. It addresses the data quality issues specific to each domain, identifies the challenges faced, and arrives at techniques or methodologies for measuring and improving data quality. The primary focus of the research is the Search and Rescue (SAR) dataset: identifying its key data quality issues and developing an algorithm to improve its data quality. SAR missions, which are routinely conducted all over the world, show a trend of increasing mission costs. Retrospective studies of historic SAR data not only allow for detailed analysis and understanding of SAR incidents and patterns, but also form the basis for generating probability maps, analytical data models, and similar products that allow valuable SAR resources to be used and distributed efficiently. One of the challenges with the SAR dataset is that the collection process is not perfect. Often, the Last Known Position (LKP) is not known or cannot be determined. The goal is to fully or partially geocode the LKP for as many data points as possible, identify those data points where the LKP cannot be geocoded at all, and highlight the underlying data quality issues.
The SAR Algorithm developed in this work makes use of partial or incomplete information, cleans and validates the data, and extracts address information from relevant fields to geocode the data successfully. The algorithm improves geocoding accuracy and has been validated by a set of approaches.
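The clean, validate, and classify steps described in the abstract can be illustrated with a minimal Python sketch. This is not the dissertation's SAR Algorithm, whose details are not given in this record. The field names (`lat`, `lon`, `city`, `county`, `state`), the placeholder list, and the three labels are all assumptions made for illustration. The sketch only sorts records into fully geocodable, partially geocodable, and not geocodable.

```python
from typing import Optional

# Hypothetical placeholder strings treated as "missing"; not taken from the source.
PLACEHOLDERS = {"", "unknown", "n/a", "na", "none", "?"}


def clean(value: Optional[object]) -> str:
    """Collapse whitespace and map placeholder values to an empty string."""
    if value is None:
        return ""
    text = " ".join(str(value).split())
    return "" if text.lower() in PLACEHOLDERS else text


def parse_coord(value: Optional[object], low: float, high: float) -> Optional[float]:
    """Return the value as a float if it parses and lies in [low, high], else None."""
    text = clean(value)
    if not text:
        return None
    try:
        number = float(text)
    except ValueError:
        return None
    return number if low <= number <= high else None


def classify_lkp(record: dict) -> str:
    """Label a record 'full', 'partial', or 'none' (hypothetical labels)."""
    lat = parse_coord(record.get("lat"), -90.0, 90.0)
    lon = parse_coord(record.get("lon"), -180.0, 180.0)
    if lat is not None and lon is not None:
        return "full"  # valid coordinates are available
    # Fall back to coarse place information (assumed field names).
    if any(clean(record.get(key)) for key in ("city", "county", "state")):
        return "partial"
    return "none"
```

A record with an out-of-range latitude but a usable county name would be labeled `partial`, which mirrors the abstract's goal of geocoding as many data points as possible while flagging those that cannot be geocoded at all.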
Name(s): Wakchaure, Abhijit, Author
Hua, Kien, Committee Chair
University of Central Florida, Degree Grantor
Type of Resource: text
Publisher: University of Central Florida
Language(s): English
Identifier: CFE0004050 (IID), ucf:49142 (fedora)
Note(s): 2011-08-01
Ph.D.
Engineering and Computer Science, School of Electrical Engineering and Computer Science
Doctorate
This record was generated from author submitted information.
Subject(s): Data quality
data engineering
search and rescue
Persistent Link to This Record: http://purl.flvc.org/ucf/fd/CFE0004050
Restrictions on Access: public
Host Institution: UCF
