- Title
- AUTONOMOUS REPAIR OF OPTICAL CHARACTER RECOGNITION DATA THROUGH SIMPLE VOTING AND MULTI-DIMENSIONAL INDEXING TECHNIQUES.
- Creator
-
Sprague, Christopher, Weeks, Arthur, University of Central Florida
- Abstract / Description
-
The three major optical character recognition (OCR) engines (ExperVision, Scansoft OCR, and Abby OCR) in use today are all capable of recognizing text at near-perfect rates. The remaining errors, however, have proven very difficult to identify within a single engine. Recent research has shown that the errors of the three engines have very little correlation with one another, and thus the engines, when used in conjunction, may increase the accuracy of the final result. This document discusses the implementation and results of a simple voting system designed to test this hypothesis and show a statistical improvement in overall accuracy. Additional aspects of implementing an improved OCR scheme, such as aligning the output data of multiple engines and recognizing application-specific solutions, are also addressed in this research. Although voting systems are currently in use by many major OCR engine developers, this research focuses on the addition of a collaborative system that is able to utilize the various positive aspects of multiple engines while also addressing the immediate need for practical industry applications such as litigation and forms processing. Doculex™, a major developer and leader in the document imaging industry, has provided the funding for this research.
- Date Issued
- 2005
- Identifier
- CFE0000380, ucf:46337
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0000380
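The collaborative voting idea in the abstract above can be sketched as simple per-character majority voting over engine outputs. This is a minimal illustration, not the dissertation's actual system: it assumes the three engines' outputs have already been character-aligned (alignment is one of the harder problems the research addresses), and the engine strings below are hypothetical.

```python
from collections import Counter

def vote(aligned_outputs):
    """Majority-vote over character-aligned OCR outputs.

    aligned_outputs: list of equal-length strings, one per engine.
    """
    result = []
    for chars in zip(*aligned_outputs):
        top, count = Counter(chars).most_common(1)[0]
        # With three engines, a character wins with 2+ votes;
        # if all engines disagree, fall back to the first engine.
        result.append(top if count >= 2 else chars[0])
    return "".join(result)

# Hypothetical aligned outputs from three engines; each makes a
# different single-character error, so voting recovers the word.
print(vote(["recognition", "recoqnition", "recognjtion"]))  # recognition
```

Because the engines' errors are largely uncorrelated, two engines rarely make the same mistake at the same position, which is why even this naive scheme can beat any single engine.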
- Title
- Learning Kernel-based Approximate Isometries.
- Creator
-
Sedghi, Mahlagha, Georgiopoulos, Michael, Anagnostopoulos, Georgios, Atia, George, Liu, Fei, University of Central Florida
- Abstract / Description
-
The increasing availability of public datasets offers an unprecedented opportunity to conduct data-driven studies. Metric Multi-Dimensional Scaling aims to find a low-dimensional embedding of the data that preserves the pairwise dissimilarities among the data points in the original space. Along with visualizability, this dimensionality reduction plays a pivotal role in analyzing and disclosing the hidden structures in the data. This work introduces a Sparse Kernel-based Least Squares Multi-Dimensional Scaling approach for exploratory data analysis and, when desirable, data visualization. We assume our embedding map belongs to a Reproducing Kernel Hilbert Space of vector-valued functions, which allows for embeddings of previously unseen data. Also, given appropriate positive-definite kernel functions, this extends the applicability of our method to non-numerical data. Furthermore, the framework employs Multiple Kernel Learning to implicitly identify an effective feature map and, hence, kernel function. Finally, via the use of sparsity-promoting regularizers, the technique is capable of embedding data on a, typically, lower-dimensional manifold by naturally inferring the embedding dimension from the data itself. In the process, key training samples are identified whose participation in the embedding map's kernel expansion is most influential. As we will show, such influence may be given interesting interpretations in the context of the data at hand. The resulting multi-kernel learning, non-convex framework can be effectively trained via a block coordinate descent approach, which alternates between an accelerated proximal-average-method-based iterative majorization for learning the kernel expansion coefficients and a simple quadratic program, which deduces the multiple-kernel learning coefficients.
Experimental results showcase potential uses of the proposed framework on artificial data as well as real-world datasets, and underline the merits of our embedding framework. Our method discovers genuine hidden structure in the data that, in the case of network data, matches the results of the well-known Multi-level Modularity Optimization community structure detection algorithm.
- Date Issued
- 2017
- Identifier
- CFE0007132, ucf:52315
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007132
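For orientation, the basic linear special case of the metric MDS problem described above — classical MDS on a Euclidean distance matrix — can be sketched in a few lines. This is not the dissertation's kernel-based, sparsity-regularized method; it is the textbook eigendecomposition baseline that the kernel approach generalizes to learned, nonlinear embedding maps.

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical metric MDS: embed n points in k dimensions from an
    n x n matrix D of pairwise Euclidean distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)        # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]      # keep the top-k eigenpairs
    scale = np.sqrt(np.maximum(vals[idx], 0))
    return vecs[:, idx] * scale           # n x k embedding

# Collinear points embed exactly in one dimension:
X = np.array([[0.0], [1.0], [3.0]])
D = np.abs(X - X.T)
Y = classical_mds(D, k=1)
# pairwise distances of Y match D (up to sign/translation of Y)
```

When the dissimilarities are exactly Euclidean, this recovers the configuration up to rotation and translation; the kernel formulation is what makes out-of-sample embedding and non-numerical data possible.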
- Title
- NEW HEURISTICS FOR THE 0-1 MULTI-DIMENSIONAL KNAPSACK PROBLEMS.
- Creator
-
Akin, Haluk, Sepulveda, Jose, University of Central Florida
- Abstract / Description
-
This dissertation introduces new heuristic methods for the 0-1 multi-dimensional knapsack problem (0-1 MKP). The 0-1 MKP can be informally stated as the problem of packing items into a knapsack while staying within the limits of several constraints (dimensions); each item has a profit assigned to it, and the dimensions can be, for instance, the maximum weight that can be carried, the maximum available volume, or the maximum amount that can be spent on the items. One main assumption is that only one item of each type is available, hence the problem is binary (0-1). The single-dimensional version of the 0-1 MKP is the classic knapsack problem, which can be solved in pseudo-polynomial time; the 0-1 MKP, however, is strongly NP-hard. Reduced cost values are a rarely used resource in 0-1 MKP heuristics; using reduced cost information, we introduce several new heuristics as well as improvements to past heuristics. We introduce two new ordering strategies: decision variable importance (DVI) and reduced cost based ordering (RCBO). We also introduce a new greedy heuristic concept, which we call the "sliding concept", and a sub-branch of it, which we call "sliding enumeration"; we again use the reduced cost values within the sliding enumeration heuristic. RCBO is a brand-new ordering strategy that proved useful in several methods, such as improving Pirkul's MKHEUR, a triangular-distribution-based probabilistic approach, and our own sliding enumeration. We show how Pirkul's shadow-price-based ordering strategy fails to order the partial variables, and we present a possible fix; since hard problems tend to have a high number of partial variables, this insight will help future researchers solve hard problems with more success. Even though sliding enumeration is a simple method, it found optima in less than a few seconds for most of our problems.
We present different levels of sliding enumeration and discuss potential improvements to the method. Finally, we also show that in meta-heuristic approaches such as Drexl's simulated annealing, where random numbers are used abundantly, it is better to use well-designed probability distributions instead of plain random numbers.
- Date Issued
- 2009
- Identifier
- CFE0002633, ucf:48195
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0002633
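To make the role of an ordering strategy in 0-1 MKP heuristics concrete, here is a minimal greedy baseline that orders items by profit per unit of aggregate relative resource use and packs them while all constraints hold. This is the classic ratio ordering, not the DVI or RCBO reduced-cost orderings introduced in the dissertation, and the instance below is a made-up toy example.

```python
def greedy_mkp(profits, weights, capacities):
    """Greedy baseline for the 0-1 multi-dimensional knapsack.

    profits: profit of each item
    weights: weights[i][j] = consumption of resource j by item i
    capacities: limit for each resource j
    Returns (chosen item indices, total profit).
    """
    m = len(capacities)

    def ratio(i):
        # Aggregate an item's footprint across all dimensions,
        # normalized by each dimension's capacity.
        load = sum(weights[i][j] / capacities[j] for j in range(m))
        return profits[i] / load if load > 0 else float("inf")

    order = sorted(range(len(profits)), key=ratio, reverse=True)
    used = [0.0] * m
    chosen, value = [], 0
    for i in order:
        if all(used[j] + weights[i][j] <= capacities[j] for j in range(m)):
            for j in range(m):
                used[j] += weights[i][j]
            chosen.append(i)
            value += profits[i]
    return chosen, value

# Toy instance: 3 items, 2 resources (weight limit 50, volume limit 5).
chosen, value = greedy_mkp(
    profits=[60, 100, 120],
    weights=[[10, 1], [20, 2], [30, 3]],
    capacities=[50, 5],
)
# Greedy picks items 0 and 1 (value 160); the optimum here is
# items 1 and 2 (value 220) -- ordering quality matters.
```

The gap on even this tiny instance is why the dissertation invests in better orderings: a ratio rule fixes one ranking up front, whereas reduced-cost information from the LP relaxation reflects how the constraints actually bind.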