MODIFICATIONS TO THE FUZZY-ARTMAP ALGORITHM FOR DISTRIBUTED LEARNING IN LARGE DATA SETS

Title: MODIFICATIONS TO THE FUZZY-ARTMAP ALGORITHM FOR DISTRIBUTED LEARNING IN LARGE DATA SETS.
Name(s): Castro, Jose R, Author
Georgiopoulos, Michael, Committee Chair
University of Central Florida, Degree Grantor
Type of Resource: text
Date Issued: 2004
Publisher: University of Central Florida
Language(s): English
Abstract/Description: The Fuzzy-ARTMAP (FAM) algorithm is one of the premier neural network architectures for classification problems. FAM can learn online and is usually faster than other neural network approaches. Nevertheless, FAM's learning can slow down considerably when the size of the training set grows into the hundreds of thousands of patterns. We apply data partitioning and network partitioning to the FAM algorithm, in both sequential and parallel settings, to achieve better convergence time and to train efficiently on large databases (hundreds of thousands of patterns). Our parallelization is implemented on a Beowulf cluster of workstations. Two data-partitioning approaches and two network-partitioning approaches are developed. All of the approaches are tested extensively on three large datasets (half a million data points): the Forest Covertype database from Blackard and two artificially generated Gaussian datasets with different percentages of overlap between classes. Speedups with the data-partitioning approach reached the order of hundreds without any investment in parallel hardware. Speedups with the network-partitioning approach are close to linear on a cluster of workstations. Both methods allowed us to reduce the time needed to train the neural network on large databases from days to minutes. We prove formally that the workload balance of our network-partitioning approaches will never be worse than an acceptable bound, and we also demonstrate the correctness of these parallel variants of FAM.
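The two strategies named in the abstract are complementary: data partitioning splits the training set and trains an independent FAM network on each slice, while network partitioning spreads the category nodes of a single FAM across processors. As a rough illustration of the data-partitioning idea only, here is a minimal Python sketch under stated assumptions: the FuzzyARTMAP class below is a hypothetical placeholder, not the dissertation's implementation (which ran on a Beowulf cluster, not Python's multiprocessing), and the step that combines or arbitrates among the trained networks is omitted.

```python
# Hypothetical sketch, not the dissertation's code: train one independent
# Fuzzy-ARTMAP per data partition, in parallel, on a single machine.
from multiprocessing import Pool

import numpy as np


class FuzzyARTMAP:
    """Placeholder standing in for a real Fuzzy-ARTMAP implementation."""

    def __init__(self, vigilance=0.75):
        self.vigilance = vigilance    # baseline vigilance parameter
        self.categories = []          # learned (template, label) pairs

    def train(self, patterns, labels):
        # A real FAM pass would do complement coding, category choice,
        # vigilance testing, and match tracking; this stub just stores data.
        for x, y in zip(patterns, labels):
            self.categories.append((x, y))
        return self


def train_partition(part):
    patterns, labels = part
    return FuzzyARTMAP().train(patterns, labels)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((10_000, 4))              # toy stand-in for a large dataset
    y = (X[:, 0] > 0.5).astype(int)          # toy binary labels

    p = 4                                    # number of partitions / workers
    parts = list(zip(np.array_split(X, p), np.array_split(y, p)))

    # Each worker trains on its own slice; no communication during training.
    with Pool(processes=p) as pool:
        networks = pool.map(train_partition, parts)

    print(f"trained {len(networks)} independent FAM networks")
```

Because FAM's training cost grows faster than linearly with the number of training patterns (the category set keeps growing), training p networks on slices of size n/p can cost far less in total than training one network on all n patterns, which is consistent with the abstract's report of speedups in the order of hundreds even without parallel hardware.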
Identifier: CFE0000065 (IID), ucf:46092 (fedora)
Note(s): 2004-05-01
Ph.D.
College of Engineering and Computer Science, Department of Electrical and Computer Engineering
This record was generated from author-submitted information.
Subject(s): Fuzzy-ARTMAP
Neural Network
Parallel processing
Beowulf
Data mining
Persistent Link to This Record: http://purl.flvc.org/ucf/fd/CFE0000065
Restrictions on Access: public
Host Institution: UCF
