You are here

A COMPETITIVE RECONFIGURATION APPROACH TO AUTONOMOUS FAULT HANDLING USING GENETIC ALGORITHMS

Download pdf | Full Screen View

Date Issued:
2008
Abstract/Description:
In this dissertation, a novel self-repair approach based on Consensus Based Evaluation (CBE) for autonomous repair of SRAM-based Field Programmable Gate Arrays (FPGAs) is developed, evaluated, and refined. An initial population of functionally identical (same input-output behavior), yet physically distinct (alternative design or place-and-route realization) FPGA configurations is produced at design time. During run-time, the CBE approach ranks these alternative configurations after evaluating their discrepancy relative to the consensus formed by the population. Through runtime competition, faults in the logical resources become occluded from the visibility of subsequent FPGA operations. Meanwhile, offspring formed through crossover and mutation of faulty and viable configurations are selected at a controlled re-introduction rate for evaluation and refurbishment. Refurbishments are evolved in-situ, with online real-time input-based performance evaluation, enhancing system availability and sustainability, creating an Organic Embedded System (OES). A fault tolerance model called N Modular Redundancy with Standby (NMRSB) is developed which combines the two popular fault tolerance techniques of NMR and Standby fault tolerance in order to facilitate the CBE approach. This dissertation develops two of instances of the NMRSB system – Triple Modular Redundancy with Standby (TMRSB) and Duplex with Standby (DSB). A hypothetical Xilinx Virtex-II Pro FPGA model demonstrates their viability for various applications including a 3-bit x 3-bit multiplier, and the MCNC91 benchmark circuits. Experiments conducted on the model iii evaluate the performance of three new genetic operators and demonstrate progress towards a completely self-contained single-chip implementation so that the FPGA can refurbish itself without requiring a PC host to execute the Genetic Algorithm. This dissertation presents results from the simulations of multiple applications with a CBE model implemented in the C++ programming language. Starting with an initial population of 20 and 30 viable configurations for TMRSB and DSB respectively, a single stuck-at fault is introduced in the logic resources. Fault refurbishment experiments are conducted under supervision of CBE using a fitness state evaluation function based on competing outputs, fitness adjustment, and different level threshold. The device remains online throughout the process by which a complete repair is realized with Hamming Distance and Bitweight voting schemes. The results indicate a Hamming Distance TMRSB approach can prevent the most pervasive fault impacts and realize complete refurbishment. Experimental results also show that the Autonomic Layer demonstrates 100% faulty component isolation for both Functional Elements (FEs) and Autonomous Elements (AEs) with randomly injected single and multiple faults. Using logic circuits from the MCNC-91 benchmark set, availability during repair phases averaged 75.05%, 82.21%, and 65.21% for the z4ml, cm85a, and cm138a circuits respectively under stated conditions. In addition to simulation, the proposed OES architecture synthesized from HDL was prototyped on a Xilinx Virtex II Pro FPGA device supporting partial reconfiguration to demonstrate the feasibility for intrinsic regeneration of the selected circuit.
Title: A COMPETITIVE RECONFIGURATION APPROACH TO AUTONOMOUS FAULT HANDLING USING GENETIC ALGORITHMS.
36 views
16 downloads
Name(s): Zhang, Kening, Author
DeMara, Ronald F, Committee Chair
University of Central Florida, Degree Grantor
Type of Resource: text
Date Issued: 2008
Publisher: University of Central Florida
Language(s): English
Abstract/Description: In this dissertation, a novel self-repair approach based on Consensus Based Evaluation (CBE) for autonomous repair of SRAM-based Field Programmable Gate Arrays (FPGAs) is developed, evaluated, and refined. An initial population of functionally identical (same input-output behavior), yet physically distinct (alternative design or place-and-route realization) FPGA configurations is produced at design time. During run-time, the CBE approach ranks these alternative configurations after evaluating their discrepancy relative to the consensus formed by the population. Through runtime competition, faults in the logical resources become occluded from the visibility of subsequent FPGA operations. Meanwhile, offspring formed through crossover and mutation of faulty and viable configurations are selected at a controlled re-introduction rate for evaluation and refurbishment. Refurbishments are evolved in-situ, with online real-time input-based performance evaluation, enhancing system availability and sustainability, creating an Organic Embedded System (OES). A fault tolerance model called N Modular Redundancy with Standby (NMRSB) is developed which combines the two popular fault tolerance techniques of NMR and Standby fault tolerance in order to facilitate the CBE approach. This dissertation develops two of instances of the NMRSB system – Triple Modular Redundancy with Standby (TMRSB) and Duplex with Standby (DSB). A hypothetical Xilinx Virtex-II Pro FPGA model demonstrates their viability for various applications including a 3-bit x 3-bit multiplier, and the MCNC91 benchmark circuits. Experiments conducted on the model iii evaluate the performance of three new genetic operators and demonstrate progress towards a completely self-contained single-chip implementation so that the FPGA can refurbish itself without requiring a PC host to execute the Genetic Algorithm. This dissertation presents results from the simulations of multiple applications with a CBE model implemented in the C++ programming language. Starting with an initial population of 20 and 30 viable configurations for TMRSB and DSB respectively, a single stuck-at fault is introduced in the logic resources. Fault refurbishment experiments are conducted under supervision of CBE using a fitness state evaluation function based on competing outputs, fitness adjustment, and different level threshold. The device remains online throughout the process by which a complete repair is realized with Hamming Distance and Bitweight voting schemes. The results indicate a Hamming Distance TMRSB approach can prevent the most pervasive fault impacts and realize complete refurbishment. Experimental results also show that the Autonomic Layer demonstrates 100% faulty component isolation for both Functional Elements (FEs) and Autonomous Elements (AEs) with randomly injected single and multiple faults. Using logic circuits from the MCNC-91 benchmark set, availability during repair phases averaged 75.05%, 82.21%, and 65.21% for the z4ml, cm85a, and cm138a circuits respectively under stated conditions. In addition to simulation, the proposed OES architecture synthesized from HDL was prototyped on a Xilinx Virtex II Pro FPGA device supporting partial reconfiguration to demonstrate the feasibility for intrinsic regeneration of the selected circuit.
Identifier: CFE0002280 (IID), ucf:47849 (fedora)
Note(s): 2008-08-01
Ph.D.
Engineering and Computer Science, School of Electrical Engineering and Computer Science
Doctorate
This record was generated from author submitted information.
Subject(s): FPGA FAULT TOLERANCE RECONFIGURATION GENETIC ALGORITHMS
Persistent Link to This Record: http://purl.flvc.org/ucf/fd/CFE0002280
Restrictions on Access: public
Host Institution: UCF

In Collections