Embedded Systems

Testing reliability techniques for SoCs with fault tolerant CGRA by using live FPGA fault injection

by Johannes M. Kühn, Thomas Schweizer, Dustin Peterson, Thommy Kuhn, and Wolfgang Rosenstiel
In 2013 International Conference on Field-Programmable Technology (FPT): 462-465, 2013.

Keywords: fault tolerance, field programmable gate arrays, integrated circuit reliability, logic design, reconfigurable architectures, redundancy, system-on-chip, field programmable gate array, dynamic functional verification, dynamic remapping, TMR technique, triple modular redundancy technique, SoC design, system on chip design, coarse grained reconfigurable architectures, fault injection method, live FPGA, fault tolerant CGRA, testing reliability technique, Tunneling magnetoresistance, Reliability, Circuit faults, Context, Computer architecture

Abstract

In this work, we demonstrate a number of reliability techniques developed for Coarse Grained Reconfigurable Architectures (CGRAs). The techniques target different portions of a System on Chip (SoC) design consisting of a general-purpose CPU, various accelerators, and a CGRA that may also be used for application acceleration. On the CGRA, we demonstrate a lightweight Triple Modular Redundancy (TMR) technique that mitigates the hardware overhead usually incurred by TMR. When a CGRA fault is detected, we use Dynamic Remapping of the application to avoid the faulty components and thus restore the functionality of the mapped application. At the SoC level, we demonstrate Dynamic Functional Verification, which samples the components of the SoC in a time-multiplexed manner and thus detects faults in them. The complete system is emulated on a Field Programmable Gate Array (FPGA), for which we developed a fast and accurate fault injection method to test the developed techniques in a live and realistic way.
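To illustrate the general idea behind the techniques named in the abstract, the following is a minimal software sketch of classic TMR voting, a single bit-flip fault, and a naive remapping step. It is not the paper's lightweight TMR, its remapping algorithm, or its FPGA fault injection method; all function names, the integer model of processing element (PE) outputs, and the "first free spare" remapping policy are assumptions made purely for illustration.

```python
from typing import Optional


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three replica outputs (classic TMR voter)."""
    return (a & b) | (a & c) | (b & c)


def inject_bit_flip(value: int, bit: int) -> int:
    """Model a single-bit fault by flipping one bit of a replica's output."""
    return value ^ (1 << bit)


def faulty_replica(a: int, b: int, c: int) -> Optional[int]:
    """Return the index of the first replica that disagrees with the vote, or None."""
    voted = majority_vote(a, b, c)
    for index, output in enumerate((a, b, c)):
        if output != voted:
            return index
    return None


def remap_around_fault(mapping: dict[str, int], faulty_pe: int,
                       spare_pes: list[int]) -> dict[str, int]:
    """Move every operation placed on the faulty PE to the first spare PE.

    Hypothetical policy: a real remapper would also have to respect
    routing and timing constraints of the array.
    """
    if not spare_pes:
        raise ValueError("no spare processing element available")
    spare = spare_pes[0]
    return {op: (spare if pe == faulty_pe else pe) for op, pe in mapping.items()}


if __name__ == "__main__":
    golden = 0b10110010
    corrupted = inject_bit_flip(golden, 3)  # fault in replica 1

    # The voter masks the fault, and the disagreeing replica is identified.
    print(majority_vote(golden, corrupted, golden) == golden)
    print(faulty_replica(golden, corrupted, golden))

    # Operations on the faulty PE are moved to a spare one.
    print(remap_around_fault({"mul": 0, "add": 1, "sub": 2}, faulty_pe=1, spare_pes=[3]))
```

The sketch only shows the control flow of detect, mask, and remap; the work described above performs these steps in hardware on an emulated SoC, with faults injected into the running FPGA design.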