An Adaptive and High Coding Rate Soft Error Correction Method in Network-on-Chips
Abstract
The per-bit soft error rate caused by alpha particles in sub-micron technologies is expected to decrease as feature sizes shrink. On the other hand, the complexity and density of integrated systems are increasing rapidly, which demands efficient soft error protection mechanisms, especially for on-chip communication. Any soft error protection method must satisfy tight area and energy constraints; therefore, a low-complexity, low-redundancy coding method is necessary. In this work, we propose a method to enhance the Parity Product Code (PPC) and provide adaptation methods for this code. First, PPC is improved as a forward error correcting code using transposable retransmissions. Then, to adapt to different error rates, an augmented algorithm for configuring PPC is introduced. The evaluation results show that the proposed mechanism achieves coding rates similar to those of parity check codes and outperforms the original PPC.
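For readers unfamiliar with product codes built from parity bits, the following is a minimal sketch of a generic two-dimensional parity product code. It is not the enhanced PPC proposed in this work: the function names (`ppc_encode`, `ppc_correct`, `coding_rate`), the even-parity convention, and the block layout are assumptions made for illustration only, and the transposable retransmission and adaptive configuration steps are not modeled.

```python
from typing import List, Tuple

Bits = List[List[int]]  # a block of data bits arranged as rows x columns


def ppc_encode(data: Bits) -> Tuple[List[int], List[int]]:
    """Return (row_parities, col_parities) using even parity (assumed convention)."""
    row_par = [sum(row) % 2 for row in data]
    col_par = [sum(col) % 2 for col in zip(*data)]
    return row_par, col_par


def ppc_correct(data: Bits, row_par: List[int], col_par: List[int]) -> Tuple[Bits, str]:
    """Correct at most one flipped data bit.

    Returns the (possibly corrected) block and a status string:
    "ok", "corrected", or "uncorrectable".
    """
    new_row, new_col = ppc_encode(data)
    bad_rows = [i for i, (a, b) in enumerate(zip(new_row, row_par)) if a != b]
    bad_cols = [j for j, (a, b) in enumerate(zip(new_col, col_par)) if a != b]

    if not bad_rows and not bad_cols:
        return data, "ok"
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        # One failing row and one failing column locate the flipped bit.
        fixed = [row[:] for row in data]
        fixed[bad_rows[0]][bad_cols[0]] ^= 1
        return fixed, "corrected"
    # Any other pattern is detected but cannot be located in this simple sketch.
    return data, "uncorrectable"


def coding_rate(rows: int, cols: int) -> float:
    """Data bits divided by total transmitted bits (data + row and column parities)."""
    return rows * cols / (rows * cols + rows + cols)


if __name__ == "__main__":
    block = [[1, 0, 1, 1],
             [0, 1, 0, 0],
             [1, 1, 1, 0]]
    rp, cp = ppc_encode(block)

    received = [row[:] for row in block]
    received[1][2] ^= 1  # inject a single-bit soft error
    fixed, status = ppc_correct(received, rp, cp)
    print(status, fixed == block)
    print(round(coding_rate(3, 4), 3))
```

In this sketch a single flipped bit is located at the intersection of the one failing row parity and the one failing column parity. Two flips in the same row cancel in the row parity and show up only as two failing columns, so they are detected but not located; a detect-then-retransmit scheme is one way to handle such patterns.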
Keywords
Error Correction Code, Fault-Tolerance, Network-on-Chip.