Predicting Hard Disk Failures in Data Centers Using Temporal Convolutional Neural Networks
Date
2021
Publication Type
Conference Paper
ETH Bibliography
yes
Abstract
In modern data centers, storage system failures are major contributors to downtime and maintenance costs. Predicting these failures by collecting measurements from disks and analyzing them with machine learning techniques can effectively reduce their impact by enabling timely maintenance. While there is a vast literature on this subject, most approaches attempt to predict hard disk failures using either classic machine learning solutions, such as Random Forests (RFs), or deep Recurrent Neural Networks (RNNs). In this work, we address hard disk failure prediction using Temporal Convolutional Networks (TCNs), a novel type of deep neural network for time series analysis. Using a real-world dataset, we show that TCNs outperform both RFs and RNNs. Specifically, we improve the Fault Detection Rate (FDR) by approximately 7.5% (FDR = 89.1%) compared to the state of the art, while simultaneously reducing the False Alarm Rate (FAR = 0.052%). Moreover, we explore the network architecture design space, showing that TCNs are consistently superior to RNNs for a given model size and complexity and that even relatively small TCNs can reach satisfactory performance. All the code needed to reproduce the results presented in this paper is available at https://github.com/ABurrello/tcn-hard-disk-failure-prediction.
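The two metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes the usual definitions (FDR as the fraction of failing disks that are flagged, FAR as the fraction of healthy disks that are flagged); the abstract itself does not spell these out. The function name detection_rates and the toy labels are hypothetical and are not taken from the paper or its repository.

```python
def detection_rates(y_true, y_pred):
    """Return (FDR, FAR) for binary labels: 1 = failing disk, 0 = healthy disk.

    Assumed definitions (not stated in the abstract):
      FDR = TP / (TP + FN)   fraction of failing disks that were flagged
      FAR = FP / (FP + TN)   fraction of healthy disks that were flagged
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Guard against empty classes to avoid division by zero.
    fdr = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return fdr, far


# Toy labels, made up for illustration only (not from the paper's dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
fdr, far = detection_rates(y_true, y_pred)
print(f"FDR = {fdr:.1%}, FAR = {far:.1%}")
```

Because healthy disks vastly outnumber failing ones in practice, a predictor is typically judged on both numbers together: a high FDR is only useful if the FAR stays low enough to avoid flooding operators with false alarms.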
Publication status
published
Book title
Euro-Par 2020: Parallel Processing Workshops
Volume
12480
Pages / Article No.
277 - 289
Publisher
Springer
Event
26th International Conference on Parallel and Distributed Computing (Euro-Par 2020)
Subject
Predictive maintenance; IoT; Deep learning; Sequence analysis; Temporal Convolutional Networks
Organisational unit
03996 - Benini, Luca / Benini, Luca