When Your Infrastructure Is a Buggy Program: Understanding Faults in Infrastructure as Code Ecosystems
Abstract
Modern applications have become increasingly complex and their manual installation and configuration is no longer practical. Instead, IT organizations heavily rely on Infrastructure as Code (IaC) technologies, to automate the provisioning, configuration, and maintenance of computing infrastructures and systems. IaC systems typically offer declarative, domain-specific languages (DSLs) that allow system administrators and developers to write high-level programs that specify the desired state of their infrastructure in a reliable, predictable, and documented fashion. Just like traditional programs, IaC software is not immune to faults, with issues ranging from deployment failures to critical misconfigurations that often impact production systems used by millions of end users. Surprisingly, despite its crucial role in global infrastructure management, the tooling and techniques for ensuring IaC reliability still have room for improvement. In this work, we conduct a comprehensive analysis of 360 bugs identified in IaC software within prominent IaC ecosystems including Ansible, Puppet, and Chef. Our work is the first in-depth exploration of bug characteristics in these widely-used IaC environments. Through our analysis we aim to understand: (1) how these bugs manifest, (2) their underlying root causes, (3) their reproduction requirements in terms of system state (e.g., operating system versions) or input characteristics, and (4) how these bugs are fixed. Based on our findings, we evaluate the state-of-the-art techniques for IaC reliability, identify their limitations, and provide a set of recommendations for future research. We believe that our study helps researchers to (1) better understand the complexity and peculiarities of IaC software, and (2) develop advanced tooling for more reliable and robust system configurations. Show more
Permanent link
https://doi.org/10.3929/ethz-b-000702587Publication status
publishedExternal links
Journal / series
Proceedings of the ACM on Programming LanguagesVolume
Pages / Article No.
Publisher
Association for Computing MachinerySubject
IaC; infrastructure as code; bug; Puppet; Ansible; Chef; testing; deploymentMore
Show all metadata
ETH Bibliography
yes
Altmetrics