Bit-exact ECC recovery (BEER): Determining DRAM on-die ECC functions by exploiting DRAM data retention characteristics
Metadata only
Date
2020
Type
- Conference Paper
Abstract
Increasing single-cell DRAM error rates have pushed DRAM manufacturers to adopt on-die error-correction coding (ECC), which operates entirely within a DRAM chip to improve factory yield. The on-die ECC function and its effects on DRAM reliability are considered trade secrets, so only the manufacturer knows precisely how on-die ECC alters the externally-visible reliability characteristics. Consequently, on-die ECC obstructs third-party DRAM customers (e.g., test engineers, experimental researchers), who typically design, test, and validate systems based on these characteristics.
To give third parties insight into precisely how on-die ECC transforms DRAM error patterns during error correction, we introduce Bit-Exact ECC Recovery (BEER), a new methodology for determining the full DRAM on-die ECC function (i.e., its parity-check matrix) without hardware tools, prerequisite knowledge about the DRAM chip or on-die ECC mechanism, or access to ECC metadata (e.g., error syndromes, parity information). BEER exploits the key insight that non-intrusively inducing data-retention errors with carefully-crafted test patterns reveals behavior that is unique to a specific ECC function.
We use BEER to identify the ECC functions of 80 real LPDDR4 DRAM chips with on-die ECC from three major DRAM manufacturers. We evaluate BEER’s correctness in simulation and performance on a real system to show that BEER is effective and practical across a wide range of on-die ECC functions. To demonstrate BEER’s value, we propose and discuss several ways that third parties can use BEER to improve their design and testing practices. As a concrete example, we introduce and evaluate BEEP, the first error profiling methodology that uses the known on-die ECC function to recover the number and bit-exact locations of unobservable raw bit errors responsible for observable post-correction errors. © 2020 IEEE
Publication status
published
External links
Book title
2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)
Pages / Article No.
Publisher
IEEE
Event
Organisational unit
09483 - Mutlu, Onur / Mutlu, Onur
Related publications and datasets
Is referenced by: https://doi.org/10.3929/ethz-b-000542542
Notes
Due to the Coronavirus (COVID-19), the conference was conducted virtually.