Metadata only
Date
2021-06
Type
- Conference Paper
Abstract
We introduce a new approach for finding and fixing naming issues in source code. The method is based on a careful combination of unsupervised and supervised procedures: (i) unsupervised mining of patterns from Big Code that express common naming idioms, where program fragments violating such idioms indicate likely naming issues, and (ii) supervised learning of a classifier on a small labeled dataset, which filters potential false positives from the violations. We implemented our method in a system called Namer and evaluated it on a large number of Python and Java programs. We demonstrate that Namer is effective in finding naming mistakes in real-world repositories with high precision (∼70%). Perhaps surprisingly, we also show that existing deep learning methods are not practically effective and achieve low precision in finding naming issues (up to ∼16%).
Publication status
published
Book title
Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI '21)
Publisher
Association for Computing Machinery
Subject
Name-based program analysis; Static analysis; Bug detection; Anomaly detection; Machine learning
Organisational unit
03948 - Vechev, Martin / Vechev, Martin