- Conference Paper
We introduce a new approach for finding and fixing naming issues in source code. The method is based on a careful combination of unsupervised and supervised procedures: (i) unsupervised mining of patterns from Big Code that express common naming idioms. Program fragments violating such idioms indicates likely naming issues, and (ii) supervised learning of a classifier on a small labeled dataset which filters potential false positives from the violations. We implemented our method in a system called Namer and evaluated it on a large number of Python and Java programs. We demonstrate that Namer is effective in finding naming mistakes in real world repositories with high precision (∼70%). Perhaps surprisingly, we also show that existing deep learning methods are not practically effective and achieve low precision in finding naming issues (up to ∼16%). Show more
Book titleProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI '21)
Pages / Article No.
PublisherAssociation for Computing Machinery
SubjectName-based program analysis; Static analysis; Bug detection; Anomaly detection; Machine learning
Organisational unit03948 - Vechev, Martin / Vechev, Martin
MoreShow all metadata