Improving Syntactical Clone Detection Methods through the Use of an Intermediate Representation
dc.contributor.author
Caldeira, Pedro M.
dc.contributor.author
Sakamoto, Kazunori
dc.contributor.author
Washizaki, Hironori
dc.contributor.author
Fukazawa, Yoshiaki
dc.contributor.author
Takahisa, Shimada
dc.date.accessioned
2020-09-22T06:57:05Z
dc.date.available
2020-02-03T16:13:38Z
dc.date.available
2020-02-04T09:19:12Z
dc.date.available
2020-02-13T13:41:46Z
dc.date.available
2020-02-21T11:24:46Z
dc.date.available
2020-05-13T15:04:32Z
dc.date.available
2020-09-22T06:57:05Z
dc.date.issued
2020
dc.identifier.isbn
978-1-7281-6269-0
en_US
dc.identifier.isbn
978-1-7281-6270-6
en_US
dc.identifier.other
10.1109/IWSC50091.2020.9047637
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/396812
dc.identifier.doi
10.3929/ethz-b-000396812
dc.description.abstract
Detection of type-3 and type-4 clones remains a difficult task. Current methods are complex on both a conceptual and a computational level. Similarly, their usage requires substantial implementation effort. Instead of creating yet another method, it might be more productive to combine the simplicity of syntactic approaches with the abstractions granted by intermediate representations (IRs). To this end, we devised a C-like IR based on LLVM and ran NiCad on it (LLNiCad). To establish whether the clone detection capabilities of syntactic approaches can be improved through an IR, we compared NiCad and LLNiCad on three open source projects taken from Krutz's benchmark and on a subset of Google Code Jam (GCJ) solutions. In our results, LLNiCad consistently achieves a higher F1-score than NiCad. Indeed, for all clone types in Krutz's benchmark, LLNiCad has an F1-score that is 37% higher than NiCad's, with both better precision and better recall. For type-4 clones in our GCJ benchmark, the F1-score of LLNiCad also exceeds that of CCCD (a semantic clone detector) by 44%. These findings suggest that IRs are beneficial for improving clone detection and that they have a larger impact on type-3 and type-4 clones.
en_US
dc.format
application/pdf
en_US
dc.language.iso
en
en_US
dc.publisher
IEEE
en_US
dc.rights.uri
http://rightsstatements.org/page/InC-NC/1.0/
dc.title
Improving Syntactical Clone Detection Methods through the Use of an Intermediate Representation
en_US
dc.type
Conference Paper
dc.rights.license
In Copyright - Non-Commercial Use Permitted
dc.date.published
2020-03-26
ethz.book.title
2020 IEEE 14th International Workshop on Software Clones (IWSC)
en_US
ethz.pages.start
8
en_US
ethz.pages.end
14
en_US
ethz.version.deposit
acceptedVersion
en_US
ethz.event
14th International Workshop on Software Clones (IWSC 2020)
en_US
ethz.event.location
London, Canada
ethz.event.date
February 18, 2020
en_US
ethz.notes
Conference lecture on February 18, 2020.
en_US
ethz.publication.place
Piscataway, NJ
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02045 - Dep. Geistes-, Sozial- u. Staatswiss. / Dep. of Humanities, Social and Pol.Sc.::02527 - Institut für Verhaltenswissenschaften / Institute of Behavioral Sciences::09590 - Kapur, Manu / Kapur, Manu
en_US
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02045 - Dep. Geistes-, Sozial- u. Staatswiss. / Dep. of Humanities, Social and Pol.Sc.::02527 - Institut für Verhaltenswissenschaften / Institute of Behavioral Sciences::09590 - Kapur, Manu / Kapur, Manu
en_US
ethz.date.deposited
2020-02-03T16:13:47Z
ethz.source
FORM
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.rosetta.installDate
2020-02-04T09:19:22Z
ethz.rosetta.lastUpdated
2024-02-02T12:07:42Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Improving%20Syntactical%20Clone%20Detection%20Methods%20through%20the%20Use%20of%20an%20Intermediate%20Representation&rft.date=2020&rft.spage=8&rft.epage=14&rft.au=Caldeira,%20Pedro%20M.&Sakamoto,%20Kazunori&Washizaki,%20Hironori&Fukazawa,%20Yoshiaki&Takahisa,%20Shimada&rft.isbn=978-1-7281-6269-0&978-1-7281-6270-6&rft.genre=proceeding&rft_id=info:doi/10.1109/IWSC50091.2020.9047637&rft.btitle=2020%20IEEE%2014th%20International%20Workshop%20on%20Software%20Clones%20(IWSC)