dc.contributor.author
Gupta, Shashij
dc.contributor.author
He, Pinjia
dc.contributor.author
Meister, Clara
dc.contributor.author
Su, Zhendong
dc.date.accessioned
2020-12-16T08:15:07Z
dc.date.available
2020-12-11T03:54:54Z
dc.date.available
2020-12-16T08:07:35Z
dc.date.available
2020-12-16T08:15:07Z
dc.date.issued
2020-11
dc.identifier.isbn
978-1-4503-7043-1
en_US
dc.identifier.other
10.1145/3368089.3409756
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/455891
dc.description.abstract
Machine translation software has become heavily integrated into our daily lives due to the recent improvement in the performance of deep neural networks. However, machine translation software has been shown to regularly return erroneous translations, which can lead to harmful consequences such as economic loss and political conflicts. Additionally, due to the complexity of the underlying neural models, testing machine translation systems presents new challenges. To address this problem, we introduce a novel methodology called PatInv. The main intuition behind PatInv is that sentences with different meanings should not have the same translation. Under this general idea, we provide two realizations of PatInv that, given an arbitrary sentence, generate syntactically similar but semantically different sentences by (1) replacing one word in the sentence using a masked language model or (2) removing one word or phrase from the sentence based on its constituency structure. We then test whether the returned translations are the same for the original and modified sentences. We have applied PatInv to test Google Translate and Bing Microsoft Translator using 200 English sentences. Two language settings are considered: English-Hindi (En-Hi) and English-Chinese (En-Zh). The results show that PatInv can accurately find 308 erroneous translations in Google Translate and 223 erroneous translations in Bing Microsoft Translator, most of which cannot be found by state-of-the-art approaches. © 2020 ACM
en_US
dc.language.iso
en
en_US
dc.publisher
ACM
en_US
dc.subject
Testing
en_US
dc.subject
Machine translation
en_US
dc.subject
Pathological Invariance
en_US
dc.title
Machine translation testing via pathological invariance
en_US
dc.type
Conference Paper
dc.date.published
2020-11-08
ethz.book.title
Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
en_US
ethz.pages.start
863
en_US
ethz.pages.end
875
en_US
ethz.event
28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2020) (virtual)
en_US
ethz.event.location
Sacramento, CA, USA
en_US
ethz.event.date
November 8-13, 2020
en_US
ethz.notes
Due to the Coronavirus (COVID-19) the conference was conducted virtually.
en_US
ethz.identifier.scopus
ethz.publication.place
New York, NY
en_US
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02664 - Inst. f. Programmiersprachen u. -systeme / Inst. Programming Languages and Systems::09628 - Su, Zhendong / Su, Zhendong
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02664 - Inst. f. Programmiersprachen u. -systeme / Inst. Programming Languages and Systems::09628 - Su, Zhendong / Su, Zhendong
ethz.date.deposited
2020-12-11T03:55:04Z
ethz.source
SCOPUS
ethz.eth
yes
en_US
ethz.availability
Metadata only
en_US
ethz.rosetta.installDate
2020-12-16T08:07:49Z
ethz.rosetta.lastUpdated
2021-02-15T22:33:05Z
ethz.rosetta.versionExported
true