Date
2023-07
Type
Conference Paper
ETH Bibliography
yes
Abstract
We study the problem of learning comparisons between numbers with neural networks. Although comparison is a seemingly simple problem, we find that both general-purpose models such as multilayer perceptrons (MLPs) and arithmetic architectures such as the Neural Arithmetic Logic Unit (NALU) struggle to learn it. Neither architecture can extrapolate to numbers much larger than those seen in the training set. We propose a novel differentiable architecture, the Neural Status Register (NSR), to solve this problem. We experimentally validate the NSR in various settings and combine it with other neural models to solve interesting problems such as piecewise-defined arithmetic, comparison of digit images, recurrent problems, and finding shortest paths in graphs. The NSR outperforms all baseline architectures, especially when extrapolating to larger numbers.
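The comparison task described in the abstract can be illustrated with a soft, differentiable less-than test based on a scaled sigmoid. This is a generic sketch of a differentiable comparison (the function `soft_less_than` and the temperature `tau` are illustrative assumptions), not the NSR architecture itself:

```python
import math

def soft_less_than(a: float, b: float, tau: float = 1.0) -> float:
    """Differentiable surrogate for the indicator a < b.

    Returns sigmoid((b - a) / tau): close to 1 when a < b and
    close to 0 when a > b; tau controls the sharpness of the
    decision. Illustrative only -- not the paper's NSR unit.
    """
    return 1.0 / (1.0 + math.exp(-(b - a) / tau))

print(soft_less_than(2.0, 5.0))            # near 1: 2 < 5
print(soft_less_than(5.0, 2.0))            # near 0: 5 > 2
print(soft_less_than(2e6, 5e6, tau=1e5))   # still near 1 at large magnitudes
```

Because the decision depends only on the sign of the difference (up to scaling), such a unit behaves consistently far outside any small training range, which is the extrapolation property the abstract highlights.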
Publication status
published
Editor
Book title
Proceedings of the 40th International Conference on Machine Learning
Journal / series
Proceedings of Machine Learning Research
Volume
Pages / Article No.
Publisher
PMLR
Event
Subject
Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML); FOS: Computer and information sciences
Organisational unit
03604 - Wattenhofer, Roger / Wattenhofer, Roger
Related publications and datasets
Is new version of: https://doi.org/10.48550/arXiv.2004.07085