Open access
Date
2022
Type
- Conference Paper
ETH Bibliography
yes
Abstract
We give tight statistical query (SQ) lower bounds for learning halfspaces in the presence of Massart noise. In particular, suppose that all labels are corrupted with probability at most \eta. We show that for arbitrary \eta in [0, 1/2], every SQ algorithm achieving misclassification error better than \eta requires queries of superpolynomial accuracy or at least a superpolynomial number of queries. Further, this continues to hold even if the information-theoretically optimal error OPT is as small as exp(-log^c(d)), where d is the dimension and 0 < c < 1 is an arbitrary absolute constant, and an overwhelming fraction of examples are noiseless. Our lower bound matches known polynomial-time algorithms, which are also implementable in the SQ framework. Previously, such lower bounds only ruled out algorithms achieving error OPT + \epsilon, or error better than \Omega(\eta), or, if \eta is close to 1/2, error \eta - o(1), where the term o(1) is constant in d but goes to 0 as \eta approaches 1/2. As a consequence, we also show that achieving misclassification error better than 1/2 in the (A, \alpha)-Tsybakov model is SQ-hard for A constant and \alpha bounded away from 1.
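For readers less familiar with the noise model, the following minimal Python sketch (not from the paper; all names and parameter values are illustrative) shows how Massart-noisy halfspace examples are generated: a hidden halfspace labels each point, and an adversary flips each label independently with a point-dependent probability eta(x) bounded by \eta.

import numpy as np

rng = np.random.default_rng(0)

d, n, eta = 10, 1000, 0.3           # dimension, sample size, noise bound (hypothetical values)
w = rng.standard_normal(d)          # hidden halfspace normal vector
X = rng.standard_normal((n, d))     # examples drawn from a standard Gaussian
clean = np.sign(X @ w)              # noiseless halfspace labels sign(<w, x>)

# Massart (bounded) noise: the adversary picks flip probabilities in [0, eta];
# here an arbitrary point-dependent choice is used purely for illustration.
flip_prob = eta * rng.random(n)
flips = rng.random(n) < flip_prob
y = np.where(flips, -clean, clean)

# The information-theoretically optimal error OPT = E[eta(x)] can be far below
# eta (many examples may even be noiseless), yet the paper shows SQ algorithms
# still cannot achieve misclassification error better than eta.
print("empirical noise rate:", flips.mean())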
Permanent link
https://doi.org/10.3929/ethz-b-000595461
Publication status
published
External links
Book title
Proceedings of Thirty Fifth Conference on Learning Theory
Journal / series
Proceedings of Machine Learning Research
Volume
Pages / Article No.
Publisher
PMLR
Event
Subject
Massart Noise; PAC learning; Statistical query lower bounds
Organisational unit
09622 - Steurer, David / Steurer, David
Funding
815464 - Unified Theory of Efficient Optimization and Estimation (EC)