Show simple item record

dc.contributor.author
Cheng, Daning
dc.contributor.author
Li, Shigang
dc.contributor.author
Zhang, H.
dc.contributor.author
Xia, Fen
dc.contributor.author
Zhang, Yunquan
dc.date.accessioned
2021-03-09T09:20:38Z
dc.date.available
2021-03-09T06:04:19Z
dc.date.available
2021-03-09T09:20:38Z
dc.date.issued
2021-07
dc.identifier.issn
1045-9219
dc.identifier.issn
1558-2183
dc.identifier.issn
2161-9883
dc.identifier.other
10.1109/TPDS.2020.3048836
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/473501
dc.description.abstract
As the training dataset size and the model size of machine learning increase rapidly, more computing resources are consumed to speed up the training process. However, the scalability and performance reproducibility of parallel machine learning training, which mainly uses stochastic optimization algorithms, are limited. In this paper, we demonstrate that the sample difference in the dataset plays a prominent role in the scalability of parallel machine learning algorithms. We propose to use statistical properties of the dataset to measure sample differences. These properties include the variance of sample features, sample sparsity, sample diversity, and similarity in sampling sequences. We choose four types of parallel training algorithms as our research objects: (1) the asynchronous parallel SGD algorithm (Hogwild! algorithm), (2) the parallel model average SGD algorithm (minibatch SGD algorithm), (3) the decentralization optimization algorithm, and (4) the dual coordinate optimization (DADM algorithm). Our results show that the statistical properties of training datasets determine the scalability upper bound of these parallel training algorithms. © 1990-2012 IEEE.
en_US
dc.language.iso
en
en_US
dc.publisher
Institute of Electrical and Electronics Engineers
en_US
dc.subject
Parallel training algorithms
en_US
dc.subject
training dataset
en_US
dc.subject
scalability
en_US
dc.subject
stochastic optimization methods
en_US
dc.title
Why Dataset Properties Bound the Scalability of Parallel Machine Learning Training Algorithms
en_US
dc.type
Journal Article
dc.date.published
2021-01-06
ethz.journal.title
IEEE Transactions on Parallel and Distributed Systems
ethz.journal.volume
32
en_US
ethz.journal.issue
7
en_US
ethz.journal.abbreviated
IEEE Trans. Parallel Distrib. Syst.
ethz.pages.start
1702
en_US
ethz.pages.end
1712
en_US
ethz.identifier.scopus
ethz.publication.place
New York, NY
en_US
ethz.publication.status
published
en_US
ethz.date.deposited
2021-03-09T06:04:40Z
ethz.source
SCOPUS
ethz.eth
yes
en_US
ethz.availability
Metadata only
en_US
ethz.rosetta.installDate
2021-03-09T09:20:49Z
ethz.rosetta.lastUpdated
2021-03-09T09:20:49Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Why%20Dataset%20Properties%20Bound%20the%20Scalability%20of%20Parallel%20Machine%20Learning%20Training%20Algorithms&rft.jtitle=IEEE%20Transactions%20on%20Parallel%20and%20Distributed%20Systems&rft.date=2021-07&rft.volume=32&rft.issue=7&rft.spage=1702&rft.epage=1712&rft.issn=1045-9219&1558-2183&2161-9883&rft.au=Cheng,%20Daning&Li,%20Shigang&Zhang,%20H.&Xia,%20Fen&Zhang,%20Yunquan&rft.genre=article&rft_id=info:doi/10.1109/TPDS.2020.3048836&

Files in this item

There are no files associated with this item.
