dc.contributor.author
d’Errico, Maria
dc.contributor.author
Facco, Elena
dc.contributor.author
Laio, Alessandro
dc.contributor.author
Rodriguez, Alex
dc.date.accessioned
2021-03-16T21:36:31Z
dc.date.available
2021-03-14T05:25:54Z
dc.date.available
2021-03-16T21:36:31Z
dc.date.issued
2021-06
dc.identifier.issn
0020-0255
dc.identifier.issn
1872-6291
dc.identifier.other
10.1016/j.ins.2021.01.010
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/474338
dc.description.abstract
Data analysis in high-dimensional spaces aims at obtaining a synthetic description of a data set, revealing its main structure and its salient features. We here introduce an approach providing this description in the form of a topography of the data, namely a human-readable chart of the probability density from which the data are harvested. The approach is based on an unsupervised extension of Density Peak clustering and on a non-parametric density estimator that measures the probability density in the manifold containing the data. This allows finding automatically the number and the height of the peaks of the probability density, and the depth of the “valleys” separating them. Importantly, the density estimator provides a measure of the error, which allows distinguishing genuine density peaks from density fluctuations due to finite sampling. The approach thus provides robust and visual information about the density peaks height, their statistical reliability and their hierarchical organization, offering a conceptually powerful extension of the standard clustering partitions. We show that this framework is particularly useful in the analysis of complex data sets. © 2021 Elsevier
en_US
dc.language.iso
en
en_US
dc.publisher
Elsevier
en_US
dc.subject
Clustering-algorithm
en_US
dc.subject
High-dimensional-data
en_US
dc.subject
Hierarchy-visualization
en_US
dc.subject
Density-peak-clustering
en_US
dc.subject
Non-parametric-density-estimation
en_US
dc.title
Automatic topography of high-dimensional data sets by non-parametric density peak clustering
en_US
dc.type
Journal Article
dc.date.published
2021-01-26
ethz.journal.title
Information Sciences
ethz.journal.volume
560
en_US
ethz.journal.abbreviated
Inf. sci. (Print)
ethz.pages.start
476
en_US
ethz.pages.end
492
en_US
ethz.identifier.scopus
ethz.publication.place
Oxford
en_US
ethz.publication.status
published
en_US
ethz.date.deposited
2021-03-14T05:26:03Z
ethz.source
SCOPUS
ethz.eth
yes
en_US
ethz.availability
Metadata only
en_US
ethz.rosetta.installDate
2021-03-16T21:36:53Z
ethz.rosetta.lastUpdated
2021-03-16T21:36:53Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Automatic%20topography%20of%20high-dimensional%20data%20sets%20by%20non-parametric%20density%20peak%20clustering&rft.jtitle=Information%20Sciences&rft.date=2021-06&rft.volume=560&rft.spage=476&rft.epage=492&rft.issn=0020-0255&1872-6291&rft.au=d%E2%80%99Errico,%20Maria&Facco,%20Elena&Laio,%20Alessandro&Rodriguez,%20Alex&rft.genre=article&rft_id=info:doi/10.1016/j.ins.2021.01.010&