dc.contributor.author | Møllersen, Kajsa | |
dc.contributor.author | Dhar, Subhra | |
dc.contributor.author | Godtliebsen, Fred | |
dc.date.accessioned | 2017-03-17T13:35:26Z | |
dc.date.available | 2017-03-17T13:35:26Z | |
dc.date.issued | 2016-09-12 | |
dc.description.abstract | Hybrid clustering combines partitional and hierarchical clustering for computational effectiveness and versatility in cluster shape. In such clustering, a dissimilarity measure plays a crucial role in the hierarchical merging. The dissimilarity measure has great impact on the final clustering, and data-independent properties are needed to choose the right dissimilarity measure for the problem at hand. Properties for distance-based dissimilarity measures have been studied for decades, but properties for density-based dissimilarity measures have so far received little attention. Here, we propose six data-independent properties to evaluate density-based dissimilarity measures associated with hybrid clustering, regarding equality, orthogonality, symmetry, outlier and noise observations, and light-tailed models for heavy-tailed clusters. The significance of the properties is investigated, and we study some well-known dissimilarity measures based on Shannon entropy, misclassification rate, Bhattacharyya distance and Kullback-Leibler divergence with respect to the proposed properties. As none of them satisfy all the proposed properties, we introduce a new dissimilarity measure based on the Kullback-Leibler information and show that it satisfies all proposed properties. The effect of the proposed properties is also illustrated on several real and simulated data sets. | en_US |
dc.description | Source: doi: 10.4236/am.2016.715143 (http://dx.doi.org/10.4236/am.2016.715143) | en_US |
dc.identifier.citation | Møllersen, K., Dhar, S.S. and Godtliebsen, F. (2016) On Data-Independent Properties for Density-Based Dissimilarity Measures in Hybrid Clustering. Applied Mathematics, 7, 1674-1706. http://dx.doi.org/10.4236/am.2016.715143 | en_US |
dc.identifier.cristinID | FRIDAID 1438272 | |
dc.identifier.doi | 10.4236/am.2016.715143 | |
dc.identifier.issn | 2152-7385 | |
dc.identifier.issn | 2152-7393 | |
dc.identifier.uri | https://hdl.handle.net/10037/10769 | |
dc.language.iso | eng | en_US |
dc.publisher | Scientific Research Publishing | en_US |
dc.relation.journal | Applied Mathematics | |
dc.rights.accessRights | openAccess | en_US |
dc.subject | VDP::Matematikk og Naturvitenskap: 400::Matematikk: 410 | en_US |
dc.subject | VDP::Mathematics and natural science: 400::Mathematics: 410 | en_US |
dc.title | On Data-Independent Properties for Density-Based Dissimilarity Measures in Hybrid Clustering | en_US |
dc.type | Journal article | en_US |
dc.type | Tidsskriftartikkel | en_US |
dc.type | Peer reviewed | en_US |