Advertisement

An Indicator for the Number of Clusters: Using a Linear Map to Simplex Structure

  • Marcus Weber
  • Wasinee Rungsarityotin
  • Alexander Schliep
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

The problem of clustering data can be formulated as a graph partitioning problem. In this setting, spectral methods for obtaining optimal solutions have received a lot of attention recently. We describe Perron Cluster Cluster Analysis (PCCA) and establish a connection to spectral graph partitioning. We show that in our approach a clustering can be efficiently computed by mapping the eigenvector data onto a simplex. To deal with the prevalent problem of noisy and possibly overlapping data we introduce the Min-chi indicator which helps in confirming the existence of a partition of the data and in selecting the number of clusters with quite favorable performance. Furthermore, if no hard partition exists in the data, the Min-chi can guide in selecting the number of modes in a mixture model. We close with showing results on simulated data generated by a mixture of Gaussians.

Keywords

Mixture Model Spectral Cluster Simplex Algorithm Stochastic Matrix Hard Partition 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. DAHLHAUS, E., JOHNSON, D. S., PAPADIMITRIOU, C. H., SEYMOUR, P. D. and M. YANNAKAKIS (1994): The complexity of multiterminal cuts. SIAM J. Comput., 23(4):864–894.CrossRefMathSciNetGoogle Scholar
  2. DEUFLHARD, P. and WEBER, M. (2005): Robust Perron Cluster Analysis in Conformation Dynamics. Lin. Alg. App., Special Issue on Matrices and Mathematical Biology, 398c:161–184.MathSciNetGoogle Scholar
  3. HASTIE, T., TIBSHIRANI, R. and FRIEDMAN, J. (2001): The Elements of Statistical Learning. Springer, Berlin.Google Scholar
  4. JAIN, A.K. and DUBES, R.C. (1988): Algorithms for clustering data. Prentice Hall, Engelwood Cliffs.Google Scholar
  5. KANNAN, R., VEMPALA, S. and VETTA, A. (1999): On Clusterings: Good, Bad and Spectral. Proceedings of IEEE Foundations of Computer Science.Google Scholar
  6. MCLACHLAN, G.J. and BASFORD, K.E. (1988): Mixture Models: Inference and Applications to Clustering. Marcel Dekker, Inc., New York, Basel.Google Scholar
  7. NG, A. Y., JORDAN, M. and WEISS, J (2002): On spectral clustering: Analysis and an algorithm. Advances in Neural Information Processing Systems 14.Google Scholar
  8. SHI, J. and MALIK, J. (2000): Normalized Cuts and Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905.Google Scholar
  9. WEBER, M. (2004): Clustering by using a simplex structure. Technical report, ZR-04-03, Zuse Institute Berlin.Google Scholar
  10. WEBER, M. and GALLIAT, T (2002): Characterization of transition states in conformational dynamics using Fuzzy sets. Technical Report 02-12, Zuse Institute Berlin (ZIB).Google Scholar

Copyright information

© Springer Berlin · Heidelberg 2006

Authors and Affiliations

  • Marcus Weber
    • 1
  • Wasinee Rungsarityotin
    • 2
  • Alexander Schliep
    • 2
  1. 1.Zuse Institute Berlin ZIBBerlinGermany
  2. 2.Computational Molecular BiologyMax Planck Institute for Molecular GeneticsBerlinGermany

Personalised recommendations