Unit Volume Based Distributed Clustering Using Probabilistic Mixture Model

  • Keunjoon Lee
  • Jinu Joo
  • Jihoon Yang
  • Sungyong Park
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3735)


Extracting useful knowledge from numerous distributed data repositories can be very hard when the data cannot be directly centralized or unified into a single file or database. This paper proposes practical distributed clustering algorithms that do not access the raw data, overcoming the inefficiency of centralized data clustering methods. The aim of this research is to generate a unit volume based probabilistic mixture model from local clustering results without moving the original data. We show that our method is appropriate for distributed clustering when the real data cannot be accessed or centralized.
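The general idea of building a global mixture model from local clustering results, without moving raw data, can be illustrated with a minimal sketch. The abstract does not define the unit volume construction, so the code below does not reproduce the authors' algorithm. It assumes, purely for illustration, that each site shares only a count, a mean, and a per-dimension variance for each local cluster, and that a coordinator combines these summaries into a diagonal Gaussian mixture weighted by cluster size. All names (`ClusterSummary`, `summarize`, `build_mixture`, `assign`) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class ClusterSummary:
    """Hypothetical message a site sends instead of its raw points."""
    count: int          # number of local points in the cluster
    mean: List[float]   # per-dimension mean
    var: List[float]    # per-dimension variance (diagonal covariance)


def summarize(points: Sequence[Sequence[float]]) -> ClusterSummary:
    """Runs at a local site: reduce one local cluster to summary statistics."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    # Floor the variance so a degenerate cluster still yields a valid density.
    var = [max(sum((p[d] - mean[d]) ** 2 for p in points) / n, 1e-6)
           for d in range(dim)]
    return ClusterSummary(n, mean, var)


def build_mixture(summaries: Sequence[ClusterSummary]) -> List[Tuple[float, ClusterSummary]]:
    """Runs at the coordinator: weight each component by its share of points."""
    total = sum(s.count for s in summaries)
    return [(s.count / total, s) for s in summaries]


def gaussian_pdf(x: Sequence[float], s: ClusterSummary) -> float:
    """Density of a diagonal Gaussian, computed in log space for stability."""
    log_p = 0.0
    for xd, m, v in zip(x, s.mean, s.var):
        log_p += -0.5 * math.log(2 * math.pi * v) - (xd - m) ** 2 / (2 * v)
    return math.exp(log_p)


def mixture_pdf(x: Sequence[float], mixture) -> float:
    """Density of the global mixture at x."""
    return sum(w * gaussian_pdf(x, s) for w, s in mixture)


def assign(x: Sequence[float], mixture) -> int:
    """Index of the component with the highest weighted density at x."""
    scores = [w * gaussian_pdf(x, s) for w, s in mixture]
    return max(range(len(scores)), key=scores.__getitem__)


# Two sites, one local cluster each; only the summaries leave the sites.
site_a = [summarize([(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0)])]
site_b = [summarize([(5.0, 5.1), (5.2, 4.9), (4.9, 5.0), (5.1, 5.2)])]
mixture = build_mixture(site_a + site_b)
print(assign((0.1, 0.0), mixture), assign((5.0, 5.0), mixture))
```

Weighting components by point counts is one simple choice for this sketch; the coordinator never sees individual records, only the per-cluster statistics.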


Keywords: Cluster Algorithm, Unit Volume, Mixture Model, Local Cluster, Privacy Preserve
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Keunjoon Lee (1)
  • Jinu Joo (2)
  • Jihoon Yang (2)
  • Sungyong Park (2)
  1. Kookmin Bank, Seoul, Korea
  2. Department of Computer Science and Interdisciplinary Program of Integrated Biotechnology, Sogang University, Seoul, Korea