Advertisement

Mining Frequent δ-Free Patterns in Large Databases

  • Céline Hébert
  • Bruno Crémilleux
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3735)

Abstract

Mining patterns under constraints in large data (also called fat data) is an important task to benefit from the multiple uses of the patterns embedded in these data sets. It is a difficult task due to the exponential growth of the search space according to the number of attributes. From such contexts, closed patterns can be extracted by using the properties of the Galois connections. But, from the best of our knowledge, there is no approach to extract interesting patterns like δ-free patterns which are on the core of a lot of relevant rules. In this paper, we propose a new method based on an efficient way to compute the extension of a pattern and a pruning criterion to mine frequent δ-free patterns in large databases. We give an algorithm (FTminer) for the practical use of this method. We show the efficiency of this approach by means of experiments on benchmarks and on gene expression data.

Keywords

Large databases δ-free patterns extensions rules condensed representations 

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Agrawal, R., Mannila, H., Srikant, R., Toivonen, H.: Fast discovery of associations rules. In: Advances in Knowledge Discovery and Data Mining (1996)Google Scholar
  2. 2.
    Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Bocca, J.B., Jarke, M., Zaniolo, C. (eds.) Proc. 20th Int. Conf. Very Large Data Bases, VLDB, pp. 487–499. Morgan Kaufmann, San Francisco (1994)Google Scholar
  3. 3.
    Bayardo, R.J.: The hows, whys, and whens of constraints in itemset and rule discovery. In: Proceedings of the workshop on Inductive Databases and Constraint Based Mining (2005)Google Scholar
  4. 4.
    Boulicaut, J.-F., Bykowski, A.: Frequent closures as a concise representation for binary data mining. In: Terano, T., Chen, A.L.P. (eds.) PAKDD 2000. LNCS, vol. 1805, pp. 62–73. Springer, Heidelberg (2000)CrossRefGoogle Scholar
  5. 5.
    Boulicaut, J.-F., Bykowski, A., Rigotti, C.: Approximation of frequency queries by means of free-sets. In: The Fourth European Conference on Principles and Practice of Knowledge Discovery in Databases, Lyon (2000)Google Scholar
  6. 6.
    Boulicaut, J.F., Bykowski, A., Rigotti, C.: Free-sets: a condensed representation of boolean data for the approximation of frequency queries. Data Mining and Knowledge Discovery journal 7(1), 5–22 (2003)CrossRefMathSciNetGoogle Scholar
  7. 7.
    Crémilleux, B., Boulicaut, J.-F.: Simplest rules characterizing classes generated by delta-free sets. In: 22nd Int. Conf. on Knowledge Based Systems and Applied Artificial Intelligence (ES 2002), Cambridge, UK, December 2002, pp. 33–46 (2002)Google Scholar
  8. 8.
    Eiter, T., Gottlob, G.: Identifying the minimal transversals of a hypergraph and related problems. SIAM Journal on Computing archive 24(6), 1278–1304 (1995)CrossRefMathSciNetzbMATHGoogle Scholar
  9. 9.
    Jamison, R.E., Pfaltz, J.L.: Closure spaces that are not uniquely generated. In: Ordinal and Symbolic Data Analysis, Brussels (2000)Google Scholar
  10. 10.
    Jeudy, B., Rioult, F.: Database transposition for constrained closed pattern mining. In: Third International Workshop on Knowledge Discovery in Inductive Databases (2004)Google Scholar
  11. 11.
    Mannila, H., Toivonen, H.: Multiple uses of frequent sets and condensed representations. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD 1996), Portland, Oregon, pp. 189–194 (1996)Google Scholar
  12. 12.
    Mannila, H., Toivonen, H.: Levelwise search and borders of theories in knowledge discovery. Data Mining and Knowledge Discovery 1(3), 241–258 (1997)CrossRefGoogle Scholar
  13. 13.
    Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Discovering frequent closed itemsets for association rules. Lecture Notes in Computer Science (1999)Google Scholar
  14. 14.
    Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Efficient mining of association rules using closed itemset lattices. Data Mining and Knowledge Discovery journal 24(1), 25–46 (1999)Google Scholar
  15. 15.
    Pei, J., Han, J., Mao, R.: CLOSET: An efficient algorithm for mining frequent closed itemsets. In: ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 21–30 (2000)Google Scholar
  16. 16.
    Pfaltz, J.L., Taylor, C.: Closed set mining of biological data. In: Workshop on Data Mining and Bioinformatics (2002)Google Scholar
  17. 17.
    Rioult, F., Boulicaut, J.-F., Crémilleux, B., Besson, J.: Using transposition for pattern discovery from microarray data. In: Zaki, M.J., Aggarwal, C.C. (eds.) Proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2003), San Diego, CA, pp. 73–79 (2003)Google Scholar
  18. 18.
    Zaki, M.J.: Generating non-redundant association rules. In: Proceedings of the 6th International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD 2000), Boston, MA, pp. 34–43 (2000)Google Scholar
  19. 19.
    Zaki, M.J., Hsiao, C.J.: CHARM: an efficient algorithm for closed itemset mining. In: Proceedings of the 2th SIAM international conference on Data Mining, Arlington, pp. 33–43 (2002)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Céline Hébert
    • 1
  • Bruno Crémilleux
    • 1
  1. 1.GREYC, CNRS – UMR 6072Université de CaenCaenFrance

Personalised recommendations