MaxDomino: Efficiently Mining Maximal Sets
We present MaxDomino, an algorithm for mining maximal frequent sets using a novel concept, the dominancy factor of a transaction. We also propose a hashing scheme that collapses the database into a form containing only unique transactions. Unlike the traditional bottom-up approach with look-aheads, MaxDomino employs a top-down strategy with selective bottom-up search for mining maximal sets. On the connect dataset (a benchmark dataset created by the University of California, Irvine), our experimental results show that MaxDomino outperforms GenMax at higher support levels. Furthermore, our scalability tests show that MaxDomino yields an order-of-magnitude improvement in speed over GenMax. MaxDomino is especially efficient when the maximal frequent sets are longer.
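The idea of collapsing a database to unique transactions can be illustrated with a minimal Python sketch. It is not the paper's hashing scheme, whose details the abstract does not give; it assumes that two transactions are duplicates when they contain the same items regardless of order, and it uses Python's built-in hashing of `frozenset` keys. The names `collapse_transactions` and `support` are hypothetical.

```python
from collections import Counter


def collapse_transactions(transactions):
    """Map each distinct itemset to the number of times it occurs.

    Assumption: item order within a transaction is irrelevant, so each
    transaction is stored as a hashable frozenset.
    """
    return Counter(frozenset(t) for t in transactions)


def support(itemset, collapsed):
    """Count the original transactions that contain every item of itemset."""
    target = frozenset(itemset)
    return sum(count for items, count in collapsed.items() if target <= items)


# Toy database, invented for illustration only.
transactions = [
    ["a", "b", "c"],
    ["c", "b", "a"],
    ["a", "b"],
    ["b", "c"],
    ["a", "b", "c"],
]
collapsed = collapse_transactions(transactions)
```

Under these assumptions, support counting scans each unique transaction once and weights it by its multiplicity, so the cost depends on the number of distinct transactions rather than on the raw database size.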
Keywords: Association Rule, Dominancy Factor, Hash Table, Candidate Subset, Hash Tree
- 1. Agrawal, R., Imielinski, T., Swami, A.: Mining association rules between sets of items in large databases. In: Proceedings of the ACM SIGMOD Conference on Management of Data, Washington, D.C. (1993) 207–216
- 2. Agrawal, R., et al.: The Quest Data Mining System. Technical report, IBM Almaden Research Center (1996b). Retrieved October 10, 2002 from http://www.almaden.ibm.com/cs/quest/
- 3. Agrawal, R., Aggarwal, C., Prasad, V.V.V.: Depth first generation of long patterns. In: 7th International Conference on Knowledge Discovery and Data Mining (2000)
- 4. Burdick, D., Calimlim, M., Gehrke, J.: MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases. In: Intl. Conf. on Data Engineering (2001)
- 5. Zaki, M.J., Gouda, K.: Fast vertical mining using diffsets. RPI Technical Report 01-1 (2001)