Frequent Items Mining on Data Streams using Matrix and Scan Reduced Indexing Algorithms

S. Vijayarani, C. Sivamathi, R. Prassanalakshmi

Abstract


Data streams arise in dynamic databases, where data arrive continuously and without bound. Association rule mining is a data mining technique for finding associations between the data items in a database. To generate association rules, the frequent items must first be identified from the transactional database. Conventional frequent-itemset algorithms scan the database multiple times, but this is not feasible for data streams because the underlying data are dynamic. Hence, there is a need for new algorithms that reduce the number of database scans. In this work, two new algorithms, Scan-Reduced Indexing and the Matrix algorithm, are proposed for generating frequent itemsets in data streams. The two algorithms are compared on execution time and on the number of frequent items generated. Experimental results show that the Scan-Reduced Indexing algorithm is more efficient than the Matrix algorithm.
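To illustrate the general idea of finding frequent itemsets without rescanning the data, here is a minimal Python sketch that counts itemsets in a single pass over a stream of transactions. It is not the Scan-Reduced Indexing or Matrix algorithm, whose details are not given in this abstract. The function name `frequent_itemsets_single_pass`, the `min_support` fraction, and the `max_size` cap are hypothetical choices made for illustration only.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets_single_pass(stream, min_support, max_size=2):
    """Count itemsets of size 1..max_size in one pass over `stream`.

    Illustrative sketch only (not the paper's algorithms). `stream` is any
    iterable of transactions; `min_support` is a fraction in (0, 1].
    """
    counts = Counter()
    n_transactions = 0
    for transaction in stream:  # each transaction is seen exactly once
        n_transactions += 1
        items = sorted(set(transaction))  # deduplicate, canonical order
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    threshold = min_support * n_transactions
    # Keep only itemsets whose count meets the support threshold.
    return {s: c for s, c in counts.items() if c >= threshold}


def example_stream():
    # A generator can only be consumed once, mimicking a data stream.
    yield ["a", "b"]
    yield ["a", "c"]
    yield ["a", "b", "c"]
    yield ["b"]


result = frequent_itemsets_single_pass(example_stream(), min_support=0.5)
```

Counting every candidate itemset in memory only scales to small `max_size` values and modest item vocabularies, so this trades memory for the ability to avoid a second scan.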

Keywords


Association rules; Data streams; Database scans; Frequent items; Matrix algorithm; Scan-reduced indexing algorithm

DOI: https://doi.org/10.17509/ajse.v3i2.45345

Copyright (c) 2022 Universitas Pendidikan Indonesia

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

ASEAN Journal of Science and Engineering (AJSE) is published by UPI 
