Frequent item set mining using normalized FP-growth algorithm

  • Authors

    • N K. Manikandan
    • D Manivannan
    2018-02-05
    https://doi.org/10.14419/ijet.v7i1.7.9573
  • Item Setmining, FP Growth Algorithm, Association Rule Mining.
  • Abstract

    As the volume of data and its storage schemes are increasing drastically, the knowledge discovery from these huge volume of heterogeneous and high dimension data emerges as an essential process. Number of algorithms for data association analysis has been introduced considering time and main memory requirements. However this algorithms get completed when the items and records grows extremely high. In this paper we have created a data structure that can be reused without modifying the schema. So the aim of this work is to make an efficient association rule mining independent of the algorithm being selected.

    Our data structure make data access much faster by simplifying and reorganizing them by implementing shuffling strategy using hamming distance and inverted index mapping. In this work we explore our algorithm’s efficiency by using many datasets containing millions of records in it. We increased the runtime orders of the magnitude and reduced the auxiliary and main memory requirements.

     

  • References

    1. [1] R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules in Large Databases,†Proc 20th Int’l Conf. Very Large Data Bases (VLDB), pp. 487-499, 1994.

      [2] R. Agrawal and R. Srikant, “Privacy-Preserving Data Mining,†Proc. ACM SIGMOD Conf., pp. 439-450, 2000. https://doi.org/10.1145/342009.335438.

      [3] D. Beaver, S. Micali, and P. Rogaway, “The Round Complexity of Secure Protocols,†Proc. 22nd Ann. ACM Symp. Theory of Computing (STOC), pp. 503-513, 1990. https://doi.org/10.1145/100216.100287.

      [4] M. Bellare, R. Canetti, and H. Krawczyk, “Keying Hash Functions for Message Authentication,†Proc. 16th Ann. Int’l Cryptology Conf. Advances in Cryptology (Crypto), pp. 1-15, 1996. https://doi.org/10.1007/3-540-68697-5_1.

      [5] J.C. Benaloh, “Secret Sharing Homomorphisms: Keeping Shares of a Secret Secret,†Proc. Advances in Cryptology (Crypto), pp. 251-260, 1986.

      [6] J. Brickell and V. Shmatikov, “Privacy-Preserving Graph Algorithms in the Semi-Honest Model,†Proc. 11th Int’l Conf. Theory and Application of Cryptology and Information Security (ASIACRYPT), pp. 236-252, 2005. https://doi.org/10.1007/11593447_13.

      [7] D.W.L. Cheung, J. Han, V.T.Y. Ng, A.W.C. Fu, and Y. Fu, “A Fast Distributed Algorithm for Mining Association Rules,†Proc. Fourth Int’l Conf. Parallel and Distributed Information Systems (PDIS), pp. 31-42, 1996. https://doi.org/10.1109/PDIS.1996.568665.

      [8] D.W.L Cheung, V.T.Y. Ng, A.W.C. Fu, and Y. Fu, “Efficient Mining of Association Rules in Distributed Databases,†IEEE Trans. Knowledge and Data Eng., vol. 8, no. 6, Dec. 1996. https://doi.org/10.1109/69.553158.

      [9] T. ElGamal, “A Public Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms,†IEEE Trans. Information Theory, vol. IT-31, no. 4, July 1985. https://doi.org/10.1007/3-540-39568-7_2.

      [10] A.V. Evfimievski, R. Srikant, R. Agrawal, and J. Gehrke, “Privacy Preserving Mining of Association Rules,†Proc. Eighth ACM SIGKDD Int’l Conf. Knowledge Discovery and Data Mining (KDD), pp. 217-228, 2002. https://doi.org/10.1145/775047.775080.

      [11] R. Fagin, M. Naor, and P. Winkler, “Comparing Information without Leaking It,†Comm. ACM, vol. 39, pp. 77-85, 1996. https://doi.org/10.1145/229459.229469.

      [12] M. Freedman, Y. Ishai, B. Pinkas, and O. Reingold, “Keyword Search and Oblivious Pseudorandom Functions,†Proc. Second Int’l Conf. Theory of Cryptography (TCC), pp. 303-324, 2005. https://doi.org/10.1007/978-3-540-30576-7_17.

      [13] M.J. Freedman, K. Nissim, and B. Pinkas, “Efficient Private Matching and Set Intersection,†Proc. Int’l Conf. Theory and Applications of Cryptographic Techniques (EUROCRYPT), pp. 1-19, 2004. https://doi.org/10.1007/978-3-540-24676-3_1.

      [14] O. Goldreich, S. Micali, and A. Wigderson, “How to Play Any Mental Game or a Completeness Theorem for Protocols with Honest Majority,†Proc. 19th Ann. ACM Symp. Theory of Computing (STOC), pp. 218-229, 1987.

      [15] H. Grosskreutz, B. Lemmen, and S. R€uping, “Secure Distributed Subgroup Discovery in Horizontally Partitioned Data,†Trans. Data Privacy, vol. 4, no. 3, pp. 147-165, 2011.

      [16] W. Jiang and C. Clifton, “A Secure Distributed Framework for Achieving k-Anonymity,†The VLDB J., vol. 15, pp. 316-333, 2006. https://doi.org/10.1007/s00778-006-0008-z.

      [17] M. Kantarcioglu and C. Clifton, “Privacy-Preserving Distributed Mining of Association Rules on Horizontally Partitioned Data,†IEEE Trans. Knowledge and Data Eng., vol. 16, no. 9, pp. 1026-1037,Sept. 2004. https://doi.org/10.1109/TKDE.2004.45.

      [18] M. Kantarcioglu, R. Nix, and J. Vaidya, “An Efficient Approximate Protocol for Privacy-Preserving Association Rule Mining,†Proc. 13th Pacific-Asia Conf. Advances in Knowledge Discovery and Data Mining (PAKDD), pp. 515-524, 2009. https://doi.org/10.1007/978-3-642-01307-2_48.

      [19] L. Kissner and D.X. Song, “Privacy-Preserving Set Operations,†Proc. 25th Ann. Int’l Cryptology Conf. (CRYPTO), pp. 241-257, 2005.

  • Downloads

  • How to Cite

    K. Manikandan, N., & Manivannan, D. (2018). Frequent item set mining using normalized FP-growth algorithm. International Journal of Engineering & Technology, 7(1.7), 59-61. https://doi.org/10.14419/ijet.v7i1.7.9573

    Received date: 2018-02-17

    Accepted date: 2018-02-17

    Published date: 2018-02-05