Privacy Preserving Decomposable Mining Association Rules on Distributed Data

 
 
 
  • Abstract
  • Keywords
  • References
  • PDF
  • Abstract


    In distributed computing, data sharing is inevitable, however, moving local databases from one site to another should be avoided because of the computational overhead and privacy consideration. Most of the data mining algorithms are designed assuming that data repository is stored locally. This paper presents a scheme and algorithms for mining association rules in geographically distributed data. The proposed scheme preserves data privacy of the different geographical site by passing secure messages between them. The algorithms minimize the communication cost by exchanging statistical summaries of the local databases. We provide a privacy and security analysis that shows the privacy preserving aspects of the proposed algorithms. Moreover, the paper presents extensive simulation experiments to evaluate the efficiency of the proposed scheme.

     

     


  • Keywords


    Decomposable algorithm, secure data mining, association rules, vertically and horizontally distributed database.

  • References


      [1] Atallah M. Bertino E., Elmagarmid A., Ibrahim M., and Verykios V., (1999), Disclosure limitation of sensitive rules,” Proceedings of the Workshop on Knowledge and Data Engineering Exchange, Chicago, IL, November, pp.45-52.

      [2] Agrawal T. Imielinski, and A. Swami, (1993),Mining association rules between sets of items in large databases, In SIGMOD 93 International Csonference on Management of Data, pp. 207-216.

      [3] Agrawal, R. and R. Srikant, (1994), Fast algorithms for mining association rules,” Proceedings of the 20th International Conference on Very Large Data Bases, Santiago, Chile, September, pp.487-499,.

      [4] Agrawal R. and Shafer J.C, (1996), Parallel mining of association rules,” IEEE Transactions on Knowledge and Data Engineering, Vol. 8, No. 6, pp.929-969.

      [5] Cheung D. W.-L., Han, J., Ng, V. T. Y., Fu A. W.-C. and Fu, Y. (1996), A fast distributed algorithm for mining association rules,” Proceedings of the 1996 International Conference on Parallel and Distributed Information Systems, Miami Beach, Florida, December, pp.21-42.

      [6] Savasere, A., Omiecinski, E. and Navathe, S., (1995), An efficient algorithm for mining association rules in large databases,” Proceedings of the 21th International Conference on Very Large Data Bases, Zurich, Switzerland, September, pp.432-444.

      [7] Brickell J. and Shmatikov V., (2005), Privacy-preserving graph algorithm in the semi-honest model,” in Roy, B. (Ed.): Lecture Notes in Compute Science, Vol. 3788, Springer-Verlag, pp.236-252.

      [8] Canetti R., (2000), Security and composition of multiparty cryptographic protocols,” Journal of Cryptology Vol. 13, No. 1, pp.143-202.

      [9] Lindell, Y. and Pinkas, B., (2002), Privacy preserving data mining, Journal of Cryptology, Vol. 15, No. 3, pp.177-206.

      [10] Dwork, C. and Nissim, K., (2004), “Privacy-preserving data mining on vertically partitioned databases,” in Franklin, M.K. (Ed.): Lecture Notes in Computer Science, Vol. 3152, Springer-Verlag, pp.528-544.

      [11] Kantarcioglu, M. and Clifton, C., (2004), “Privacy- preserving distributed mining of association rules on horizontally partition data,” IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 9, pp.1026-1037.

      [12] Veloso, A.A., Meira Jr., W., Parthasarathy, S. and de Carvalho, M.B., (2003), “Efficient, accurate and privacy- preserving data mining for frequent itemsets in distributed databases,” Proceedings of the Brazilian Symposium on Databases, Manaus, Amazonas, Brazil, October, pp.281- 292.

      [13] Chang, C.-C. and Lin, C.-Y. (2005), Perfect hashing schemes for mining association Rules,” The Computer Journal, Vol. 48, No. 2, pp.168-179.

      [14] Park, J.S., Chen, M.-S. and Yu, P.S., (1995), An effective hash-based algorithm for mining association rules,” Proceedings of the 1995 ACM-SIGMOD International Conference on Management of Data, San Jose, CA, May, pp.175-186.

      [15] Agarwal, R.C., Aggarwal,C.C. and Prasad,V., (2001), A tree projection algorithm for generation of frequent itemsets,” Journal of Parallel and Distributed Computing, Vol. 61, No. 3, pp.350-371.

      [16] Grahne, G. and Zhu, J., (2005), Fast algorithms for frequent itemset mining using FP-trees,” IEEE Transactions on Knowledge and Data Engineering, Vol. 17, No. 10, pp.1347-1362.

      [17] Han, J., Pei, J., Yin, Y. and R. Mao, (2004), Mining frequent pattern without candidate generation: a frequent pattern tree approach, Data Mining and Knowledge Discovery, Vol. 8, No. 1, pp.53-87.

      [18] Li, Y.-C. and C-C. Chang, (2004), A new FP-tree algorithm for mining frequent itemsets,” in Chi, C.-H. and Lam, K.-Y. (Eds.): Lecture Notes in Computer Science, Vol. 3309, Springer-V erlag, pp.266-277.

      [19] Khedr A. M. and R. Bhatnagar, (2007), Agents for Integrating Distributed Data for Complex Computations, Computing and Informatics Journal , vol. 26, No.2, pp. 149-170.

      [20] Khedr A. M., (2011), Nearest Neighbor Clustering over Partitioned Data, Computing and Informatics, Vol. 30, pp. 1001ñ1026.

      [21] Khedr A. M. and A. Salim,(2008), Decomposable Algorithms for Finding the Nearest Pair}, J. Parallel Distrib. Comput. vol.68, pp. 902-912.

      [22] Khedr A. M., Learning k-Classifier from Distributed Databases}, Computing and Informatics Journal, Vol. (27), pp. 355-376, 2008.

      [23] Tamir Tassa, (2011), Secure Mining of Association Rules in Horizontally Distributed Databases, arXiv:1106.5113v1.

      [24] Cheung D. W.-L., J. Han, V. Ng, A. W.-C. Fu, and Y. Fu, (1996), A fast distributed algorithm for mining association rules. In PDIS, page 3142.

      [25] Kantarcioglu M. and C. Clifton, (2004), Privacy-preserving distributed mining of association rules on horizontally partitioned data. IEEE Transactions on Knowledge and Data Engineering,, 16(9):10261037.

      [26] Chia M. H., D. E. Neiman, and V. R. Lesser, (1987), Poaching and distraction in asynchronous agent activities}, ICMAS, pp. 88-95.

      [27] Durfee E. H., and V. R. Corkill, Coherent cooperation among communicating problem solvers, IEEE Transactions on Computers, 36(11), 1987.

      [28] Huhns M. N., Singh M. P., and Ksiezyk T.,(1997) Global Information management via Local Autonomous Agents, Morgan Kaufmann Publishers.

      [29] Khedr, A. M. Decomposable Algorithm for Computing k-Nearest Neighbors across PartitionedData, in: International Journal of Parallel, Emergent and Distributed Systems, vol. 31, no. 4, pp. 334-353, 2016.

      [30] Khedr, A. M. and R. Bhatnagar, Agents for Integrating distributed databases for Complex Computations, Turk. J. Elec. Eng. & Comp. Sci., 14, pp. 313-327, 2006.

      [31] Khedr, A. M. and R. Bhatnagar, New Algorithm for Clustering Distributed Data using k-means, Computing and Informatics, Vol. 33, pp. 1001-1022, 2014.

      [32] Khedr, A. M. and R. Mahmoud, Agents for Integrating Distributed Data for Function Computations, Computing and Informatics, Vol. 31, pp. 1101-1125, 2012.

      [33] Khedr, A. M. and Ahmed Salim, Decomposable Algorithms for Finding the Nearest Pair, J. Parallel Distrib. Comput., Vol. 68, pp. 902-912, 2008.


 

View

Download

Article ID: 16343
 
DOI: 10.14419/ijet.v7i3.13.16343




Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.