An enhanced privacy preserving approach with enforcing policies for processing big data in spark framework

  • Authors

    • S. Revathy VIT
    • Dr. Arunkumar Thangavel VIT
    2018-08-25
    https://doi.org/10.14419/ijet.v7i3.10.15613
  • Big Data privacy, Enhanced Random Forest (ERF) Classification, Modified Incognito Anonymization based Privacy Preservation (MIA-PP), Improved FP-Growth (IFP-G) and Confidentiality.
  • Ensuring the privacy of big data stored in a cloud system is a demanding and critical task. Because big data involves very large volumes of information, security measures and rules are needed to assure its confidentiality. Earlier work has developed various techniques that aim to guarantee big data privacy through key generation, encryption, and anonymization mechanisms. However, these techniques suffer from increased time consumption, computational complexity, and error rate. The proposed work therefore aims to design an enhanced mechanism for secure big data storage. A user bank dataset is taken as the input and is protected from unauthorized users by guaranteeing both the privacy and the secrecy of the data. First, the raw dataset is preprocessed to improve data quality and correctness. Then, security policies (i.e., rules) that restrict access to the data are generated using an Improved FP-Growth (IFP-G) algorithm. Next, the data attributes are classified as sensitive or non-sensitive, based on the extracted features, using an Enhanced Random Forest (ERF) classification technique. Finally, the privacy of the users' personal information and other details is protected using a Modified Incognito Anonymization based Privacy Preservation (MIA-PP) algorithm. These enhanced mechanisms guarantee the security and confidentiality of the big data with reduced time consumption and increased accuracy. In the experimental evaluation, the results of the proposed privacy mechanism are analyzed and compared using different measures, and several existing anonymization and classification techniques are considered to demonstrate the improvement offered by the proposed technique.
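
    The anonymization step can be illustrated with a small sketch. The abstract does not describe how MIA-PP modifies Incognito, so the code below is not the authors' algorithm. It shows only the general idea behind Incognito-style k-anonymization: walk a lattice of generalization levels for the quasi-identifiers and keep the least-generalized combination in which every group contains at least k records. The toy records, the choice of age and zip code as quasi-identifiers, the generalization hierarchies, and all function names are assumptions made for illustration. The published Incognito algorithm also prunes the lattice, which this exhaustive walk does not.

```python
from collections import Counter
from itertools import product

# Hypothetical toy records: (age, zip_code, balance). Age and zip code are
# assumed to be quasi-identifiers; balance is the sensitive value and is
# left untouched.
RECORDS = [
    (34, "63210", 1200),
    (36, "63214", 560),
    (38, "63219", 8900),
    (52, "63305", 300),
    (55, "63307", 4100),
    (58, "63301", 770),
]

# Highest generalization level per quasi-identifier: (age, zip_code).
MAX_LEVELS = (2, 5)


def generalize_age(age, level):
    """Level 0: exact age, level 1: ten-year band, level 2: suppressed."""
    if level == 0:
        return str(age)
    if level == 1:
        low = age // 10 * 10
        return f"{low}-{low + 9}"
    return "*"


def generalize_zip(zip_code, level):
    """Replace the last `level` digits of the zip code with '*'."""
    if level == 0:
        return zip_code
    return zip_code[:-level] + "*" * level


def apply_levels(records, levels):
    """Generalize every record with the given (age_level, zip_level)."""
    age_level, zip_level = levels
    return [
        (generalize_age(age, age_level), generalize_zip(zip_code, zip_level), value)
        for age, zip_code, value in records
    ]


def is_k_anonymous(rows, k):
    """True if every quasi-identifier combination occurs at least k times."""
    groups = Counter((age, zip_code) for age, zip_code, _ in rows)
    return all(count >= k for count in groups.values())


def minimal_generalization(records, k):
    """Return the first lattice node, ordered by total generalization,
    whose output is k-anonymous, or None if no node qualifies."""
    lattice = sorted(
        product(range(MAX_LEVELS[0] + 1), range(MAX_LEVELS[1] + 1)),
        key=lambda node: (sum(node), node),
    )
    for levels in lattice:
        rows = apply_levels(records, levels)
        if is_k_anonymous(rows, k):
            return levels, rows
    return None


if __name__ == "__main__":
    result = minimal_generalization(RECORDS, k=3)
    if result is not None:
        levels, rows = result
        print("generalization levels (age, zip):", levels)
        for row in rows:
            print(row)
```

    Ordering the lattice by the sum of the levels is one simple way to prefer less generalization and so retain more detail in the released data; other information-loss measures could be substituted.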

  • How to Cite

    Revathy, S., & Arunkumar Thangavel, D. (2018). An enhanced privacy preserving approach with enforcing policies for processing big data in spark framework. International Journal of Engineering & Technology, 7(4), 7086-7093. https://doi.org/10.14419/ijet.v7i3.10.15613