Data mining based investigation of the impact of imbalanced dataset over fractured zone detection

  • Authors

    • Haleh Azizi School of Electrical Engineering and Computer Sciences, University of North Dakota, Grand Forks, ND, 58201, USA
    • Hassan Reza School of Electrical Engineering and Computer Sciences, University of North Dakota, Grand Forks, ND, 58201, USA
    2021-06-11
    https://doi.org/10.14419/ijet.v10i2.31604
  • Keywords: Accuracy, Classifier, Fractured Reservoirs, Random Forest, Support Vector Machine.
  • Several recent studies have sought to discriminate between fractured zones (FZs) and non-fractured zones (NFZs) in oil wells by applying data mining techniques to petrophysical logs (PLs), with generally valuable results. Identifying the two classes remains difficult, however, because imbalanced datasets are usually analyzed without first being balanced. We studied the importance of using balanced data to detect fractured zones from PLs. We applied Random Forest and Support Vector Machine classifiers to PLs from eight oil wells drilled into a fractured carbonate reservoir, using both imbalanced and balanced datasets, and validated the results against image logs. A significant difference between accuracy and precision indicates imbalanced data in which fractured zones form the minority class. The results indicate that accuracy is similar for imbalanced and balanced datasets, but precision improves significantly with balancing, regardless of how low or high the calculated indices are.
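The relationship between accuracy and precision described in the abstract can be illustrated with a small arithmetic sketch. It does not use the authors' data or classifiers. The sample counts, the 0.8 detection rate, and the 0.1 false-alarm rate are hypothetical values chosen only to show how precision for the FZ class can shift with class proportions while accuracy stays roughly the same. All function names are illustrative.

```python
# Illustrative only: hypothetical counts and rates, not results from the paper.

def confusion_counts(n_fz, n_nfz, recall_fz, false_alarm_rate):
    """Build confusion-matrix counts with FZ treated as the positive class."""
    tp = round(n_fz * recall_fz)          # fractured samples correctly flagged
    fn = n_fz - tp                        # fractured samples missed
    fp = round(n_nfz * false_alarm_rate)  # non-fractured samples wrongly flagged
    tn = n_nfz - fp                       # non-fractured samples correctly passed
    return tp, fn, fp, tn


def accuracy(tp, fn, fp, tn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fn + fp + tn)


def precision(tp, fp):
    """Fraction of samples flagged as FZ that really are FZ."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def summarize(n_fz, n_nfz, recall_fz=0.8, false_alarm_rate=0.1):
    """Return accuracy and FZ precision for a given class mix (assumed rates)."""
    tp, fn, fp, tn = confusion_counts(n_fz, n_nfz, recall_fz, false_alarm_rate)
    return {"accuracy": accuracy(tp, fn, fp, tn), "precision": precision(tp, fp)}


# Hypothetical imbalanced mix: 50 FZ samples against 950 NFZ samples.
imbalanced = summarize(n_fz=50, n_nfz=950)
# Hypothetical balanced mix: 500 samples of each class.
balanced = summarize(n_fz=500, n_nfz=500)

for name, scores in (("imbalanced", imbalanced), ("balanced", balanced)):
    print(f"{name}: accuracy={scores['accuracy']:.3f}, "
          f"precision={scores['precision']:.3f}")
```

Under these assumed rates, the two accuracy values are close, while the minority-class precision differs widely. The gap between accuracy and precision in the imbalanced case is the kind of signal the abstract associates with FZs being the minority class.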

     

     

  • How to Cite

    Azizi, H., & Reza, H. (2021). Data mining based investigation of the impact of imbalanced dataset over fractured zone detection. International Journal of Engineering & Technology, 10(2), 116-123. https://doi.org/10.14419/ijet.v10i2.31604