Gene selection in Cox regression model based on a new adaptive penalized method

  • Authors

    • Oday Isam Alskal University of Mosul
    • Zakariya Yahya Algamal University of Mosul
    2020-05-15
    https://doi.org/10.14419/ijasp.v8i1.30566
  • Keywords: Cox Regression Model, Penalized Method, LASSO, Gene Selection.
  • Abstract

    A common issue with high-dimensional gene expression data in survival analysis is that many genes may not be relevant to the disease under study. Gene selection has proved to be an effective way to improve the results of many methods. The Cox proportional hazards regression model is the most popular regression model for censored survival data. In this paper, an adaptive penalized Cox proportional hazards regression model is proposed, with the aim of identifying relevant genes and providing high classification accuracy, by combining the Cox proportional hazards regression model with the weighted least absolute shrinkage and selection operator (LASSO) method. Experimental results show that the proposed method significantly outperforms two competitor methods in terms of the area under the curve and the number of selected genes.
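
    The abstract does not state the objective function, so the following is only a minimal sketch of the generic form such a method usually takes: the negative Cox log partial likelihood plus a LASSO penalty in which each coefficient has its own weight. The function names, the unnormalized likelihood, the Breslow-style risk sets, and the way the weights are supplied are all assumptions made for illustration; the paper's actual weighting scheme is not given here.

```python
import math

def neg_log_partial_likelihood(times, events, X, beta):
    """Negative Cox log partial likelihood (Breslow-style risk sets).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    X      : list of covariate rows (e.g. gene expression values)
    beta   : regression coefficients, one per covariate
    """
    # Linear predictor x_i' beta for every subject.
    eta = [sum(x_j * b_j for x_j, b_j in zip(row, beta)) for row in X]
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects enter only through the risk sets
        # Risk set at t_i: everyone whose observed time is at least t_i.
        risk_sum = sum(math.exp(eta[k]) for k, t_k in enumerate(times) if t_k >= t_i)
        total += eta[i] - math.log(risk_sum)
    return -total

def weighted_lasso_objective(times, events, X, beta, lam, weights):
    """Penalized objective: -log PL(beta) + lam * sum_j w_j * |beta_j|.

    `weights` is a hypothetical input; an adaptive method would derive it
    from an initial estimate so that weak genes are penalized more heavily.
    """
    penalty = lam * sum(w * abs(b) for w, b in zip(weights, beta))
    return neg_log_partial_likelihood(times, events, X, beta) + penalty
```

    Setting every weight to 1 recovers the ordinary LASSO-penalized Cox objective; larger weights push the corresponding coefficients toward exactly zero, which is what removes genes from the model.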


  • Received date: 2020-03-27

    Accepted date: 2020-05-07

    Published date: 2020-05-15