LASSO REGRESSION AND AN APPLICATION IN BREAST CANCER DATA ANALYSIS

  • Nông Quỳnh Vân, Trần Đình Hùng
Keywords: Regression; Ordinary least squares; LASSO; L1 regularization; Penalized regression; Breast cancer

Abstract

The LASSO is a regularized regression method proposed by Tibshirani in 1996. Its goal is to select variables and estimate parameters in a linear regression model simultaneously by shrinking some coefficients exactly to zero. The LASSO is particularly useful for analyzing microarray gene data, in which the number of predictors (genes) is much larger than the number of observations (patients). In this paper, we give a brief summary of the LASSO and apply the method to gene expression data from breast cancer. The aim is to assess gene interactions in breast cancer microarray data. The results show that the LASSO performs relatively well in analyzing gene expression levels and identifies genes related to the breast cancer gene BRCA1, including NBR2, AASDH, KIAA2013, VPS25, NBR1, SEC22C, RPL27, CBLN3, KHDRBS1, and XRCC2. Notably, NBR2 is adjacent to BRCA1 on chromosome 17, and the two genes share the same promoter region. Thus, breast cancer prognosis based on regression can help us better understand the mechanisms underlying the occurrence of breast cancer in young women.
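
To illustrate how the L1 penalty sets coefficients exactly to zero when predictors outnumber observations, the following is a minimal sketch of LASSO fitted by cyclic coordinate descent in plain Python. It is not the authors' implementation and does not use their data: the objective scaling (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1, the function names, the penalty value, and the synthetic data (20 "patients", 50 "genes") are all assumptions made for illustration.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator: shrink z toward zero by gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Approximately minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    X is a list of n rows with p entries each, y is a list of n responses.
    No intercept is fitted; inputs are assumed to be centered (assumption).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero column carries no information
            # Correlation of column j with the residual that excludes j's own fit
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Synthetic illustration (hypothetical data): only the first two of the
# 50 predictors actually influence the response.
random.seed(0)
n_samples, n_genes = 20, 50
X_demo = [[random.gauss(0.0, 1.0) for _ in range(n_genes)]
          for _ in range(n_samples)]
y_demo = [3.0 * row[0] - 2.0 * row[1] + random.gauss(0.0, 0.1)
          for row in X_demo]

beta_demo = lasso_coordinate_descent(X_demo, y_demo, lam=0.5)
selected = [j for j, b in enumerate(beta_demo) if b != 0.0]
print("indices of predictors kept by the LASSO:", selected)
```

Larger values of lam remove more predictors from the model; once lam is at least as large as the largest absolute value of (1/n) * X^T y, every coefficient is zero. In practice lam is usually chosen by cross-validation, which this sketch omits.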

Published
2022-05-31
Section
NATURAL SCIENCE – ENGINEERING – TECHNOLOGY