With the rapid development of big data mining in the medical industry, clinical precision therapy has become a research hotspot in medical big data. In this study, a binary classification algorithm for predicting breast tumour type was constructed on the breast cancer dataset from the UCI database. Random oversampling was used to handle the class imbalance in the dataset, least absolute shrinkage and selection operator (Lasso) regression and sequential forward selection (SFS) were used for feature selection, and the resulting classifiers were compared by classification accuracy. The results showed that a random forest model built on six selected features achieved the highest classification accuracy (97.07%), an improvement over the algorithm without feature selection, and could potentially provide new ideas for breast cancer detection.
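The random oversampling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common form of the technique (duplicating randomly chosen minority-class samples until every class matches the largest one); the function name `random_oversample`, the seed, and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement to fill the gap; empty for the largest class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (hypothetical): 5 benign ("B") and 2 malignant ("M") samples.
rows = [[1.0], [1.1], [0.9], [1.2], [1.0], [3.0], [3.2]]
labels = ["B", "B", "B", "B", "B", "M", "M"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
counts = Counter(balanced_labels)
print(counts)  # both classes now have 5 samples
```

In practice, oversampling is usually applied to the training split only, so that duplicated samples do not leak into the data used to measure accuracy.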
冯欣, 张航, 辛瑞昊. 基于Lasso特征选择乳腺癌二分类算法研究[J]. 吉林化工学院学报, 2023, 40(1): 23-28.
FENG Xin, ZHANG Hang, XIN Ruihao. A Study on the Lasso Feature-based Selection Algorithm for Breast Cancer Binary Classification. Journal of Jilin Institute of Chemical Technology, 2023, 40(1): 23-28.