1. Jilin Chemical Industry Hospital, Jilin City, Jilin Province 132022, China; 2. Jilin University of Chemical Technology, Jilin City, Jilin Province 132022, China
Abstract: To address the dual challenges of high-dimensional feature redundancy and class imbalance in breast cancer risk prediction, this study proposes CARE-Net, a new framework based on convolutional neural networks that uses SMOTEENN resampling to rebalance the training data. First, unsupervised autoencoders compress the mRNA, miRNA, and DNA methylation data separately to learn discriminative low-dimensional embeddings. Second, the multi-omics features are integrated through an early fusion strategy and screened by correlation threshold analysis to eliminate redundant features. To address class imbalance, SMOTEENN hybrid sampling is applied to the data before model training. Finally, classification is performed on the optimized feature space. Experimental results show that CARE-Net outperforms existing benchmark methods on key metrics such as accuracy, F1 score, and AUC. Compared with classic machine learning methods, it is more robust in eliminating feature redundancy and handling class imbalance in multi-omics data, and it significantly improves the generalization ability of the risk prediction model.
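The early fusion and correlation-threshold screening steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the threshold, the correlation measure, or the screening order, so the Pearson coefficient, the 0.9 cutoff, the greedy keep-first rule, the function names, and the toy data below are all assumptions for illustration and not the authors' implementation. The autoencoder, SMOTEENN, and CNN stages are omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def early_fusion(*blocks):
    """Concatenate the per-sample embeddings of each omics block into one row per sample."""
    return [sum((list(row) for row in rows), []) for rows in zip(*blocks)]


def drop_correlated(fused, threshold=0.9):
    """Greedy screening (assumed rule): keep a column only if its absolute
    correlation with every already-kept column is at most `threshold`."""
    columns = list(zip(*fused))
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) <= threshold for k in kept):
            kept.append(j)
    return kept, [[row[j] for j in kept] for row in fused]


# Hypothetical low-dimensional embeddings for four samples (toy values).
mrna = [[1.0, 0.2], [2.0, 0.1], [3.0, 0.4], [4.0, 0.3]]
mirna = [[2.1], [3.9], [6.0], [8.1]]   # nearly proportional to the first mRNA column
meth = [[0.5], [0.1], [0.9], [0.2]]

fused = early_fusion(mrna, mirna, meth)      # 4 samples x 4 fused features
kept, reduced = drop_correlated(fused, 0.9)  # the miRNA column is dropped as redundant
```

In this toy case the single miRNA feature is removed because it is almost perfectly correlated with the first mRNA feature, leaving three of the four fused columns for the later resampling and classification stages.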
Zhang Jun, Bai Sun-dan, You Tao, Han Bo, Wei Zi-xuan, Xin Rui-hao, Feng Xin. Breast Cancer Risk Prediction Model Based on Multi-Omics Fusion. Journal of Jilin Institute of Chemical Technology, 2025, 42(9): 19-24.