Journal of Jilin Institute of Chemical Technology, 2021, 38(9): 107-111     https://doi.org/10.16039/j.cnki.cn22-1249.2021.09.020
Design of Clustering Statistics Method for High-dimensional Sparse Data based on Fuzzy Mathematics
ZHOU Yanru
School of Mathematics and Statistics, Chaohu University, Chaohu, Anhui 238000, China
Abstract:

Traditional clustering statistics methods are suited only to low-dimensional data. This study therefore designs a clustering statistics method for high-dimensional sparse data based on fuzzy mathematics, with the aim of improving clustering results on such data. The method builds on the fuzzy C-means clustering algorithm. First, the initial cluster centers are optimized to address the local-optimum problem and to shorten clustering time. A weighting mechanism is then introduced so that the method applies to high-dimensional sparse data. On this basis, cosine distance replaces the original Euclidean distance to further improve clustering of high-dimensional sparse data. Experiments show that the method clusters well across different data dimensions. When the data dimension is low, a block ratio of 10% gives the best clustering result; when the data dimension is high, a block ratio of 40% gives the best result. At different sparsity levels, the method achieves both a high hit rate and high clustering efficiency.

Key words:  fuzzy mathematics    high-dimensional sparse data    clustering statistics    fuzzy C-means    clustering center    cosine distance
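
The abstract names three ingredients: fuzzy C-means, a better choice of initial cluster centers, and cosine distance in place of Euclidean distance. The sketch below illustrates how these pieces can fit together. It is not the author's implementation, since the page gives no formulas. The function names, the farthest-first initialization, and the parameter defaults are all assumptions made for illustration, and the paper's weighting mechanism and block ratio are omitted because their definitions are not stated here.

```python
import numpy as np


def cosine_distance(X, centers, eps=1e-12):
    """Pairwise cosine distance 1 - cos(x, c), shape (n_samples, n_clusters)."""
    Xn = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), eps)
    Cn = centers / np.maximum(np.linalg.norm(centers, axis=1, keepdims=True), eps)
    return np.clip(1.0 - Xn @ Cn.T, 0.0, 2.0)


def farthest_first_init(Xn, n_clusters):
    """Assumed stand-in for the paper's optimized initialization:
    start at sample 0, then repeatedly add the sample farthest from
    all centers chosen so far."""
    idx = [0]
    d = cosine_distance(Xn, Xn[idx])[:, 0]
    for _ in range(1, n_clusters):
        nxt = int(np.argmax(d))
        idx.append(nxt)
        d = np.minimum(d, cosine_distance(Xn, Xn[[nxt]])[:, 0])
    return Xn[idx].copy()


def fuzzy_c_means_cosine(X, n_clusters, m=2.0, max_iter=100, tol=1e-6):
    """Textbook fuzzy C-means loop with cosine distance as the dissimilarity.
    Returns the membership matrix U (rows sum to 1) and the cluster centers."""
    X = np.asarray(X, dtype=float)
    # Work on unit-length rows so that only direction matters.
    Xn = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    centers = farthest_first_init(Xn, n_clusters)
    U = np.zeros((len(Xn), n_clusters))
    for _ in range(max_iter):
        d = np.maximum(cosine_distance(Xn, centers), 1e-12)
        # Membership update: u_ik proportional to d_ik^(-1/(m-1)).
        inv = d ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        # Center update: average of the unit vectors weighted by u_ik^m.
        W = U_new ** m
        centers = (W.T @ Xn) / W.sum(axis=0)[:, None]
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers


# Toy sparse data: two groups that occupy disjoint sets of dimensions.
X = np.array([
    [1.0, 0.1, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0, 0.0, 0.0],
    [1.0, 0.1, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.1, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.1],
    [0.0, 0.0, 0.0, 1.0, 0.1, 0.1],
])
U, centers = fuzzy_c_means_cosine(X, n_clusters=2)
labels = U.argmax(axis=1)  # hard labels derived from the fuzzy memberships
print(labels)
```

Cosine distance compares directions rather than magnitudes, which is one common reason to prefer it for sparse vectors in which most coordinates are zero. A deterministic initialization such as the one assumed here also avoids the case where randomly chosen starting centers coincide and the memberships never separate.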
Publication date:  2021-09-25      Release date:  2021-09-25      Issue publication date:  2021-09-25
ZTFLH:  TP391  
Cite this article:
ZHOU Yanru. Design of Clustering Statistics Method for High-dimensional Sparse Data based on Fuzzy Mathematics [J]. Journal of Jilin Institute of Chemical Technology, 2021, 38(9): 107-111.
Link to this article:
http://xuebao.jlict.edu.cn/CN/10.16039/j.cnki.cn22-1249.2021.09.020  or  http://xuebao.jlict.edu.cn/CN/Y2021/V38/I9/107