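Identification of background and anomaly information via compositional data theory and unsupervised K-means clustering: a case study of the Zhaojikou Pb-Zn ore deposit, Anhui Province
Authors: Liu Yanpeng, Zhu Lixin, Ma Shengming, Duan Jilin, Gong Qiuli
Funding:

This work was jointly supported by the Youth Science Fund of the National Natural Science Foundation of China (Grant No. 41902071), the International (Regional) Cooperation and Exchange Program (Grant No. 42011530173), and the Doctoral Scientific Research Startup Fund of East China University of Technology (Grant No. DHBK2019313).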
    Abstract:

    Geochemical data are an important part of applied geochemical research and the basic product of geochemical exploration. Exploration geochemical data are mostly expressed as element mass percentage concentrations (hereafter "concentration"), which makes them typical compositional data. They convey the relative mass contribution of a part to the whole, not the absolute change in mass. Concentration data are distributed in the simplex rather than in the whole of Euclidean space. Applying an appropriate log-ratio transformation before further processing improves the structure and information content of compositional data. In this paper, Pb concentrations in soil from the Zhaojikou Pb-Zn deposit, Anhui Province, are taken as a case study. Log-ratio transformations were used to optimize the structure of the Pb concentration data and improve the expression of relative information. The unsupervised K-means clustering method was then used to identify background and anomaly information from the distances of the log-ratio transformed data to the cluster centroids. Finally, the background and anomalies identified by K-means clustering were compared with those from the iterative 2σ (two-standard-deviation) method and the concentration-area fractal method to evaluate its performance. The results show that: ① concentration data express relative rather than absolute mass information, so comparing concentrations between samples does not reveal which sample contains more mass of the element; ② the log-ratio method effectively improves the structure and information expression of concentration data; ③ the K-means clustering method effectively identifies background and anomaly information in log-ratio transformed data; its performance is similar to that of the concentration-area fractal method and better than that of the iterative 2σ method.
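The workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which log-ratio transformation was used, so the sketch assumes the simplest one for a single element, the log-ratio of Pb against the remainder of the sample, ln(x / (1 - x)). The function names (`logit_transform`, `kmeans_1d`, `iterative_2sigma_threshold`), the two-cluster setting, and the synthetic Pb values are all hypothetical and chosen only for illustration.

```python
import numpy as np

def logit_transform(conc_fraction):
    """Log-ratio of a part against the remainder: ln(x / (1 - x)).
    Assumed transform; the abstract does not name the one actually used."""
    x = np.asarray(conc_fraction, dtype=float)
    if np.any((x <= 0) | (x >= 1)):
        raise ValueError("concentrations must be mass fractions strictly between 0 and 1")
    return np.log(x / (1.0 - x))

def kmeans_1d(values, k=2, n_iter=100):
    """Plain Lloyd's algorithm on 1-D data; returns (labels, centroids)."""
    v = np.asarray(values, dtype=float)
    # Deterministic start: spread the initial centroids over the data range.
    centroids = np.quantile(v, np.linspace(0.0, 1.0, k))
    for _ in range(n_iter):
        # Assign each sample to its nearest centroid.
        labels = np.argmin(np.abs(v[:, None] - centroids[None, :]), axis=1)
        # Move each centroid to the mean of its members (keep it if empty).
        new_centroids = np.array([
            v[labels == j].mean() if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

def iterative_2sigma_threshold(values, max_iter=100):
    """Comparison method: repeatedly drop values outside mean +/- 2 sd,
    then report mean + 2 sd of what remains as the anomaly threshold."""
    v = np.asarray(values, dtype=float)
    for _ in range(max_iter):
        mean, sd = v.mean(), v.std(ddof=1)
        keep = np.abs(v - mean) <= 2.0 * sd
        if keep.all():
            break
        v = v[keep]
    return v.mean() + 2.0 * v.std(ddof=1)

# Synthetic Pb values in ppm (hypothetical, not data from the paper):
# 200 background samples followed by 20 anomalous samples.
rng = np.random.default_rng(0)
background_ppm = rng.lognormal(mean=np.log(30.0), sigma=0.3, size=200)
anomaly_ppm = rng.lognormal(mean=np.log(600.0), sigma=0.3, size=20)
pb_ppm = np.concatenate([background_ppm, anomaly_ppm])

z = logit_transform(pb_ppm * 1e-6)            # ppm -> mass fraction -> log-ratio
labels, centroids = kmeans_1d(z, k=2)
anomaly_cluster = int(np.argmax(centroids))   # cluster with the higher centroid
is_anomaly = labels == anomaly_cluster
threshold = iterative_2sigma_threshold(z)

print("samples flagged as anomalous by K-means:", int(is_anomaly.sum()))
print("iterative 2-sigma threshold (log-ratio units):", float(threshold))
```

In this sketch the cluster with the higher centroid is treated as the anomaly population. Real survey data would rarely separate as cleanly as the synthetic example, and the number of clusters would need to be justified for the data at hand.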

Cite this article

Liu Yanpeng, Zhu Lixin, Ma Shengming, Duan Jilin, Gong Qiuli. 2022. Identification of background and anomaly information via compositional data theory and unsupervised K-means clustering: a case study of Zhaojikou Pb-Zn ore deposit, Anhui Province[J]. Acta Geologica Sinica, 96(11): 4038-4055.

History
  • Received: 2022-07-25
  • Last revised: 2022-09-22
  • Accepted: 2022-10-08
  • Published online: 2022-11-07