MRC-PBM: A Nested Named Entity Recognition Method for Chinese Electronic Medical Records

CLC number:

TP391.1

Funding:

Supported by the National Key Research and Development Program of China (2016YFC0802500)



    Abstract:

    Entities in Chinese electronic medical records (EMRs) contain a large amount of medical vocabulary and are frequently nested. When recognizing nested entities, models often locate the target entity incompletely or inaccurately. To address this problem, a nested named entity recognition model for Chinese EMRs based on machine reading comprehension (MRC) is proposed: MRC-PBM (machine reading comprehension-position information biaffine and MLP). The model recasts named entity recognition (NER) as an MRC task. The Chinese EMR text is concatenated with a predefined query statement to form the input, and the medical-domain pre-trained model MC_BERT produces the word vectors. A bidirectional long short-term memory network (BiLSTM) then captures bidirectional feature information, and a multi-granularity dilated convolution model captures information between words, yielding the corresponding feature vectors. Finally, the Hybrid-PBM predictor predicts the entities. Experiments are conducted on nested and flat NER datasets. The results show that the model outperforms other mainstream neural network models on a diabetes corpus and on public medical datasets, with F1 scores 1.21% to 5.80% higher than those of the baseline models.
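The MRC formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the query wording, the entity type names, the query-first ordering, the character-level tokens, and the example phrase are all assumptions made for illustration, and the simple nearest-end pairing below stands in for the Hybrid-PBM predictor, whose details are not given here. The point it shows is that each entity type gets its own query, so overlapping (nested) spans of different types are labeled in separate examples and do not conflict.

```python
from dataclasses import dataclass

# Hypothetical queries, one per entity type (the paper's actual queries are not given here).
QUERIES = {
    "body": "找出文本中的身体部位",   # "find the body parts in the text"
    "disease": "找出文本中的疾病",     # "find the diseases in the text"
}


@dataclass
class MRCExample:
    entity_type: str
    tokens: list        # [CLS] query [SEP] text [SEP], one token per character
    start_labels: list  # 1 where an entity of this type starts
    end_labels: list    # 1 where an entity of this type ends
    offset: int         # index of the first text character inside tokens


def build_mrc_examples(text, entities):
    """Build one query+text example per entity type.

    entities: list of (start, end_inclusive, type) character spans in text.
    """
    examples = []
    for etype, query in QUERIES.items():
        tokens = ["[CLS]", *query, "[SEP]", *text, "[SEP]"]
        offset = len(query) + 2  # skip [CLS], the query characters, and [SEP]
        start = [0] * len(tokens)
        end = [0] * len(tokens)
        for s, e, t in entities:
            if t == etype:
                start[offset + s] = 1
                end[offset + e] = 1
        examples.append(MRCExample(etype, tokens, start, end, offset))
    return examples


def decode_spans(example):
    """Pair each start with the nearest end at or after it (a simplification)."""
    ends = [i for i, v in enumerate(example.end_labels) if v]
    spans = []
    for i, v in enumerate(example.start_labels):
        if not v:
            continue
        for j in ends:
            if j >= i:
                spans.append((i - example.offset, j - example.offset))
                break
    return spans


# Hypothetical nested example: "右肺上叶" (body part) lies inside "右肺上叶结节" (disease).
text = "右肺上叶结节"
entities = [(0, 3, "body"), (0, 5, "disease")]
examples = build_mrc_examples(text, entities)
for ex in examples:
    print(ex.entity_type, decode_spans(ex))
```

In a full pipeline, the token sequence of each example would be encoded by a pre-trained model such as MC_BERT, and the start and end labels would serve as training targets for the span predictor.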

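Cite this article

周佳伦, 李琳宇, 马洪彬, 姜艳静. MRC-PBM: A Nested Named Entity Recognition Method for Chinese Electronic Medical Records [J]. 国外电子测量技术 (Foreign Electronic Measurement Technology), 2024, 43(1): 159-165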

History
  • Published online: 2024-05-28