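An Attention-Guided Dual-Granularity Cross-Modal Medical Feature Learning Framework (paper in the journal 计算机工程与科学, Computer Engineering & Science)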

Authors: 陈欣然, 刘宁, 闫中敏, 刘磊, 崔立真

Keywords:

Abstract: Deep learning has achieved notable results in medical image diagnosis, and models based on deep neural networks can effectively assist physicians in decision-making. However, as model parameter counts grow, and because labels for high-quality medical images must be produced manually by professional physicians, large models in the medical domain increasingly face data scarcity. One solution is to use the medical reports paired with medical images to guide training, which requires interaction between the two modalities. General-domain cross-modal alignment methods, however, fail to capture fine detail and are not fully applicable to the medical domain. To address this, an attention-guided dual-granularity cross-modal medical representation learning framework, ADCRL, is proposed, which aligns medical images and reports at both coarse and fine granularity. ADCRL extracts features from medical images and medical reports at two granularities, uses an attention-guided module to select the image regions likely to be of interest for medical tasks and to remove noisy regions, and aligns the two modalities at both granularities through contrastive-learning proxy tasks. Trained under an unsupervised paradigm, ADCRL learns the global and detailed semantics of both modalities and performs well on downstream tasks using only limited annotated data. The main contributions are a fine-grained feature selection method and a dual-granularity cross-modal feature learning framework, which is pretrained and validated on public medical datasets.
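The abstract names three mechanisms: attention-guided region selection, coarse-grained alignment, and fine-grained alignment through contrastive proxy tasks. It does not give their formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes selection is a plain top-k over attention scores, that each proxy task is a symmetric InfoNCE loss, and that fine-grained features are mean-pooled before being contrasted. All function names, the temperature, `k`, and `weight` are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def select_regions(region_feats, attn_scores, k):
    """Keep the k regions with the highest attention score, in original order.

    Assumption: attention-guided selection is modeled as plain top-k;
    everything outside the top k is treated as noise and dropped.
    """
    ranked = sorted(range(len(attn_scores)),
                    key=lambda i: attn_scores[i], reverse=True)
    keep = sorted(ranked[:k])
    return [region_feats[i] for i in keep]


def info_nce(image_embs, text_embs, temperature=0.1):
    """Symmetric InfoNCE; image_embs[i] and text_embs[i] are a matched pair."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    logits = [[dot(imgs[i], txts[j]) / temperature for j in range(n)]
              for i in range(n)]

    def cross_entropy_diag(rows):
        # Cross-entropy where the correct class of row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(cols))


def dual_granularity_loss(global_img, global_txt, regions, attn,
                          token_feats, k=2, weight=1.0):
    """Coarse loss on global embeddings plus a weighted fine-grained loss.

    Assumption: fine-grained image features are the mean of the selected
    regions, and fine-grained text features are the mean of token features.
    """
    coarse = info_nce(global_img, global_txt)
    fine_img = [mean_pool(select_regions(r, a, k))
                for r, a in zip(regions, attn)]
    fine_txt = [mean_pool(t) for t in token_feats]
    return coarse + weight * info_nce(fine_img, fine_txt)
```

In this sketch a batch whose image and report embeddings are correctly paired yields a lower loss than the same batch with the reports shuffled, which is the signal a contrastive proxy task uses to pull matched pairs together without labels.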

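Note: At the copyright holder's request, the full text cannot be made public. To obtain the full text, please contact the journal's editorial office.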
