Vehicle voice enhancement application based on generative adversarial network
CLC number: TN912

    Abstract:

    Speech enhancement is important to the development of intelligent in-vehicle systems and the future automotive industry. To address the problem of the driver's speech being contaminated by noise while the vehicle is moving, a least squares generative adversarial network model based on an efficient channel attention mechanism is proposed. First, the attention mechanism is introduced into the generator network; it adaptively selects the one-dimensional convolution kernel size used to generate the channel weights, which brings a clear performance gain while reducing model complexity. Then the least squares loss function replaces the Sigmoid cross-entropy loss function, which speeds up convergence and avoids vanishing gradients. Finally, the model is optimized continuously through the adversarial game of the generative adversarial network, thereby achieving speech enhancement. Experiments show that, compared with the baseline method, the proposed method improves both speech quality and intelligibility: the perceptual evaluation of speech quality (PESQ) score increases by 3.79% on average and the short-time objective intelligibility (STOI) score increases by 4.76% on average, making it better suited to practical applications.
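
The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the kernel-size rule below follows the commonly cited efficient channel attention formulation (with assumed defaults `gamma=2`, `b=1`), and the losses follow the usual least squares GAN form with assumed target labels 1 for clean speech and 0 for generated speech. All function names are hypothetical, and the convolution weights, which would be learned in practice, are passed in as a plain list.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Pick an odd 1-D kernel size from the channel count.

    gamma and b are assumed defaults, not values taken from the paper.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca_gates(features, kernel_weights):
    """Compute one sigmoid gate per channel.

    features: list of channels, each a non-empty list of time samples.
    kernel_weights: odd-length list standing in for the learned 1-D kernel.
    """
    # Squeeze: global average pooling over the time axis of each channel.
    pooled = [sum(ch) / len(ch) for ch in features]
    k = len(kernel_weights)
    pad = k // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    gates = []
    for i in range(len(pooled)):
        # 1-D convolution across neighbouring channels (zero padded).
        s = sum(kernel_weights[j] * padded[i + j] for j in range(k))
        gates.append(1.0 / (1.0 + math.exp(-s)))
    return gates


def apply_gates(features, gates):
    """Rescale every channel by its attention gate."""
    return [[g * x for x in ch] for ch, g in zip(features, gates)]


def lsgan_d_loss(d_real, d_fake):
    """Least squares discriminator loss: real scores -> 1, fake scores -> 0."""
    real_term = sum((d - 1.0) ** 2 for d in d_real) / len(d_real)
    fake_term = sum(d ** 2 for d in d_fake) / len(d_fake)
    return 0.5 * (real_term + fake_term)


def lsgan_g_loss(d_fake):
    """Least squares generator loss: push fake scores towards 1."""
    return 0.5 * sum((d - 1.0) ** 2 for d in d_fake) / len(d_fake)
```

Because the least squares terms penalize the distance of the discriminator score from its target rather than passing it through a saturating sigmoid, the generator still receives a gradient for samples that the discriminator already classifies confidently, which is the usual argument for the faster convergence the abstract reports.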

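Cite this article

Shi Rui, Yang Lidong, Guo Yong, Niu Dawei, Zhang Dandan. Vehicle voice enhancement application based on generative adversarial network[J]. Foreign Electronic Measurement Technology (国外电子测量技术), 2023, 42(2): 151-156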

History
  • Published online: 2024-10-16