Low earth orbit satellite switching strategy based on multi-objective multi-agent reinforcement learning

CLC number: TN927

Fund Project: Supported by the National Natural Science Foundation of China (62061005), the Guangxi Natural Science Foundation (2022GXNSFBA035646), and the Guangxi Innovation-Driven Development Special Project (AA19254001)

    Abstract:

    To address uneven traffic demand distribution among ground users and excessive concurrent handovers in low earth orbit satellite communication systems (LSM), this paper proposes a low earth orbit satellite handover strategy based on multi-objective multi-agent collaborative deep reinforcement learning. The strategy takes ground cell user traffic demand satisfaction, handover delay, and user conflict as its optimization objectives and optimizes them with a multi-agent collaborative deep learning algorithm. Each agent is responsible only for the satellite handover strategy of the users in one cell, and the agents cooperate by sharing rewards, thereby achieving multi-objective optimization. Simulation results show that the proposed handover strategy achieves an average user traffic satisfaction of 73.1% and an average handover delay of 343 ms. Compared with heuristic algorithms, it better meets the traffic demand of ground cell users and balances the load of the satellite network.
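
    The shared-reward idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the weighted-sum scalarization, the weights `w_sat`, `w_delay`, `w_conf`, the delay normalization `delay_scale_ms`, and the names `CellOutcome` and `shared_reward` are all hypothetical. It only shows how three objectives (traffic satisfaction, handover delay, user conflict) could be folded into one team reward that every per-cell agent receives.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CellOutcome:
    """Hypothetical per-cell result of one handover decision step."""
    demand_mbps: float        # traffic requested by the users in the cell
    served_mbps: float        # traffic actually delivered by the chosen satellite
    handover_delay_ms: float  # delay incurred by the handover in this step
    conflict: bool            # True if the cell's choice collided with another cell's


def shared_reward(
    outcomes: Sequence[CellOutcome],
    w_sat: float = 1.0,            # assumed weight on traffic satisfaction
    w_delay: float = 0.5,          # assumed weight on handover delay
    w_conf: float = 0.5,           # assumed weight on user conflict
    delay_scale_ms: float = 1000.0,  # assumed normalization constant
) -> List[float]:
    """Return one identical team reward per agent (one agent per cell)."""
    n = len(outcomes)
    if n == 0:
        raise ValueError("at least one cell outcome is required")

    # Mean satisfaction: served / demanded, capped at 1; zero demand counts as satisfied.
    satisfaction = sum(
        min(o.served_mbps / o.demand_mbps, 1.0) if o.demand_mbps > 0 else 1.0
        for o in outcomes
    ) / n
    # Mean handover delay, scaled to a roughly unit range.
    delay = sum(o.handover_delay_ms for o in outcomes) / n / delay_scale_ms
    # Fraction of cells whose decision ended in a conflict.
    conflict = sum(1 for o in outcomes if o.conflict) / n

    team_reward = w_sat * satisfaction - w_delay * delay - w_conf * conflict
    # Sharing the reward: every agent is credited with the same team-level value.
    return [team_reward] * n


if __name__ == "__main__":
    step = [
        CellOutcome(demand_mbps=100.0, served_mbps=80.0, handover_delay_ms=200.0, conflict=False),
        CellOutcome(demand_mbps=50.0, served_mbps=50.0, handover_delay_ms=400.0, conflict=True),
    ]
    print(shared_reward(step))
```

    Because each agent is credited with the same value, an agent that grabs capacity at the expense of a neighboring cell lowers its own reward as well, which is one common way to encourage cooperation; the paper's actual reward design may differ.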

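Cite this article

LI Rui, YANG Qiaoli, ZHANG Xin'ao. Low earth orbit satellite switching strategy based on multi-objective multi-agent reinforcement learning[J]. Foreign Electronic Measurement Technology (国外电子测量技术), 2024, 43(3): 106-113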

History
  • Published online: 2024-06-12