Multi-Objective Optimization of Unmanned Aerial Vehicle Assisted Internet of Things Based on Deep Reinforcement Learning

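Affiliations: 1. Nanjing University of Information Science and Technology; 2. Wuxi University; 3. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications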
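CLC number: TN929.5
Funding: Open Project of the State Key Laboratory of Networking and Switching Technology (Beijing University of Posts and Telecommunications) (SKLNST-2023-1-13)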

Multi-Objective Optimization of Unmanned Aerial Vehicle Assisted Internet of Things Based on Deep Reinforcement Learning



Abstract:

A UAV-assisted wireless-powered Internet of Things (IoT) network is an architecture in which unmanned aerial vehicles (UAVs) act as energy-transfer intermediaries, easing the power-supply constraints of IoT devices. To address multi-objective control policy learning in such networks, this study proposes a Multi-Objective Twin Delayed Deep Deterministic Policy Gradient (MOTD3) algorithm based on deep reinforcement learning. Subject to constraints on yaw angle, flight speed, and transmit power, MOTD3 jointly maximizes the total data rate and total harvested energy while minimizing energy consumption and hover time; the UAV also plans its path online as device demands change dynamically. Simulation results show that the proposed algorithm outperforms the Deep Deterministic Policy Gradient (DDPG) algorithm, the Advantage Actor-Critic (A2C) algorithm, and other control strategies in multi-objective optimization performance, and that it also converges well and remains stable. It further exhibits strong generalization, making it applicable to different practical communication scenarios.
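
To make the optimization target concrete, the following is a minimal sketch of one common way to fold four such objectives into a single reinforcement-learning reward: a weighted sum in which the quantities to be maximized add and the quantities to be minimized subtract, with the action clipped to its feasible range. The abstract does not say how MOTD3 combines its objectives or enforces its constraints, so the weighted-sum form, the clipping step, and every name, limit, and weight below (`StepOutcome`, `constrain_action`, `scalarized_reward`, `LIMITS`, `WEIGHTS`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Per-step quantities named as objectives in the abstract (units assumed)."""
    data_rate: float         # total data rate (to maximize)
    harvested_energy: float  # total energy harvested by devices (to maximize)
    energy_used: float       # UAV energy consumption (to minimize)
    hover_time: float        # UAV hover time (to minimize)


def clip(value: float, low: float, high: float) -> float:
    """Project a raw action component onto its feasible interval."""
    return max(low, min(high, value))


def constrain_action(yaw: float, speed: float, power: float, limits: dict) -> tuple:
    """Enforce yaw-angle, flight-speed and transmit-power bounds by clipping."""
    return (
        clip(yaw, *limits["yaw"]),
        clip(speed, *limits["speed"]),
        clip(power, *limits["power"]),
    )


def scalarized_reward(outcome: StepOutcome, weights: tuple) -> float:
    """Weighted-sum scalarization: benefits add, costs subtract."""
    w_rate, w_harvest, w_energy, w_hover = weights
    return (
        w_rate * outcome.data_rate
        + w_harvest * outcome.harvested_energy
        - w_energy * outcome.energy_used
        - w_hover * outcome.hover_time
    )


# Hypothetical limits and weights, not taken from the paper.
LIMITS = {"yaw": (-3.14, 3.14), "speed": (0.0, 20.0), "power": (0.0, 1.0)}
WEIGHTS = (1.0, 1.0, 0.5, 0.5)

# An out-of-range raw action is projected back into the feasible set.
action = constrain_action(yaw=4.0, speed=25.0, power=0.3, limits=LIMITS)
reward = scalarized_reward(StepOutcome(2.0, 1.0, 2.0, 4.0), WEIGHTS)
```

In a sketch like this the weights set the trade-off between communication, energy harvesting, and flight cost; a different weighting, or a vector-valued reward, would be an equally plausible reading of "multi-objective" here.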


History

Received: 2023-12-26
Revised: 2024-03-13
Accepted: 2024-03-13
