SAC with State Abstraction Randomized Scaled Offloading Strategy for Edge Computational Offloading

XIAN Xuehui; DONG Ze

doi:10.13976/j.cnki.xk.2024.0344

XIAN Xuehui, DONG Ze. SAC with State Abstraction Randomized Scaled Offloading Strategy for Edge Computational Offloading[J]. INFORMATION AND CONTROL. DOI: 10.13976/j.cnki.xk.2024.0344

Abstract

To improve mobile edge computing offloading in terms of computation delay, energy consumption, and algorithmic adaptability, this paper proposes a proportional (scaled) offloading strategy for edge computing based on soft actor-critic with state abstraction (SACSA). A task model and a communication model are constructed for an end-edge cooperative architecture. The offloading model is improved with a novel action function for proportional offloading, and a negative punishment mechanism is incorporated into the reward function so that the model represents the real environment more faithfully. This gives deep reinforcement learning algorithms a more reasonable optimization target when they solve for the optimal strategy through cumulative reward. To address the sparse-reward problem that the soft actor-critic (SAC) algorithm faces in the offloading model, a method for computing intrinsic rewards with state abstraction is devised. Simulation experiments verify the effectiveness of the proportional-offloading action function and of the reward function with the negative punishment mechanism. In a comparative analysis against SAC, Neural Episodic Control with State Abstraction (NECSA), Proximal Policy Optimization (PPO), and Twin Delayed Deep Deterministic Policy Gradient (TD3), SACSA outperforms all four. Across diverse operational scenarios, SACSA reduces task latency by 1.64% to 85.35%, raises the task completion rate by 0.55% to 69.64%, and raises the episode reward by 0.53% to 75.8%.