TANG Rui, HE Zuhan, ZHANG Ruizhi, YUE Shibo, PANG Chuanlin, HE Jinpu. Hybrid Offline-online Resource Allocation Mechanism for D2D-NOMA Systems[J]. INFORMATION AND CONTROL, 2023, 52(5): 574-587. DOI: 10.13976/j.cnki.xk.2023.2307

Hybrid Offline-online Resource Allocation Mechanism for D2D-NOMA Systems

More Information
  • Received Date: July 06, 2022
  • Revised Date: November 15, 2022
  • Accepted Date: November 14, 2022
  • Available Online: October 22, 2023
  • Abstract: A device-to-device (D2D) communication-empowered non-orthogonal multiple access (NOMA) system involves complex co-channel interference. In this study, we jointly optimize mode selection and power control to maximize the sum proportional bit rate, which balances spectral efficiency and user fairness. We propose a hybrid offline-online mechanism to solve the resulting mixed-integer non-convex optimization problem. During offline training, a variable transformation equivalently converts the power-control subproblem into a convex optimization problem, whose global optimum can be obtained within milliseconds by a standard convex optimization toolbox. Based on the resulting optima, a deep Q-learning algorithm then learns the mapping from the current mode selection scheme and channel state information to the optimal mode-adjustment action. The trained resource allocation mechanism is suitable for online implementation because it involves only simple algebraic operations and a single convex optimization problem. Simulation results show that the proposed mechanism strikes a good balance between performance and operation time: it reduces the average operation time by 94.54% while incurring approximately 10% performance loss relative to the global optimum obtained by exhaustive search.
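The offline training loop described above, learning a mapping from the current mode-selection state to a mode-adjustment action, with rewards supplied by the power-control subproblem, can be sketched with tabular Q-learning. Everything below is a hypothetical toy: the pair count, the weights, and the stand-in reward function are illustrative assumptions, and the paper itself uses a deep Q-network driven by channel state information rather than a lookup table.

```python
import random

# Toy sketch of the offline training idea: each of N_PAIRS D2D pairs is in
# one of two modes (0 or 1), and an action either flips one pair's mode or
# keeps the current selection. The reward would, in the paper, come from
# solving the convex power-control subproblem; a stand-in is used here.
N_PAIRS = 3
N_ACTIONS = N_PAIRS + 1          # actions 0..N_PAIRS-1 flip a pair; last one is a no-op
GAMMA, ALPHA, EPS = 0.9, 0.5, 0.2

rng = random.Random(0)
weights = [1.0, 2.0, 3.0]        # hypothetical proportional-rate weights

def reward(state):
    # Stand-in for the sum proportional rate returned by the convex
    # power-control subproblem (purely illustrative).
    return sum(w for w, m in zip(weights, state) if m == 1)

def step(state, action):
    if action == N_PAIRS:        # no-op: keep the current mode selection
        return state
    new = list(state)
    new[action] ^= 1             # toggle the selected pair's mode
    return tuple(new)

Q = {}                           # (state, action) -> estimated value

def q(s, a):
    return Q.get((s, a), 0.0)

def train(episodes=500, horizon=10):
    for _ in range(episodes):
        state = tuple(rng.randint(0, 1) for _ in range(N_PAIRS))
        for _ in range(horizon):
            if rng.random() < EPS:               # epsilon-greedy exploration
                a = rng.randrange(N_ACTIONS)
            else:
                a = max(range(N_ACTIONS), key=lambda x: q(state, x))
            nxt = step(state, a)
            target = reward(nxt) + GAMMA * max(q(nxt, x) for x in range(N_ACTIONS))
            Q[(state, a)] = (1 - ALPHA) * q(state, a) + ALPHA * target
            state = nxt

def adjust_modes(state, steps=5):
    # Online phase: repeatedly apply the greedy mode-adjustment action.
    for _ in range(steps):
        state = step(state, max(range(N_ACTIONS), key=lambda x: q(state, x)))
    return state
```

In this toy, the greedy online phase walks any initial mode selection toward the all-ones state (the reward-maximizing selection under the stand-in reward) and then holds it with the no-op action, mirroring the paper's split between offline learning and cheap online execution.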

  • [1]
    董园园, 巩彩红, 李华, 等. 面向6G的非正交多址接入关键技术[J]. 移动通信, 2020, 44(6): 57-62. https://www.cnki.com.cn/Article/CJFDTOTAL-YDTX202011004.htm

    DONG Y Y, GONG C H, LI H, et al. The key technologies of non-orthogonal multiple access for 6G systems[J]. Mobile Communications, 2020, 44(6): 57-62. https://www.cnki.com.cn/Article/CJFDTOTAL-YDTX202011004.htm
    [2]
    钱志鸿, 王雪. 面向5G通信网的D2D技术综述[J]. 通信学报, 2016, 37(7): 1-14. https://www.cnki.com.cn/Article/CJFDTOTAL-TXXB201607001.htm

    QIAN Z H, WANG X. Reviews of D2D technology for 5G communication networks[J]. Journal on Communications, 2016, 37(7): 1-14. https://www.cnki.com.cn/Article/CJFDTOTAL-TXXB201607001.htm
    [3]
    NAJMEH M, SHABNAM S. Performance analysis of non-orthogonal multiple access with underlaid device-to-device communications[J]. IEEE Access, 2018, 6: 39820-39826. doi: 10.1109/ACCESS.2018.2855753
    [4]
    PAN Y, PAN C, YANG Z, et al. Resource allocation for D2D communications underlaying a NOMA-based cellular network[J]. IEEE Wireless Communication Letters, 2018, 7(1): 130-133. doi: 10.1109/LWC.2017.2759114
    [5]
    ZHENG H, HOU S, LI H, et al. Power allocation and user clustering for uplink MC-NOMA in D2D underlaid cellular networks[J]. IEEE Wireless Communications Letters, 2018, 7(6): 1030-1033. doi: 10.1109/LWC.2018.2845398
    [6]
    KAZMI S M A, TRAN N H, HO T M, et al. Coordinated device-to-device communication with non-orthogonal multiple access in future wireless cellular networks[J]. IEEE Access, 2018, 6: 39860-39875. doi: 10.1109/ACCESS.2018.2850924
    [7]
    DAI Y, SHENG M, LIU J, et al. Joint mode selection and resource allocation for D2D-enabled NOMA cellular networks[J]. IEEE Transactions on Vehicular Technology, 2019, 68(7): 6721-6733. doi: 10.1109/TVT.2019.2916395
    [8]
    ZHAI D, ZHANG R, WANG Y, et al. Joint user pairing, mode selection, and power control for D2D-capable cellular networks enhanced by nonorthogonal multiple access[J]. IEEE Internet of Things Journal, 2019, 6(5): 8919-8932. doi: 10.1109/JIOT.2019.2924513
    [9]
    BI Z, ZHOU W. Deep reinforcement learning based power allocation for D2D network[C/OL]//2020 IEEE 91st Vehicular Technology Conference. Piscataway, USA: IEEE, 2020[2022-11-10]. https://ieeexplore.ieee.org/decument/9129537. DOI: 10.1109/VTC2020-Spring48590.2020.9129537.
    [10]
    JI Z, KIANI A K, QIN Z, et al. Power optimization in device-to-device communications: A deep reinforcement learning approach with dynamic reward[J]. IEEE Wireless Communications Letters, 2020, 10(3): 508-511.
    [11]
    TANG R, ZHANG R Z, XIA Y M, et al. Joint mode selection and power allocation for NOMA systems with D2D communication[C]//IEEE/CIC International Conference on Communications in China. Piscataway, USA: IEEE, 2021: 606-611.
    [12]
    ZHI Y, TIAN J, DENG X F, et al. Deep reinforcement learning-based resource allocation for D2D communications in heterogeneous cellular networks[J/OL]. Digital Communications and Networks, 2021[2021-11-30]. http://www.sciencedirect.com/science/article/pii/s2352864821000730. DOI: 10.1016/j.dcan.2021.09.013.
    [13]
    CHIANG M, HANDE P, LAN T, et al. Power control in wireless cellular networks[J]. Foundations & Trends in Networking, 2008, 2(4): 381-533.
    [14]
    FRANCOIS-LAVET V, HENDERSON P, ISLAM R, et al. An introduction to deep reinforcement learning[J]. Foundations & Trends in Machine Learning, 2018, 11(3/4): 219-354.
    [15]
    MARAQA O, RAJASEKARAN A S, AL-AHMADI S, et al. A survey of rate-optimal power domain NOMA with enabling technologies of future wireless networks[J]. IEEE Communications Surveys & Tutorials, 2020, 22(4): 2192-2235.
    [16]
    LUO Z Q, ZHANG S. Dynamic Spectrum Management: Complexity and Duality[J]. IEEE Journal of Selected Topics in Signal Processing, 2008, 2(1): 57-73. doi: 10.1109/JSTSP.2007.914876
    [17]
    黄玉蕾, 唐睿, 罗晓霞, 等. LTE-A蜂窝网络下设备直通中的联合信道分配和功率控制方案[J]. 信息与控制, 2017, 46(2): 231-237, 256. doi: 10.13976/j.cnki.xk.2017.0231

    HUANG Y L, TANG R, LUO X X, et al. Joint channel assignment and power control for D2D communication underlaying LTE-A cellular networks[J]. Information and Control, 2017, 46(2): 231-237, 256. doi: 10.13976/j.cnki.xk.2017.0231
    [18]
    BOYD S, VANDENBERGHE L. Convex optimization[M]. Cambrideg, USA: Cambridge University Press, 2004.
    [19]
    邱锡鹏. 神经网络与深度学习[M]. 北京: 机械工业出版社, 2020.

    QIU X P. Neural networks and deep learning[M]. Beijing: China Machine Press, 2020.
    [20]
    TANG R, ZHANG R Z, ZHAO Y H, et al. Power allocation for NOMA-based two-way full-duplex relaying systems[C]//IEEE/CIC International Conference on Communications in China. Piscataway, USA: IEEE, 2021: 1077-1082.
    [21]
    WATKINS C, DAYAN P. Q-learning[J]. Machine Learning, 1992, 8(3/4): 279-292. doi: 10.1023/A:1022676722315
    [22]
    SUTTON R, BARTO A. Reinforcement learning: An introduction[M]. Cambridge, MA: MIT Press, 1998.
    [23]
    MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing Atari with deep reinforcement learning[C/OL]//NIPS Deep Learning Workshop, 2013. (2013-12-09)[2022-05-23]. https://arxiv.org/pdf/1312.5602.pdf.
    [24]
    HASSELT H V, GUEZ A, SILVER D. Deep reinforcement learning with double Q-Learning[C]//AAAI Conference on Artificial Intelligence. Keystone, USA: AAAI, 2016: 2094-2100.
    [25]
    VOLODYMYR M, KORAY K, DAVID S, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-33. doi: 10.1038/nature14236
    [26]
    LI S, XU X, ZUO L. Dynamic path planning of a mobile robot with improved Q-learning algorithm[C]//2015 IEEE International Conference on Information and Automation. Piscataway, USA: IEEE, 2015: 409-414.
    [27]
    LE Q V, NGIAM J, COATES A, et al. On optimization methods for deep learning[C]//28th International Conference on Machine Learning. Madison, USA: Omni Press, 2011: 265-272.
    [28]
    KINGMA D, BA J. Adam: A method for stochastic optimization[C/OL]//2015 3rd International Conference for Learning Representations. San Diego, USA: Brown Walker Press, 2015[2022-03-09]. https://www.oabil.com/paper/4068193.
    [29]
    O'SHEA T, HOYDIS J. An introduction to deep learning for the physical layer[J]. IEEE Transactions on Cognitive Communications and Networking, 2017, 3(4): 563-575. https://ieeexplore.ieee.org/document/8054694
    [30]
    HERBERT S, WASSELL I, LOH T H, et al. Characterizing the spectral properties and time variation of the in-vehicle wireless communication channel[J]. IEEE Transactions on Communications, 2014, 62(7): 2390-2399.
  • Cited by

    Periodical citations (2)

    1. TANG Rui, YUE Shibo, ZHANG Ruizhi, LIU Chuan, PANG Chuanlin. Energy efficiency optimization mechanism in a UAV-assisted NOMA-enabled data collection system[J]. Journal of Computer Applications, 2024(04): 1209-1218.
    2. PANG Chuanlin, TANG Rui, ZHANG Ruizhi, LIU Chuan, LIU Jia, YUE Shibo. Distributed power control algorithm based on graph convolutional networks in D2D communication systems[J]. Journal of Computer Applications, 2024(09): 2855-2862.

    Other citation types (1)
