Citation: TANG Rui, HE Zuhan, ZHANG Ruizhi, YUE Shibo, PANG Chuanlin, HE Jinpu. Hybrid offline-online resource allocation mechanism for D2D-NOMA systems[J]. Information and Control, 2023, 52(5): 574-587. DOI: 10.13976/j.cnki.xk.2023.2307
A device-to-device (D2D) communication-empowered non-orthogonal multiple access (NOMA) system suffers from complex co-channel interference. In this study, we jointly optimize mode selection and power control to maximize the sum proportional bit rate, which balances spectral efficiency and user fairness. Accordingly, we propose a hybrid offline-online mechanism to cope with the resulting mixed-integer non-convex optimization problem. In offline training, a variable transformation equivalently converts the power control subproblem into a convex optimization problem, whose global optimum can be obtained within milliseconds by a standard convex optimization toolbox. Based on the resulting optima, the deep Q-learning algorithm then learns the mapping from the current mode selection scheme and channel state information to the optimal mode adjustment policy. The trained resource allocation mechanism is suitable for online implementation because it involves only simple algebraic operations and a single convex optimization problem. Simulation results show that the proposed mechanism strikes a good balance between performance and operation time: it reduces the average operation time by 94.54% while incurring approximately 10% performance loss relative to the global optimum obtained by exhaustive search.
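For readers who want to prototype the offline power-control step, the sketch below shows the general idea with CVXPY. It is not the authors' formulation: it uses the classic logarithmic change of variables q_i = ln p_i and a high-SINR approximation from the power-control literature (cf. [13]) on a toy interference network whose gains, noise power, and number of links are invented for illustration.

```python
# Minimal sketch of a convexified power-control subproblem (toy values only;
# the variable change follows the classic approach surveyed in [13], not the
# paper's exact transformation).
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
K = 4                                  # hypothetical number of co-channel links
G = rng.uniform(0.05, 0.2, (K, K))     # cross gains: G[i, j] = gain from tx j to rx i
np.fill_diagonal(G, rng.uniform(0.8, 1.2, K))  # direct-link gains
sigma2 = 1e-2                          # receiver noise power
p_max = 1.0                            # per-link power budget

q = cp.Variable(K)                     # q[i] = ln(p[i]); this makes the problem convex

# High-SINR rate of link i: log(SINR_i) = log(G_ii) + q_i
#     - log(sigma2 + sum_{j != i} G_ij * exp(q_j)),
# which is concave in q because log_sum_exp of affine terms is convex.
rates = []
for i in range(K):
    terms = [np.log(sigma2)] + [np.log(G[i, j]) + q[j] for j in range(K) if j != i]
    rates.append(np.log(G[i, i]) + q[i] - cp.log_sum_exp(cp.hstack(terms)))

# Sum-rate objective for simplicity; the paper's proportional-fair objective
# would wrap each rate in cp.log(...), which remains DCP-concave.
prob = cp.Problem(cp.Maximize(sum(rates)), [q <= np.log(p_max)])
prob.solve()
print("optimal powers:", np.exp(q.value))
```

On a toy instance like this, the solver returns in milliseconds, which is the property the offline stage exploits when generating training labels for the deep Q-network.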
[1] DONG Y Y, GONG C H, LI H, et al. The key technologies of non-orthogonal multiple access for 6G systems[J]. Mobile Communications, 2020, 44(6): 57-62. https://www.cnki.com.cn/Article/CJFDTOTAL-YDTX202011004.htm
[2] QIAN Z H, WANG X. Reviews of D2D technology for 5G communication networks[J]. Journal on Communications, 2016, 37(7): 1-14. https://www.cnki.com.cn/Article/CJFDTOTAL-TXXB201607001.htm
[3] MADANI N, SODAGARI S. Performance analysis of non-orthogonal multiple access with underlaid device-to-device communications[J]. IEEE Access, 2018, 6: 39820-39826. doi: 10.1109/ACCESS.2018.2855753
[4] PAN Y, PAN C, YANG Z, et al. Resource allocation for D2D communications underlaying a NOMA-based cellular network[J]. IEEE Wireless Communications Letters, 2018, 7(1): 130-133. doi: 10.1109/LWC.2017.2759114
[5] ZHENG H, HOU S, LI H, et al. Power allocation and user clustering for uplink MC-NOMA in D2D underlaid cellular networks[J]. IEEE Wireless Communications Letters, 2018, 7(6): 1030-1033. doi: 10.1109/LWC.2018.2845398
[6] KAZMI S M A, TRAN N H, HO T M, et al. Coordinated device-to-device communication with non-orthogonal multiple access in future wireless cellular networks[J]. IEEE Access, 2018, 6: 39860-39875. doi: 10.1109/ACCESS.2018.2850924
[7] DAI Y, SHENG M, LIU J, et al. Joint mode selection and resource allocation for D2D-enabled NOMA cellular networks[J]. IEEE Transactions on Vehicular Technology, 2019, 68(7): 6721-6733. doi: 10.1109/TVT.2019.2916395
[8] ZHAI D, ZHANG R, WANG Y, et al. Joint user pairing, mode selection, and power control for D2D-capable cellular networks enhanced by nonorthogonal multiple access[J]. IEEE Internet of Things Journal, 2019, 6(5): 8919-8932. doi: 10.1109/JIOT.2019.2924513
[9] BI Z, ZHOU W. Deep reinforcement learning based power allocation for D2D network[C/OL]//2020 IEEE 91st Vehicular Technology Conference. Piscataway, USA: IEEE, 2020[2022-11-10]. https://ieeexplore.ieee.org/document/9129537. DOI: 10.1109/VTC2020-Spring48590.2020.9129537.
[10] JI Z, KIANI A K, QIN Z, et al. Power optimization in device-to-device communications: A deep reinforcement learning approach with dynamic reward[J]. IEEE Wireless Communications Letters, 2020, 10(3): 508-511.
[11] TANG R, ZHANG R Z, XIA Y M, et al. Joint mode selection and power allocation for NOMA systems with D2D communication[C]//IEEE/CIC International Conference on Communications in China. Piscataway, USA: IEEE, 2021: 606-611.
[12] ZHI Y, TIAN J, DENG X F, et al. Deep reinforcement learning-based resource allocation for D2D communications in heterogeneous cellular networks[J/OL]. Digital Communications and Networks, 2021[2021-11-30]. http://www.sciencedirect.com/science/article/pii/S2352864821000730. DOI: 10.1016/j.dcan.2021.09.013.
[13] CHIANG M, HANDE P, LAN T, et al. Power control in wireless cellular networks[J]. Foundations and Trends in Networking, 2008, 2(4): 381-533.
[14] FRANCOIS-LAVET V, HENDERSON P, ISLAM R, et al. An introduction to deep reinforcement learning[J]. Foundations and Trends in Machine Learning, 2018, 11(3/4): 219-354.
[15] MARAQA O, RAJASEKARAN A S, AL-AHMADI S, et al. A survey of rate-optimal power domain NOMA with enabling technologies of future wireless networks[J]. IEEE Communications Surveys & Tutorials, 2020, 22(4): 2192-2235.
[16] LUO Z Q, ZHANG S. Dynamic spectrum management: Complexity and duality[J]. IEEE Journal of Selected Topics in Signal Processing, 2008, 2(1): 57-73. doi: 10.1109/JSTSP.2007.914876
[17] HUANG Y L, TANG R, LUO X X, et al. Joint channel assignment and power control for D2D communication underlaying LTE-A cellular networks[J]. Information and Control, 2017, 46(2): 231-237, 256. doi: 10.13976/j.cnki.xk.2017.0231
[18] BOYD S, VANDENBERGHE L. Convex optimization[M]. Cambridge, UK: Cambridge University Press, 2004.
[19] QIU X P. Neural networks and deep learning[M]. Beijing: China Machine Press, 2020.
[20] TANG R, ZHANG R Z, ZHAO Y H, et al. Power allocation for NOMA-based two-way full-duplex relaying systems[C]//IEEE/CIC International Conference on Communications in China. Piscataway, USA: IEEE, 2021: 1077-1082.
[21] WATKINS C, DAYAN P. Q-learning[J]. Machine Learning, 1992, 8(3/4): 279-292. doi: 10.1023/A:1022676722315
[22] SUTTON R, BARTO A. Reinforcement learning: An introduction[M]. Cambridge, USA: MIT Press, 1998.
[23] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing Atari with deep reinforcement learning[C/OL]//NIPS Deep Learning Workshop, 2013. (2013-12-09)[2022-05-23]. https://arxiv.org/pdf/1312.5602.pdf.
[24] HASSELT H V, GUEZ A, SILVER D. Deep reinforcement learning with double Q-learning[C]//AAAI Conference on Artificial Intelligence. Keystone, USA: AAAI, 2016: 2094-2100.
[25] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533. doi: 10.1038/nature14236
[26] LI S, XU X, ZUO L. Dynamic path planning of a mobile robot with improved Q-learning algorithm[C]//2015 IEEE International Conference on Information and Automation. Piscataway, USA: IEEE, 2015: 409-414.
[27] LE Q V, NGIAM J, COATES A, et al. On optimization methods for deep learning[C]//28th International Conference on Machine Learning. Madison, USA: Omnipress, 2011: 265-272.
[28] KINGMA D, BA J. Adam: A method for stochastic optimization[C/OL]//3rd International Conference on Learning Representations. San Diego, USA, 2015[2022-03-09]. https://arxiv.org/abs/1412.6980.
[29] O'SHEA T, HOYDIS J. An introduction to deep learning for the physical layer[J]. IEEE Transactions on Cognitive Communications and Networking, 2017, 3(4): 563-575. https://ieeexplore.ieee.org/document/8054694
[30] HERBERT S, WASSELL I, LOH T H, et al. Characterizing the spectral properties and time variation of the in-vehicle wireless communication channel[J]. IEEE Transactions on Communications, 2014, 62(7): 2390-2399.