Reinforcement Learning-Based Data Rate Congestion Control for Vehicular Ad-Hoc Networks

Gnana Shilpa Nuthalapati, University of Windsor


A Vehicular Ad-Hoc Network (VANET) is an emerging wireless technology vital to the Intelligent Transportation System (ITS) for vehicle-to-vehicle and vehicle-to-infrastructure communication. An ITS delivers services for various transportation modes and for traffic management. Its objectives are to enhance user awareness, improve safety by preventing unexpected events, mitigate traffic problems, and enable more efficient and coordinated use of transport networks. When the vehicle density, i.e., the number of vehicles communicating on a wireless channel, increases, the channel becomes congested and safety applications become unreliable. Various decentralized congestion control algorithms have been proposed to reduce channel congestion by controlling transmission parameters such as message rate, transmission power, and data rate. This thesis proposes a data rate-based congestion control technique that uses the Q-Learning algorithm to maintain the channel load below a target threshold. The congestion problem is formulated as a Markov Decision Process (MDP) and solved using Q-Learning, a model-free Reinforcement Learning algorithm that learns the value of each action in a given state without relying on an explicit model of the environment. In Reinforcement Learning, an agent is given a set of states and a set of actions and learns the best action for each state. The goal is to train the vehicle to select the most appropriate data rate for sending a Basic Safety Message (BSM) while keeping the channel load below the target threshold. The Q-Learning agent is trained on data obtained from a simulated dynamic traffic environment. We define a reward function that combines the Channel Busy Ratio (CBR) and the data rate, so that the channel load stays below the target threshold at the lowest possible data rate.
Simulation results show that the proposed algorithm outperforms Transmit Data rate Control (TDRC), Data Rate based Decentralized Congestion Control (DR-DCC), and Data Rate Control Algorithm (DRCA) under low and medium loads, and outperforms TDRC and DR-DCC under heavy load, in terms of Channel Busy Ratio (CBR), packet loss, and Beacon Error Rate (BER).
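
The Q-Learning formulation described above can be illustrated with a minimal tabular sketch. This is not the thesis implementation: the data rate set, the CBR threshold of 0.6, the CBR discretization into ten bins, the learning parameters, the reward shape, and the toy channel model (`toy_cbr`) are all assumptions chosen only to make the idea concrete. The state is a discretized CBR value, the action is a data rate, and the reward penalizes exceeding the threshold while favoring lower data rates.

```python
import random

# All constants below are illustrative assumptions, not values from the thesis.
DATA_RATES_MBPS = [3, 6, 9, 12, 18, 24]  # hypothetical action set
TARGET_CBR = 0.6                         # hypothetical channel load threshold
N_CBR_BINS = 10                          # CBR in [0, 1] split into 10 states
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1    # learning rate, discount, exploration


def cbr_to_state(cbr):
    """Map a CBR value in [0, 1] to a discrete state index."""
    return min(int(cbr * N_CBR_BINS), N_CBR_BINS - 1)


def toy_cbr(n_vehicles, rate_mbps):
    """Toy channel model (assumption): load grows with vehicle count and
    shrinks with data rate, because faster rates shorten airtime."""
    return min(1.0, 0.03 * n_vehicles / rate_mbps)


def reward(cbr, rate_mbps):
    """Penalize CBR above the threshold; otherwise prefer lower data rates."""
    if cbr > TARGET_CBR:
        return -1.0
    return 1.0 - rate_mbps / max(DATA_RATES_MBPS)


def q_update(q, state, action, r, next_state):
    """Standard Q-Learning update on the table q, in place."""
    best_next = max(q[next_state])
    q[state][action] += ALPHA * (r + GAMMA * best_next - q[state][action])


def train(n_vehicles=100, steps=5000, seed=0):
    """Run epsilon-greedy Q-Learning against the toy channel model."""
    rng = random.Random(seed)
    n_actions = len(DATA_RATES_MBPS)
    q = [[0.0] * n_actions for _ in range(N_CBR_BINS)]
    state = cbr_to_state(toy_cbr(n_vehicles, DATA_RATES_MBPS[0]))
    for _ in range(steps):
        if rng.random() < EPSILON:
            action = rng.randrange(n_actions)  # explore
        else:
            action = max(range(n_actions), key=lambda a: q[state][a])  # exploit
        rate = DATA_RATES_MBPS[action]
        cbr = toy_cbr(n_vehicles, rate)
        next_state = cbr_to_state(cbr)
        q_update(q, state, action, reward(cbr, rate), next_state)
        state = next_state
    return q


if __name__ == "__main__":
    q_table = train()
    for s, row in enumerate(q_table):
        best = max(range(len(row)), key=lambda a: row[a])
        print(f"CBR bin {s}: greedy data rate {DATA_RATES_MBPS[best]} Mbps")
```

In this sketch the reward depends only on the observed CBR and the chosen data rate, which mirrors the abstract's description of a reward combining CBR and data rate. A real evaluation would replace `toy_cbr` with CBR measurements from the traffic simulation.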