Papers

Random Access Control in NB-IoT with Model-Based Reinforcement Learning

In NB-IoT, the cell can be divided into up to three coverage enhancement (CE) levels, each associated with a Narrowband Physical Random Access Channel (NPRACH) that has a CE level-specific configuration. Allocating resources to NPRACHs increases the success rate of the random access procedure but diverts resources from other transmissions on the uplink carrier. To address this trade-off effectively, we propose to adjust the NPRACH parameters along with the power thresholds that determine the CE levels, which makes it possible to control both the traffic distribution among CE levels and the resources allocated to each one. Since the traffic is dynamic and random, reinforcement learning (RL) is a suitable approach for finding an optimal control policy, but its inherent sample inefficiency is a drawback for online learning in an operational network. To overcome this issue, we propose a new model-based RL algorithm that achieves high sample efficiency even in the early stages of learning.
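
The model-based idea can be illustrated with a minimal sketch (our own toy example, not the paper's algorithm): the agent fits a simple tabular model of observed rewards and plans against it, reusing every real sample instead of discarding it, which is the source of the sample efficiency mentioned above.

```python
# Toy model-based control loop (illustrative only): learn a tabular model
# of rewards per (state, action) pair and act greedily on the model,
# trying untried actions first. Real NPRACH control is far richer.

def plan(model, state, actions):
    """Prefer untried actions; otherwise pick the best average reward."""
    untried = [a for a in actions if (state, a) not in model]
    if untried:
        return untried[0]
    return max(actions,
               key=lambda a: sum(model[(state, a)]) / len(model[(state, a)]))

def env_step(state, action):
    # Stand-in environment: configuration 1 is best in every state.
    return -abs(action - 1)

model = {}            # (state, action) -> list of observed rewards
actions = [0, 1, 2]   # e.g. candidate NPRACH configurations (hypothetical)
state = "low_load"

for _ in range(20):
    a = plan(model, state, actions)
    model.setdefault((state, a), []).append(env_step(state, a))
```

After a handful of interactions the planner settles on the best configuration without further exploration, since the learned model already ranks all candidates.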

Transmission power allocation in flow-guided nanocommunication networks

Flow-guided electromagnetic nanonetworks hold tremendous potential for transformative medical applications, enabling monitoring, information gathering, and data transmission within the human body. Operating in challenging environments with stringent computational and power constraints within human vascular systems, these nanonetworks face significant hurdles. Successful transmissions between in-body nanonodes and on-body nanorouters are infrequent, requiring novel approaches to enhance network throughput under such circumstances. Traditional flow-guided nanonetworks rely on nanonodes to transmit packets if they possess sufficient energy, irrespective of their proximity to the nanorouter. In this paper, we present an extended model for legacy flow-guided nanonetworks that offers substantial throughput improvements while reducing the required number of nanonodes compared to the baseline blind transmission approach. By allocating transmission energy to allow more than one transmission during a charging cycle, our proposed model significantly enhances network throughput, facilitating the deployment of nanocommunication-supported medical applications. For example, with only two transmissions, it is possible to increase throughput by around 46% with the same number of nanonodes or, equivalently, reduce the number of nanonodes by the same amount to achieve the same throughput.
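
A back-of-envelope sketch (our own, with hypothetical units, not the paper's model) shows the stated equivalence between raising per-node throughput and shrinking the node count, assuming aggregate throughput scales linearly with the number of nanonodes:

```python
def nodes_needed(target_throughput, per_node_rate):
    """Nanonodes required for a target aggregate throughput (linear scaling)."""
    return -(-target_throughput // per_node_rate)  # ceiling division

baseline_rate = 100   # packets per charging cycle per node (hypothetical)
improved_rate = 146   # ~46% gain from allowing two transmissions per cycle
target = 14600        # desired aggregate packets per cycle (hypothetical)
```

Under these numbers, the baseline needs 146 nanonodes for the target while the two-transmission scheme needs 100, mirroring the trade-off described in the abstract.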

Dynamic transmission policy for enhancing LoRa network performance: A deep reinforcement learning approach

Long Range (LoRa) communications, operating through the LoRaWAN protocol, have received increasing attention from the low-power and wide-area network communities. Efficient energy consumption and reliable communication performance are critical aspects of LoRa-based applications. However, current scientific literature tends to focus on minimizing energy consumption while disregarding channel changes affecting communication performance. Other works attain appropriate communication performance without adequately considering energy expenditure. To fill this gap, we propose a novel solution to maximize the energy efficiency of devices while considering the desired network performance. This is done using a maximum allowed Bit Error Rate (BER) that can be specified by users and applications. We characterize this problem as a Markov Decision Process and solve it using Deep Reinforcement Learning to dynamically and quickly select the transmission parameters that jointly satisfy energy and performance requirements over time. Moreover, we support different payload sizes, ensuring suitability for applications with varying packet lengths. The proposed selection of parameters is evaluated in three different scenarios by comparing it with the traditional Adaptive Data Rate (ADR) mechanism of LoRaWAN. The first scenario involves static nodes with varying BER requirements. The second one realistically simulates urban environments with mobile nodes and fluctuating channel conditions. Finally, the third scenario studies the proposed solution under dynamic frame payload length variations. These scenarios cover a wide range of operational conditions to ensure a comprehensive evaluation. The results of our experiments demonstrate that our proposal achieves a 60% improvement in performance metrics over the default ADR mechanism.
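
One plausible way to encode the joint requirement, sketched below as an assumption on our part (the paper's exact reward function is not reproduced here): void the reward whenever the estimated BER exceeds the user-specified ceiling, and otherwise favor low energy use.

```python
def reward(energy_mj, estimated_ber, max_ber):
    """Illustrative reward: hard BER constraint first, then energy efficiency."""
    if estimated_ber > max_ber:
        return -1.0                  # requirement violated: strong penalty
    return 1.0 / (1.0 + energy_mj)   # lower energy -> higher reward
```

With a shaping of this kind, the agent can only improve its return by picking transmission parameters (spreading factor, power, etc.) that meet the BER target while spending as little energy as possible.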

Transmission Control in NB-IoT With Model-Based Reinforcement Learning

In Narrowband Internet of Things (NB-IoT), the control of uplink transmissions is a complex task involving device scheduling, resource allocation in the carrier, and the configuration of link-adaptation parameters. Existing heuristic proposals partially address the problem, but reinforcement learning (RL) seems, a priori, to be the most effective approach, given its success in similar control problems. However, the low sample efficiency of conventional (model-free) RL algorithms is an important limitation for their deployment in real systems. During their initial learning stages, RL agents need to explore the policy space by selecting actions that are, in general, highly ineffective. In an NB-IoT access network, this implies a disproportionate increase in transmission delays. In this paper, we make two contributions to enable the adoption of RL in NB-IoT. First, we present a multi-agent architecture based on the principle of task division. Second, we propose a new model-based RL algorithm for link adaptation characterized by its high sample efficiency. The combination of these two strategies results in an algorithm that, during the learning phase, is able to keep the transmission delay in the order of hundreds of milliseconds, whereas model-free RL algorithms cause delays of up to several seconds. This allows our approach to be deployed, without prior training, in an operating NB-IoT network, where it learns to control the network efficiently without degrading its performance.

Dynamic Multihop Routing in Terahertz Flow-Guided Nanosensor Networks: A Reinforcement Learning Approach

The Internet of Nano-Things (IoNT) is an emerging paradigm in which devices sized to the nanoscale (nanonodes) and transmitting in the terahertz (THz) band can become decisive actors in future medical applications. Flow-guided nanonetworks are well-known THz networks aimed at deploying the IoNT inside the human body, among other applications. In these networks, nanonodes flowing through the bloodstream monitor sensitive biological/physical parameters and dispatch these data via electromagnetic (EM) waves to a nanorouter implanted in human tissue, which operates as a gateway to external Internet connectivity devices. Under these premises, two shortcomings arise. First, the use of the THz band greatly limits the nanonode's communication range. Second, the nanonodes lack processing, memory, and battery resources. To minimize the impact of these concerns in EM nanocommunications, a novel dynamic multihop routing scheme is proposed to model an in-body, flow-guided nanonetwork architecture. To this end, a reinforcement learning-based framework is conceived, combining the features of EM nanocommunications and hemodynamics (fluid dynamics applied to the bloodstream). A generic Markov decision process (MDP) approach is derived to maximize the throughput metric, analytically modeling: 1) the movement of the nanonodes in the bloodstream as laminar flow; 2) energy consumption (including energy-harvesting issues); and 3) prioritized events. A thorough THz flow-guided nanonetwork case study is also defined. Within this case study, diverse testbeds are planned to create a procedure of evaluation, validation, and discussion. Results reveal that multihop scenarios achieve better performance than direct nanonode-nanorouter communication; in particular, the two-hop scenario quadrupled the throughput in a hand vein without sharply penalizing other aspects such as energy consumption.
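
The benefit of relaying can be made intuitive with a toy calculation (the probabilities below are our own illustrative assumptions, not the paper's measurements): because THz attenuation grows steeply with distance, the success probability of a short relay hop can far exceed that of a long direct link, even though a relayed packet must survive two hops.

```python
def direct_throughput(p_direct, packets):
    """Packets delivered straight to the nanorouter over one long link."""
    return packets * p_direct

def two_hop_throughput(p_hop, packets):
    """A relayed packet must succeed on both short hops."""
    return packets * p_hop * p_hop

packets = 1000
p_direct = 0.01   # long THz link: very low success probability (hypothetical)
p_hop = 0.20      # short relay hop: much higher success (hypothetical)
```

With these illustrative numbers, relaying delivers 40 packets versus 10 for the direct link, i.e., a fourfold gain of the same order as the quadrupling reported in the abstract.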

Bridging Nano- and Body Area Networks: A Full Architecture for Cardiovascular Health Applications

Cardiovascular events occurring in the bloodstream are responsible for about 40% of human deaths in developed countries. Motivated by this fact, we present a new global network architecture for the diagnosis and treatment of cardiovascular events, focusing on problems related to pulmonary artery occlusion, i.e., situations in which an artery is blocked by a blood clot. The proposed system is based on bio-sensors for detecting artery blockage and bio-actuators for releasing appropriate medicines, both types of devices being implanted in pulmonary arteries. The system can be used by a person leading an active life and provides bidirectional communication with medical personnel via nano-nodes circulating in the bloodstream, which constitute an in-body area network. We derive an analytical model for calculating the number of nano-nodes required to detect artery blockage and the probability of activating a bio-actuator. We also analyze the performance of the body area component of the system in terms of path loss and wireless link budget. Results show that the system can diagnose a blocked artery in about 3 hours and that, after another 3 hours, medicines can be released at the exact spot of the artery occlusion, whereas with current medical practices the average time for diagnosis varies between 5 and 9 days.

Model-Based Reinforcement Learning with Kernels for Resource Allocation in RAN Slices

This paper addresses the dynamic allocation of RAN resources among network slices, aiming to maximize resource efficiency while ensuring the fulfillment of the service level agreement (SLA) of each slice. This is a challenging stochastic control problem, since slices are characterized by multiple random variables and several objectives must be managed in parallel. Moreover, coexisting slices can have different descriptors and behaviors according to their type of service (e.g., enhanced mobile broadband, eMBB, or massive machine-type communication, mMTC). Most existing proposals for this problem use a model-free reinforcement learning (MFRL) strategy. The main drawback of MFRL algorithms is their low sample efficiency which, in an online learning scenario (i.e., when the agents learn on an operating network), may lead to long periods of resource over-provisioning and frequent SLA violations. To overcome this limitation, we follow a model-based reinforcement learning (MBRL) approach built upon a novel modeling strategy that comprises a kernel-based classifier and a self-assessment mechanism. In numerical experiments, our proposal, referred to as kernel-based RL (KBRL), clearly outperforms state-of-the-art RL algorithms in terms of SLA fulfillment, resource efficiency, and computational overhead.
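
A minimal sketch of the kernel-classifier ingredient (our own reading of the idea; the feature, data, and function names are hypothetical): score a candidate allocation by its RBF-kernel similarity to past allocations that did or did not meet the SLA.

```python
import math

def rbf(x, y, gamma=1.0):
    """Radial basis function kernel on a scalar allocation feature."""
    return math.exp(-gamma * (x - y) ** 2)

def predict_sla_ok(history, allocation):
    """history: list of (allocation, sla_met) observations (toy data).

    Positive kernel-weighted score -> the allocation resembles past
    SLA-fulfilling allocations more than past violations.
    """
    score = sum(rbf(a, allocation) * (1.0 if ok else -1.0)
                for a, ok in history)
    return score > 0

# Hypothetical observations: small allocations violated the SLA,
# large ones fulfilled it.
history = [(2.0, False), (4.0, False), (8.0, True), (10.0, True)]
```

A planner can then search the allocation space for the smallest allocation that the classifier predicts will fulfill the SLA, which connects the classifier to the efficiency objective in the abstract.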

Multi-Agent Deep Reinforcement Learning to Manage Connected Autonomous Vehicles at Tomorrow’s Intersections

In recent years, the growing development of Connected Autonomous Vehicles (CAVs), Intelligent Transport Systems (ITS), and 5G communication networks has led to the advent of Autonomous Intersection Management (AIM) systems. AIMs present a new paradigm for CAV control in future cities, taking control of CAVs in scenarios where cooperation is necessary and allowing safe and efficient traffic flows while eliminating traffic signals. So far, the development of AIM algorithms has been based on basic control algorithms, without the ability to adapt to or keep learning from new situations. To address this, in this paper we present a new advanced AIM approach based on end-to-end Multi-Agent Deep Reinforcement Learning (MADRL) and trained using Curriculum through Self-Play, called advanced Reinforced AIM (adv.RAIM). adv.RAIM enables the control of CAVs at intersections in a collaborative way, autonomously learning complex real-life traffic dynamics. In addition, adv.RAIM provides a new way to build smarter AIMs capable of proactively controlling CAVs in other highly complex scenarios. Results show remarkable improvements over traffic light control techniques (reducing travel time by 59% and time lost due to congestion by 95%), as well as over other recently proposed AIMs (reducing waiting time by 56%), highlighting the advantages of using MADRL.
