10h30 - 10h55
Renewal theory based reinforcement learning for Markov processes with controlled restarts
Markov processes with controlled restarts arise in networked control systems.
Under a threshold-based strategy, such processes are regenerative. Therefore,
the optimal performance can be written in terms of the performance
during a regenerative cycle. We exploit this relationship to develop a
sample-path based policy gradient algorithm.
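
The renewal relationship mentioned above can be illustrated with a minimal sketch: for a regenerative process, the long-run average cost equals the expected cost accumulated over one cycle divided by the expected cycle length. The toy model below is an assumption made purely for illustration and is not taken from the talk: a counter that increases by one with probability p_up at each step, a per-step cost equal to the current state, and a restart to zero (at cost restart_cost) once the counter reaches the threshold. All function and parameter names are hypothetical.

```python
import random


def simulate_cycle(threshold, rng, p_up=0.5, restart_cost=5.0):
    """Simulate one regenerative cycle of the toy process (hypothetical model).

    Returns the total cost and the number of steps in the cycle,
    including the final restart step.
    """
    x = 0
    cost = 0.0
    length = 0
    while x < threshold:
        cost += x          # running cost equals the current state
        length += 1
        if rng.random() < p_up:
            x += 1         # the state drifts upward with probability p_up
    # Threshold reached: pay the restart cost and reset, which ends the cycle.
    cost += restart_cost
    length += 1
    return cost, length


def average_cost(threshold, n_cycles, seed=0, p_up=0.5, restart_cost=5.0):
    """Renewal-reward estimate: total cycle cost divided by total cycle length."""
    rng = random.Random(seed)
    total_cost = 0.0
    total_length = 0
    for _ in range(n_cycles):
        c, n = simulate_cycle(threshold, rng, p_up, restart_cost)
        total_cost += c
        total_length += n
    return total_cost / total_length


if __name__ == "__main__":
    # Compare a few thresholds on the toy model.
    for k in range(1, 6):
        print(k, average_cost(k, n_cycles=10_000))
```

Because the estimate is a ratio of two per-cycle averages, a sample-path method can in principle estimate how each of them changes with the policy parameter and combine the two; the sketch only evaluates the ratio and does not attempt the gradient step.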
10h55 - 11h20
CANCELLED / A Markov-modulated End-to-end Delay Analysis of Large-scale RF-Mesh Networks with Time-slotted ALOHA and FHSS
A new mathematical model and a methodology are proposed to evaluate the performance of large-scale
RF-Mesh Networks that use time-slotted ALOHA with Frequency Hopping Spread Spectrum. An
analytic formulation for the delay, based on Markov-modulated modelling of the system, is derived. The
formula can be extended to evaluate other important performance metrics. The proposed methodology
is applied to a large-scale network of several thousand nodes, and numerical results are reported to
show the wide variety of performance evaluations that it enables. The methodology is also shown to be useful for assessing
the feasibility of different types of applications (e.g., smart metering, sensor networks, IoT).
An analysis of the scalability of this methodology and a comparison with simulation results are also presented.
11h20 - 11h45
Global Inventory Planning with Loosely Coupled Markov Decision Processes
We present a general approach to planning the inventory levels of slow-moving items when service-level targets apply to a set of items rather than to each item individually. Loosely coupled Markov decision processes are used within a column generation algorithm, with the objective of minimizing overall costs while satisfying the service-level targets.