When and Which Sensor to Observe? Timely Tracking of a Joint Markov Source
2026-06-29 • Information Theory
Information Theory, Networking and Internet Architecture
AI summary
The authors study when a monitor should request data, and from which sensor, in order to track a system whose state changes over time. The sensors have different sampling costs, and responses travel over an erasure channel with a fixed one-slot delay, so the monitor must balance fresh information against cost. The authors build a belief from the full history of requests and observations, capturing the joint distribution of the age and the current state, and then apply two model predictive control (MPC) methods, one of which incorporates reinforcement learning. Numerical examples indicate that these approaches manage the trade-off between estimation accuracy and sampling cost.
Remote Estimation, Markov Process, Sensors, Age of Incorrect Information (AoII), Erasure Channel, Belief State, Markov Decision Process (MDP), Model Predictive Control (MPC), Reinforcement Learning (RL)
Authors
Ismail Cosandal, Sennur Ulukus, Nail Akar
Abstract
We investigate the problem of remote estimation (at a monitor) of a discrete-time joint Markov process with individual components which can be observed with dedicated sensors. At a given time slot, the monitor has the option of staying idle or sending a pull request to one of the sensors to obtain a partial state value, while the sensors are assumed to have heterogeneous sampling costs. Our goal is to develop a monitor pull policy, i.e., determining when and towards which sensor to send a pull request, in order to minimize a weighted sum of average age of incorrect information (AoII), or in short age, and sampling costs. As the communication model, we assume an erasure channel with a fixed one-slot delay from each sensor to the monitor. In this setting, the monitor does not perfectly know either the state of the process or the age, at any given time. We first obtain a sufficient statistic, namely belief, representing the joint distribution of the age and the current state of the observed process, by using the history of all pull requests and observations. Then, we formulate the optimization problem as a continuous state-space Markov decision process (MDP), namely belief-MDP, for the solution of which we propose two model predictive control (MPC) methods, namely MPC without terminal costs (MPC-WTC), and reinforcement learning MPC (RL-MPC). The effectiveness of the proposed methods is validated by numerical examples.
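To make the objective concrete, the following is a minimal sketch of how the age of incorrect information and the weighted cost could be computed from a recorded trajectory. It assumes the commonly used linear AoII, which grows by one in each slot where the estimate differs from the true state and resets to zero when they match. The function names, the per-slot cost list, and the way the weight enters the sum are illustrative assumptions rather than the authors' notation. The sketch also takes an omniscient view that sees the true state; in the setting of the abstract the monitor knows neither the state nor the age exactly and works with a belief instead.

```python
from typing import List, Sequence


def aoii_trajectory(states: Sequence[int], estimates: Sequence[int]) -> List[int]:
    """Return the per-slot AoII for a true-state and an estimate sequence.

    Assumption: linear AoII, i.e. the age resets to 0 when the estimate
    matches the true state and otherwise increases by 1.
    """
    if len(states) != len(estimates):
        raise ValueError("states and estimates must have the same length")
    ages = []
    age = 0
    for x, x_hat in zip(states, estimates):
        age = 0 if x == x_hat else age + 1
        ages.append(age)
    return ages


def average_cost(ages: Sequence[int], pull_costs: Sequence[float], weight: float) -> float:
    """Time-averaged AoII plus `weight` times the time-averaged sampling cost.

    Assumption: `pull_costs[t]` is the cost paid in slot t (0 when idle),
    and `weight` is a hypothetical trade-off parameter.
    """
    if len(ages) != len(pull_costs) or not ages:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(ages)
    return sum(ages) / n + weight * sum(pull_costs) / n


if __name__ == "__main__":
    states = [0, 1, 1, 0, 0]      # true source states (hypothetical)
    estimates = [0, 0, 0, 0, 1]   # monitor estimates (hypothetical)
    pulls = [0, 0, 2, 0, 0]       # one pull of cost 2 in the third slot
    ages = aoii_trajectory(states, estimates)
    print(ages, average_cost(ages, pulls, weight=0.5))
```

A pull policy in the sense of the abstract would choose the entries of the cost list online, using only the belief, to keep this long-run average small.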