Cooperative Multi-Agent Constrained POMDPs: Strong Duality and Primal-Dual Reinforcement Learning with Approximate Information States. (arXiv:2307.16536v1 [math.OC])

We study decentralized constrained POMDPs in a team setting where multiple
non-strategic agents have asymmetric information. Strong duality is
established for infinite-horizon expected total discounted costs when the
observations lie in a countable space, the actions are chosen from a finite
space, and the immediate cost functions are bounded. Connections with the
common-information and approximate information-state approaches are then
established. The approximate information states are characterized
independently of the Lagrange-multiplier vector, so that adapting the
multipliers during learning does not necessitate new representations.
Finally, a primal-dual multi-agent reinforcement learning (MARL) framework is
developed, based on centralized training with distributed execution (CTDE)
and three time-scale stochastic approximation, using recurrent and
feedforward neural networks as function approximators.
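
For readers unfamiliar with the term, the generic shape of a strong-duality statement for a constrained discounted-cost problem can be sketched as follows. The notation is an assumption made for illustration and is not taken from the paper: \pi denotes a joint (team) policy, \alpha a discount factor, c_0 the objective cost, c_1, ..., c_K the constraint costs with thresholds \kappa_k, and \lambda the Lagrange-multiplier vector. The paper's exact formulation, policy class, and assumptions may differ.

```latex
% Illustrative notation only (assumed, not the paper's).
% Expected total discounted cost of joint policy \pi for cost c_k:
J_k(\pi) = \mathbb{E}^{\pi}\!\left[ \sum_{t=0}^{\infty} \alpha^{t}\, c_k(S_t, A_t) \right],
\qquad \alpha \in (0,1), \quad k = 0, 1, \dots, K.

% Constrained problem:
\min_{\pi} \; J_0(\pi)
\quad \text{subject to} \quad
J_k(\pi) \le \kappa_k, \quad k = 1, \dots, K.

% Lagrangian with multiplier vector \lambda \ge 0:
L(\pi, \lambda) = J_0(\pi) + \sum_{k=1}^{K} \lambda_k \bigl( J_k(\pi) - \kappa_k \bigr).

% Strong duality: the primal and dual optimal values coincide.
\inf_{\pi} \, \sup_{\lambda \ge 0} L(\pi, \lambda)
\;=\;
\sup_{\lambda \ge 0} \, \inf_{\pi} L(\pi, \lambda).
```

When such an equality holds, a primal-dual method can alternate between improving the policy for a fixed \lambda and adjusting \lambda according to constraint violation, which is the general pattern that the primal-dual framework named in the abstract follows.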

Latest Change: Aug. 1, 2023, 7:32 a.m.

/u/anonymous

Dictionaries:

Words:

Spaces:

CC:
No creative common's license

Comments: