Contractive Opinion Dynamics with Decision-Dependent Transitions: Steady-State Analysis and PPO-Based Policy Exploration

Ricciardi Celsi, Lorenzo
2026-01-01

Abstract

We study the control of opinion dynamics in large populations when a strategic decision maker repeatedly broadcasts public messages that shape the evolution of opinions. The population state is modeled as a probability distribution over finitely many ideological bins, and its evolution defines a decision-dependent Markov chain on the probability simplex. On the theoretical side, we introduce a class of contractive opinion dynamics in which the linear part combines inertia and peer influence, while an action-dependent source term models the direct impact of messaging. Under explicit contractivity and regularity assumptions, we prove that the deterministic dynamics associated with any fixed policy admit a unique steady-state distribution and extend this result to the stochastic case by showing the existence and uniqueness of an invariant measure on the simplex. On the empirical side, we employ Proximal Policy Optimization (PPO) as a reinforcement learning mechanism for the decision maker. Numerical experiments indicate that PPO can learn communication strategies that substantially reduce long-run polarization and keep the opinion distribution near moderately centralized regimes, although no general convergence guarantees are provided. The simulations further show that the qualitative long-term behavior of the population depends strongly on the messaging pattern: fact-checking and moderate messages tend to promote centralization, whereas predominantly provocative messaging sustains bimodal and more polarized steady or weakly oscillatory distributions. The results illustrate how contraction-based analysis and policy-gradient methods can be combined to study decision-dependent stochastic systems on the space of distributions.
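The abstract does not state the update rule, so the following is only a minimal sketch of what a contractive, action-dependent opinion update on the simplex could look like. It assumes an affine map x' = (1 - beta) * (alpha * x + (1 - alpha) * P^T x) + beta * s(a), where alpha weights inertia against peer influence, P is a row-stochastic peer-influence matrix, and s(a) is a probability vector standing in for the message. All names, matrices, and parameter values here are hypothetical and are not taken from the paper. Under these assumptions the map is a contraction in the L1 norm with factor (1 - beta), so fixed-point iteration reaches the same steady state from any initial distribution.

```python
# Hypothetical illustration: the update rule, P, and the message vectors
# below are assumptions for this sketch, not the model used in the paper.

def step(x, P, s, alpha=0.6, beta=0.2):
    """One update: x' = (1 - beta) * (alpha*x + (1 - alpha)*P^T x) + beta*s."""
    n = len(x)
    # Peer influence: mass in bin i moves to bin j with probability P[i][j].
    peer = [sum(P[i][j] * x[i] for i in range(n)) for j in range(n)]
    return [(1 - beta) * (alpha * x[j] + (1 - alpha) * peer[j]) + beta * s[j]
            for j in range(n)]

def l1(u, v):
    """L1 distance between two distributions."""
    return sum(abs(a - b) for a, b in zip(u, v))

def steady_state(x, P, s, tol=1e-12, max_iter=10_000):
    """Iterate the map until successive states differ by less than tol in L1."""
    for _ in range(max_iter):
        nxt = step(x, P, s)
        if l1(nxt, x) < tol:
            return nxt
        x = nxt
    return x

# Row-stochastic peer-influence matrix over three bins: left, center, right.
P = [[0.8, 0.2, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.2, 0.8]]

# Two fixed (constant-action) policies, encoded as source distributions.
MODERATE = [0.1, 0.8, 0.1]
PROVOCATIVE = [0.5, 0.0, 0.5]

# Same policy, different initial conditions: the steady state should coincide.
x_a = steady_state([1.0, 0.0, 0.0], P, MODERATE)
x_b = steady_state([0.0, 0.0, 1.0], P, MODERATE)

# A different policy yields a different steady state.
x_c = steady_state([1.0, 0.0, 0.0], P, PROVOCATIVE)
```

In this toy setting the provocative source leaves less mass in the center bin than the moderate one, which mirrors the qualitative contrast described in the abstract without reproducing the paper's experiments.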
2026
Convergence
Opinion dynamics
Proximal policy optimization
Reinforcement learning

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12606/47605