Indexed on: 05 Feb '16
Published on: 05 Feb '16
Published in: Mathematics - Probability
We propose a model of network formation based on reinforcement learning, which can be seen as a generalization of the one proposed by Skyrms for signaling games. On a discrete graph whose vertices represent individuals, at each time step every individual picks one of its neighbors with probability proportional to their past number of communications; independently, Nature chooses, independently and with the same distribution at every time step, which individuals are allowed to communicate. A communication occurs when two neighbors mutually pick each other and are both allowed by Nature to communicate. Our results generalize those obtained by Hu, Skyrms and Tarrès (2011). We prove that, up to an error term, the expected rate of communications increases on average, and thus converges a.s. Define the limit graph as the non-oriented subgraph whose edges are the pairs of vertices communicating at a positive asymptotic rate. Then, for stable configurations in which every vertex is connected to at least one other vertex, the connected components of this limit graph are star-shaped and satisfy a certain balance condition. Conversely, given any stable equilibrium $q$ whose associated graph satisfies that property, the occupation measure converges with positive probability to a stable equilibrium in a neighborhood of $q$ with the same limit graph.
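The dynamics described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch, not the authors' construction. The function name `simulate` and its parameters are hypothetical. It assumes that every edge starts with the same positive weight `init_weight`, and that Nature allows each vertex independently with probability `p_allowed`; the abstract permits a more general distribution for Nature's choice. It also assumes integer vertex labels, a symmetric neighbor relation, and at least one neighbor per vertex.

```python
import random

def simulate(neighbors, steps, p_allowed=0.5, init_weight=1.0, seed=0):
    """Count communications per edge under a reinforcement dynamic.

    neighbors: dict mapping each vertex to a list of its neighbors
    (assumed symmetric, with at least one neighbor per vertex).
    """
    rng = random.Random(seed)
    # weights[i][j] = assumed initial weight + past communications of i and j
    weights = {i: {j: init_weight for j in nbrs} for i, nbrs in neighbors.items()}
    counts = {}
    for _ in range(steps):
        # Each vertex picks a neighbor with probability proportional to weight.
        picks = {
            i: rng.choices(list(w), weights=list(w.values()))[0]
            for i, w in weights.items()
        }
        # Assumption: Nature allows each vertex independently, i.i.d. in time.
        allowed = {i for i in neighbors if rng.random() < p_allowed}
        for i, j in picks.items():
            # A communication needs a mutual pick and both vertices allowed.
            if i < j and picks[j] == i and i in allowed and j in allowed:
                weights[i][j] += 1
                weights[j][i] += 1
                counts[(i, j)] = counts.get((i, j), 0) + 1
    return counts

if __name__ == "__main__":
    triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    print(simulate(triangle, steps=10_000))
```

Dividing each count by `steps` gives an empirical communication rate per edge, which can be compared informally with the limit graph described above. A finite run of this sketch proves nothing about the asymptotic results.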