The Reward Function and the Least Cost Principle for Gravitation and other Laws of Physics
Abstract
If the universe follows a specific design, then a central question is which cost function is optimized by the observed forces. This is the problem of inverse optimal control, or inverse reinforcement learning, in which a reward function is inferred from the dynamics of the observed system. We first establish the least cost principle, whereby the laws of motion can be derived by minimizing a time-discounted integral of the acceleration cost minus a state-dependent reward function. After determining the functional form of the acceleration cost from basic principles, we infer the reward function from the laws of motion governing classical gravitation and Coulomb forces. The inferred reward function is high when pairs of particles have high relative velocities and when their relative motion is orthogonal to their distance vectors. Our work suggests that relative motion and quasi-circular orbits are the dynamical and static features optimized by central forces in nature.
Source: arXiv:2603.25444v1 - http://arxiv.org/abs/2603.25444v1 PDF: https://arxiv.org/pdf/2603.25444v1