    Reinforcement Learning Basics

    Policies

    A policy is a rule that determines which action to take. A deterministic policy is typically denoted $μ$. When the action is selected stochastically, the policy is instead written $π(⋅|s_t)$, a distribution over actions given the state $s_t$ at timestep $t$.

    For a stochastic policy, the action is typically sampled from a categorical distribution if the action space is discrete, and from a Gaussian distribution if the action space is continuous.

    Value Functi..
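
To make the sampling rule from the Policies part above concrete, here is a minimal sketch using only the Python standard library. The function names, the use of logits for the discrete case, and the diagonal Gaussian parameterized by a mean and a log standard deviation are all assumptions made for illustration; the original text does not specify how the distributions are parameterized.

```python
import math
import random


def sample_discrete_action(logits, rng):
    """Sample an action index from a categorical distribution.

    `logits` are unnormalized scores (an assumed parameterization).
    """
    # Softmax with max-subtraction for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index with probability proportional to its weight.
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_continuous_action(mean, log_std, rng):
    """Sample each action dimension from an independent Gaussian.

    `mean` and `log_std` are per-dimension lists (assumed diagonal Gaussian).
    """
    return [rng.gauss(mu, math.exp(ls)) for mu, ls in zip(mean, log_std)]


rng = random.Random(0)  # seeded only so repeated runs give the same draws

# Discrete action space with three actions: a ~ pi(.|s_t), categorical.
discrete_action = sample_discrete_action([2.0, 0.5, -1.0], rng)

# Continuous two-dimensional action space: a ~ pi(.|s_t), Gaussian.
continuous_action = sample_continuous_action([0.0, 1.0], [-0.5, -0.5], rng)

print(discrete_action, continuous_action)
```

In practice the logits, mean, and log standard deviation would usually be produced by a function of the state $s_t$ (for example a neural network); here they are hard-coded to keep the sketch self-contained.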