Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL
Published in Advances in Neural Information Processing Systems (NeurIPS), 2023
Identifies the major causes of Q-value divergence in offline RL and presents a comprehensive theory together with an elegant solution (LayerNorm).