Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL

Published in Advances in Neural Information Processing Systems (NeurIPS), 2023

Identifies the major causes of Q-value divergence in offline RL, and provides a comprehensive theory together with an elegant solution (LayerNorm).
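
As a rough illustration only, and not the paper's implementation, the sketch below shows one generic property of layer normalization: the normalized hidden vector has a norm of at most sqrt(n), however large the incoming activations become. The helper names `layer_norm` and `l2_norm` are hypothetical, and the learned scale and shift parameters are omitted for brevity.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (approximately) unit variance.

    Assumption: no learned scale/shift, unlike a full LayerNorm layer.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def l2_norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(v * v for v in x))

# A made-up hidden activation vector, blown up by increasingly large factors.
hidden = [0.5, -1.0, 2.0, 0.25]
for scale in (1.0, 100.0, 10000.0):
    scaled = [scale * v for v in hidden]
    # The raw norm grows with the scale; the normalized norm stays below sqrt(n).
    print(scale, l2_norm(scaled), l2_norm(layer_norm(scaled)))
```

The raw norm grows linearly with the scale factor, while the normalized norm never exceeds sqrt(4) = 2 for this four-dimensional example.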