Martin Waltz and Ostap Okhrin. Spatial–temporal recurrent reinforcement learning for autonomous ships. Neural Networks, 165:634-653, 2023.