Global optimality of Elman-type RNNs in the mean-field regime

Abstract

We analyze Elman-type recurrent neural networks (RNNs) and their training in the mean-field regime. Specifically, we show convergence of gradient descent training dynamics of the RNN to the corresponding mean-field formulation in the large-width limit. We also show that the fixed points of the limiting infinite-width dynamics are globally optimal, under some assumptions on the initialization of the weights. Our results establish optimality for feature learning with wide RNNs in the mean-field regime.
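
For orientation, the following is a minimal sketch of the forward pass of a width-n Elman-type RNN with mean-field (1/n) scaling. It assumes a generic parametrization that may differ from the one analyzed in the paper: scalar inputs, a tanh activation, a zero initial hidden state, and Gaussian initialization. The function names `elman_mean_field_forward` and `random_params` are hypothetical.

```python
import math
import random


def elman_mean_field_forward(xs, W, U, a):
    """Run a width-n Elman RNN with 1/n scaling over a scalar input sequence.

    xs: list of scalar inputs x_1..x_T
    W:  n x n recurrent weights; U: n input weights; a: n readout weights
    Returns the list of scalar outputs y_1..y_T.
    """
    n = len(a)
    h = [0.0] * n  # assumed initial hidden state h_0 = 0
    ys = []
    for x in xs:
        # Recurrent input is an average over the n hidden units
        # (1/n scaling rather than 1/sqrt(n)); tanh is an assumed activation.
        h = [
            math.tanh(sum(W[i][j] * h[j] for j in range(n)) / n + U[i] * x)
            for i in range(n)
        ]
        # The readout is likewise an average over hidden units.
        ys.append(sum(a[i] * h[i] for i in range(n)) / n)
    return ys


def random_params(n, seed=0):
    """Draw i.i.d. standard Gaussian weights (an assumed initialization)."""
    rng = random.Random(seed)
    W = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n)]
    U = [rng.gauss(0, 1) for _ in range(n)]
    a = [rng.gauss(0, 1) for _ in range(n)]
    return W, U, a
```

Because both sums are divided by n, each output is an empirical average over hidden units; this is the kind of quantity for which a large-width limit can be formulated.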

BibTeX key: pmlr-v202-agazzi23a
entry type: inproceedings
booktitle: Proceedings of the 40th International Conference on Machine Learning
year: 2023
month: 23--29 Jul
pages: 196--227
publisher: PMLR
series: Proceedings of Machine Learning Research
volume: 202
pdf: https://proceedings.mlr.press/v202/agazzi23a/agazzi23a.pdf
Document: https://proceedings.mlr.press/v202/agazzi23a.html
