Machine Learning

The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit
Lorenzo Noci