Energy-Based Transformers Are Scalable Learners and Thinkers (alexiglad.github.io)
2 points by cs702 6 hours ago
See also https://www.reddit.com/r/MachineLearning/comments/1lu1ia0/r_...