RNNs vs. Transformers: Innovations in Scalability and Efficiency
Posted January 14, 2025 by Gating
Categories: ai-research, deep-learning, efficient-ai, linear-attention, rnn-models, scalable-ai, ssm-models, transformers

Training Speed on Longer Sequences
Posted January 14, 2025 by Gating
Categories: ai-models, deep-learning, hawk-and-griffin-models, language-models, nlp-research, rnn-models, scalable-ai, transformers

Efficient Linear Recurrences on Device
Posted January 14, 2025 by Gating
Categories: ai-research, custom-kernel, deep-learning, efficient-training, hawk-model, rg-lru-layer, scalable-ai, tpu-optimization

Efficient Training: Scaling Griffin Models for Large-Scale AI on TPUs
Posted January 14, 2025 by Gating
Categories: ai-model-scaling, ai-research, deep-learning, efficient-training, griffin-model, model-parallelism, scalable-ai, tpu-optimization

Griffin Models: Outperforming Transformers with Scalable AI Innovation
Posted January 13, 2025 by Gating
Categories: ai-research, chinchilla-scaling, deep-learning, efficient-ai, griffin-model, rnn-models, scalable-ai, transformers

RNN Models Hawk and Griffin: Transforming NLP Efficiency and Scaling
Posted January 13, 2025 by Gating
Categories: ai-models, deep-learning, efficient-ai, language-models, multi-query-attention, nlp-research, rnn-model, scalable-ai