Simplifying Transformer Blocks: Block Layouts

Reparameterising value and projection parameters in linear layers via the duality between downweighted residuals and restricted updates optimizes learning rates and model performance.


This content originally appeared on HackerNoon and was authored by Auto Encoder: How to Ignore the Signal Noise

:::info Authors:

(1) Bobby He, Department of Computer Science, ETH Zurich (Correspondence to: bobby.he@inf.ethz.ch.);

(2) Thomas Hofmann, Department of Computer Science, ETH Zurich.

:::

Abstract and Introduction

Related Work

Preliminaries

Simplifying Transformer Blocks

Further Experimental Analysis

Discussion, Reproducibility Statement, Acknowledgements and References

A Duality Between Downweighted Residual and Restricting Updates In Linear Layers

B Block Layouts

C Additional Experiments

D Implementation Details

B BLOCK LAYOUTS

In Fig. 9 and Fig. 10 we show the layouts of our SAS block (Sec. 4.2) and parallel SAS-P block (Sec. 4.3). These are the equivalent plots to the layouts in Fig. 1. Mathematically, our SAS attention sub-block computes (in the notation of Eq. (2)):
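As a rough illustration of what such a sub-block computes, the NumPy sketch below assumes the shaped-attention form of Sec. 4.2: each head applies the mixture `alpha * I + beta * A - gamma * C` directly to its own column block of the input, with no value or projection matrices and no skip connection. The function name `sas_attention`, the argument shapes, the causal mask, and the choice of `C` as uniform causal attention (the attention matrix obtained when all query-key logits are zero) are assumptions made for this sketch, not a transcription of the equation, and the input normalisation layer is omitted.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax; masked entries (-inf) become exactly 0.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sas_attention(x, w_q, w_k, alpha, beta, gamma):
    """Hypothetical sketch of a simplified (SAS-style) attention sub-block.

    x:                  (T, d) input sequence (assumed already normalised)
    w_q, w_k:           (H, d, d_k) per-head query and key weights
    alpha, beta, gamma: (H,) per-head scalar gains
    """
    T, d = x.shape
    H, _, d_k = w_q.shape
    assert d % H == 0, "d must be divisible by the number of heads"

    # Additive causal mask: 0 on and below the diagonal, -inf above it.
    mask = np.where(np.tril(np.ones((T, T), dtype=bool)), 0.0, -np.inf)
    # Assumed centring matrix C: attention with all-zero logits,
    # i.e. uniform attention over the current and previous positions.
    centre = softmax(mask)

    heads = []
    # Each head acts on its own column block of x (identity values).
    for h, x_h in enumerate(np.split(x, H, axis=1)):
        logits = (x @ w_q[h]) @ (x @ w_k[h]).T / np.sqrt(d_k) + mask
        attn = softmax(logits)
        shaped = alpha[h] * np.eye(T) + beta[h] * attn - gamma[h] * centre
        heads.append(shaped @ x_h)

    # Concatenate heads; no output projection is applied.
    return np.concatenate(heads, axis=1)
```

Under these assumptions, setting the query weights to zero with `beta == gamma` makes the attention term cancel against `C`, so each head reduces to `alpha` times the identity and the sub-block passes its input through unchanged when `alpha == 1`.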


:::info This paper is available on arXiv under the CC 4.0 license.

:::




Auto Encoder: How to Ignore the Signal Noise | Sciencx (2024-06-19T14:00:19+00:00) Simplifying Transformer Blocks: Block Layouts. Retrieved from https://www.scien.cx/2024/06/19/simplifying-transformer-blocks-block-layouts/
