Google Replaces BERT Self-Attention with Fourier Transforms: 92% of BERT's Accuracy, 7 Times Faster on GPUs


Transformer architectures have dominated the natural language processing (NLP) field since their introduction in 2017. A chief limitation to applying transformers is the heavy computational overhead of their key component: a self-attention mechanism whose cost scales quadratically with sequence length. New research from a Google team proposes replacing the self-attention sublayers in BERT with simple, unparameterized Fourier transforms, which the researchers report retains 92 percent of BERT's accuracy on the GLUE benchmark while training roughly seven times faster on GPUs.
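The teaser cuts off before the mechanics, but the underlying paper, "FNet: Mixing Tokens with Fourier Transforms" (Lee-Thorp et al., 2021), describes the replacement as an unparameterized 2D discrete Fourier transform applied across the sequence and hidden dimensions, keeping only the real part of the result. Below is a minimal NumPy sketch of that token-mixing step; the function name `fourier_mixing` and the toy shapes are illustrative, not taken from the paper or its code.

```python
import numpy as np

def fourier_mixing(x: np.ndarray) -> np.ndarray:
    """FNet-style token mixing: a 2D DFT over the sequence and hidden
    dimensions, keeping only the real part. No learned parameters."""
    # x has shape (seq_len, hidden_dim); np.fft.fft2 transforms the
    # last two axes, i.e. along both the sequence and hidden dimensions.
    return np.fft.fft2(x).real

# Toy usage: mix a sequence of 8 tokens, each with 16 hidden units.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((8, 16))
mixed = fourier_mixing(tokens)
print(mixed.shape)  # (8, 16) -- same shape as the attention output it replaces
```

Because the FFT runs in O(n log n) time in the sequence length and carries no learned weights, this sublayer sidesteps the quadratic cost of self-attention noted above.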