Fast Transformer Decoding: One Write-Head is All You Need (Noam Shazeer). https://arxiv.org/abs/1911.02150
Introduces multi-query attention: the query heads keep separate projections, but all heads share a single key head and a single value head, which shrinks the key/value memory traffic that dominates incremental decoding.
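A minimal sketch of the idea in numpy, loosely following the einsum-style pseudocode the paper uses (full-sequence, no masking or incremental cache; the dimension letters n, d, h, k, v mirror the paper's notation, while the function name and test sizes here are hypothetical):

```python
import numpy as np

def multi_query_attention(X, P_q, P_k, P_v, P_o):
    """Multi-query attention over a full sequence.

    X:   [n, d]      input sequence
    P_q: [h, d, k]   per-head query projections
    P_k: [d, k]      single shared key projection (the "one write-head")
    P_v: [d, v]      single shared value projection
    P_o: [h, v, d]   per-head output projections
    """
    Q = np.einsum("nd,hdk->hnk", X, P_q)        # queries: one set per head
    K = np.einsum("nd,dk->nk", X, P_k)          # keys: shared across all heads
    V = np.einsum("nd,dv->nv", X, P_v)          # values: shared across all heads
    logits = np.einsum("hnk,mk->hnm", Q, K) / np.sqrt(Q.shape[-1])
    weights = np.exp(logits - logits.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)   # softmax over source positions
    O = np.einsum("hnm,mv->hnv", weights, V)    # per-head weighted values
    return np.einsum("hnv,hvd->nd", O, P_o)     # combine heads into output

# tiny smoke test with made-up dimensions
n, d, h, k, v = 5, 16, 4, 8, 8
rng = np.random.default_rng(0)
Y = multi_query_attention(
    rng.normal(size=(n, d)),
    rng.normal(size=(h, d, k)),
    rng.normal(size=(d, k)),
    rng.normal(size=(d, v)),
    rng.normal(size=(h, v, d)),
)
assert Y.shape == (n, d)
```

The payoff shows up at decode time: the cached K and V are [m, k] and [m, v] rather than [h, m, k] and [h, m, v] as in standard multi-head attention, cutting the per-step memory bandwidth by roughly a factor of h.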