[PyTorch] Zero-initialize learnable softmax_offset in DotProductAttention #2694
Open
fjosw wants to merge 1 commit into NVIDIA:main from