Hiroyuki Kaneko


2024

Understanding How Positional Encodings Work in Transformer Model
Taro Miyazaki | Hideya Mino | Hiroyuki Kaneko
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Transformer models are used both for general tasks, such as pre-trained language models, and for specific tasks, including machine translation. Such models rely mainly on positional encodings (PEs) to handle the sequential order of input vectors. PEs come in several variations, such as absolute and relative, and several studies have reported the superiority of relative PEs. In this paper, we analyze in which parts of a transformer model PEs work and how absolute and relative PEs differ in their characteristics, through a series of experiments. Experimental results indicate that PEs work in both the self- and cross-attention blocks of a transformer model, and that PEs should be added only to the query and key of an attention mechanism, not to the value. We also found that applying two PEs in combination, a relative PE in the self-attention block and an absolute PE in the cross-attention block, can improve translation quality.
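The abstract's central finding, that position information should enter attention through the query and key but not the value, can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy (single head, no learned projections, standard sinusoidal absolute PE), not the paper's actual implementation; the function names are hypothetical.

```python
import numpy as np

def sinusoidal_pe(seq_len, d_model):
    """Standard sinusoidal absolute positional encoding."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])
    pe[:, 1::2] = np.cos(angles[:, 1::2])
    return pe

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pe_qk_only(x):
    """Scaled dot-product self-attention where the PE is added only to
    the query and key inputs; the value stays position-free, following
    the paper's reported finding (toy setting, no weight matrices)."""
    seq_len, d = x.shape
    pe = sinusoidal_pe(seq_len, d)
    q = x + pe   # query sees position
    k = x + pe   # key sees position
    v = x        # value does NOT receive the PE
    scores = q @ k.T / np.sqrt(d)
    return softmax(scores) @ v
```

Under this scheme the attention weights depend on positions, while the mixed representations remain a convex combination of the position-free values.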

Sign Language Translation with Gloss Pair Encoding
Taro Miyazaki | Sihan Tan | Tsubasa Uchida | Hiroyuki Kaneko
Proceedings of the LREC-COLING 2024 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources

HamNoSys-based Motion Editing Method for Sign Language
Tsubasa Uchida | Taro Miyazaki | Hiroyuki Kaneko
Proceedings of the LREC-COLING 2024 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources