Sho Hoshino


2024

pdf bib
A Single Linear Layer Yields Task-Adapted Low-Rank Matrices
Hwichan Kim | Shota Sasaki | Sho Hoshino | Ukyo Honda
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Low-Rank Adaptation (LoRA) is a widely used Parameter-Efficient Fine-Tuning (PEFT) method that updates an initial weight matrix W0 with a delta matrix 𝛥 W consisted by two low-rank matrices A and B. A previous study suggested that there is correlation between W0 and 𝛥 W. In this study, we aim to delve deeper into relationships between W0 and low-rank matrices A and B to further comprehend the behavior of LoRA. In particular, we analyze a conversion matrix that transform W0 into low-rank matrices, which encapsulates information about the relationships. Our analysis reveals that the conversion matrices are similar across each layer. Inspired by these findings, we hypothesize that a single linear layer, which takes each layer’s W0 as input, can yield task-adapted low-rank matrices. To confirm this hypothesis, we devise a method named Conditionally Parameterized LoRA (CondLoRA) that updates initial weight matrices with low-rank matrices derived from a single linear layer. Our empirical results show that CondLoRA maintains a performance on par with LoRA, despite the fact that the trainable parameters of CondLoRA are fewer than those of LoRA. Therefore, we conclude that “a single linear layer yields task-adapted low-rank matrices.” The code used in our experiments is available at https://github.com/CyberAgentAILab/CondLoRA.

pdf bib
Cross-lingual Transfer or Machine Translation? On Data Augmentation for Monolingual Semantic Textual Similarity
Sho Hoshino | Akihiko Kato | Soichiro Murakami | Peinan Zhang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Learning better sentence embeddings leads to improved performance for natural language understanding tasks including semantic textual similarity (STS) and natural language inference (NLI). As prior studies leverage large-scale labeled NLI datasets for fine-tuning masked language models to yield sentence embeddings, task performance for languages other than English is often left behind. In this study, we directly compared two data augmentation techniques as potential solutions for monolingual STS: - (a): _cross-lingual transfer_ that exploits English resources alone as training data to yield non-English sentence embeddings as zero-shot inference, and - (b) _machine translation_ that coverts English data into pseudo non-English training data in advance. In our experiments on monolingual STS in Japanese and Korean, we find that the two data techniques yield performance on par. In addition, we find a superiority of Wikipedia domain over NLI domain as unlabeled training data for these languages. Combining our findings, we further demonstrate that the cross-lingual transfer of Wikipedia data exhibits improved performance.

2022

pdf bib
Aspect-based Analysis of Advertising Appeals for Search Engine Advertising
Soichiro Murakami | Peinan Zhang | Sho Hoshino | Hidetaka Kamigaito | Hiroya Takamura | Manabu Okumura
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track

Writing an ad text that attracts people and persuades them to click or act is essential for the success of search engine advertising. Therefore, ad creators must consider various aspects of advertising appeals (A3) such as the price, product features, and quality. However, products and services exhibit unique effective A3 for different industries. In this work, we focus on exploring the effective A3 for different industries with the aim of assisting the ad creation process. To this end, we created a dataset of advertising appeals and used an existing model that detects various aspects for ad texts. Our experiments demonstrated %through correlation analysis that different industries have their own effective A3 and that the identification of the A3 contributes to the estimation of advertising performance.

2015

pdf bib
Discriminative Preordering Meets Kendall’s 𝜏 Maximization
Sho Hoshino | Yusuke Miyao | Katsuhito Sudoh | Katsuhiko Hayashi | Masaaki Nagata
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

pdf bib
Japanese to English Machine Translation using Preordering and Compositional Distributed Semantics
Sho Hoshino | Hubert Soyer | Yusuke Miyao | Akiko Aizawa
Proceedings of the 1st Workshop on Asian Translation (WAT2014)

2013

pdf bib
Two-Stage Pre-ordering for Japanese-to-English Statistical Machine Translation
Sho Hoshino | Yusuke Miyao | Katsuhito Sudoh | Masaaki Nagata
Proceedings of the Sixth International Joint Conference on Natural Language Processing