Siew Yeng Chow


2024

This Word Mean What: Constructing a Singlish Dictionary with ChatGPT
Siew Yeng Chow | Chang-Uk Shin | Francis Bond
Proceedings of the 2nd Workshop on Resources and Technologies for Indigenous, Endangered and Lesser-resourced Languages in Eurasia (EURALI) @ LREC-COLING 2024

Despite the magnitude of recent progress in natural language processing and multilingual language modeling research, the vast majority of NLP research is focused on English and other major languages. This is because recent NLP research is mainly data-driven, and there is more data for resource-rich languages. In particular, Large Language Models (LLMs) make use of large unlabeled datasets, a resource that many languages do not have. In this project, we built a new, open-source dictionary of Singlish, a contact variety that contains features from English and other local languages and is syntactically, phonologically and lexically distinct from Standard English (Tan, 2010). First, a list of Singlish words was extracted from various online sources. Then, using an open ChatGPT LLM API, a description of each word, including the definition, part of speech, pronunciation and examples, was produced. These descriptions were then refined through post-processing carried out by a native speaker. The dictionary currently has 1,783 entries and is published under the CC-BY-SA license. The project was carried out with the intention of facilitating future Singlish research and other applications, as the accumulation and management of language resources will be of great help in promoting research on the language in the future.

2022

Singlish Where Got Rules One? Constructing a Computational Grammar for Singlish
Siew Yeng Chow | Francis Bond
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Singlish is a variety of English spoken in Singapore. In this paper, we share some of its grammar features and how they are implemented in the construction of a computational grammar of Singlish as a branch of English grammar. New rules were created, and existing ones from the standard English grammar, the English Resource Grammar (ERG), were changed in this branch to cater to how Singlish works. In addition, Singlish lexicon was added to the grammar together with some new lexical types. We used Head-driven Phrase Structure Grammar (HPSG) as the framework for this project of creating a working computational grammar. As part of building the language resource, we also collected and formatted some data from the internet for a Singlish test suite. Finally, the computational grammar was tested against a set of gold standard trees and compared with the standard English grammar to find out how well the grammar fares in analysing Singlish.