Semantic Preference of the Word Okaa-san and Mama in Tsukuba Web Corpus: A Corpus Linguistic Analysis



INTRODUCTION
The development of a language can be shaped by the interplay between different linguistic systems. One consequence of this interaction is the appearance of lexical borrowings from foreign languages. As stated by recipient language. Throughout the 20th century, Japanese assimilated many linguistic elements, primarily from major European languages. According to Kindaiichi (1988), the term commonly used for these borrowed words is 外来語 (gairaigo), which translates literally as 'language of foreign origin'. In present-day usage, however, the term refers to loanwords written in katakana, particularly those originating from Europe and America.
Following the Meiji Restoration, Japan underwent a transformative period in its international relations, which brought notable shifts in the social and cultural fabric of Japanese society. The introduction of English terminology into Japanese was also significantly influenced by the occupation of Japan following World War II by English-speaking nations, particularly the United States. English is often linked to concepts of modernity, technology, and global influence, which has led to the assimilation of numerous English words into Japanese. According to Shibatani (1990), English is the predominant source of loanwords in Japanese, accounting for around 80.8% of the vocabulary influx. This establishes English as the primary source of lexical enrichment in Japanese. According to Rebuck (2002), English loanwords in Japanese serve three distinct functions. The first is filling lexical gaps, as with the term デート (deeto, 'date'), which emerged during the American occupation and signifies a more progressive view of interpersonal relationships in a culture where お見合い (omiai) is prevalent. The second function is providing special effects, as shown by the term コーチングスタッフ (koochingu sutaffu), which is used in Japanese fitness centers to denote a role requiring specialized expertise. This term is essentially synonymous with 指導員 (shidouin), meaning a counselor or instructor. The third function is euphemism. In certain contexts, the term シングルマザー (shinguru mazaa, 'single mother') is used as a euphemistic alternative to 未婚の母 (mikon no haha, 'unmarried mother'), which carries a more negative connotation.
The use of loanwords in Japanese society is also noteworthy in kinship terms. In addition to the terms "お父さん" (otou-san) and "お母さん" (okaa-san) for father and mother, respectively, people in Japan also use the loanwords, or gairaigo, "パパ" (papa) and "ママ" (mama). According to Passin (1980), there was an observed increase in the usage of パパ (papa) and ママ (mama) throughout the 1860s. These terms are perceived to carry less historical weight than "お父さん" (otou-san, 'father') and "お母さん" (okaa-san, 'mother'), which have a stricter and more disciplined connotation, and they often evoke a sense of intimacy or friendliness. In Passin's view, パパ (papa) and ママ (mama) bridge the lexical gap between the notion of parents in American culture and the notion of parents in Japanese culture.
To investigate the emergence of usage differences, the authors employ a corpus-based analysis to examine the semantic preferences of the words "お母さん" (okaa-san, 'mother') and "ママ" (mama, 'mama'). This approach aligns with the view of Biber et al. (1998), who assert that corpus linguistics can analyze patterns of language use in authentic texts and use them as a foundation for analysis. This study examines the collocational patterns of "お母さん" (okaa-san) and "ママ" (mama) in relation to their semantic preferences. The objective is to gain a deeper understanding of the semantic connections between these words and their collocates within the corpus.
According to Biber and Kurjian (2007), linguists have only recently started using the internet as a resource for more targeted research. Specifically, they have begun using the internet as a corpus to investigate linguistic diversity and usage. The internet offers a clear benefit both to people with limited knowledge of linguistics and to experts in the field: an extensive body of linguistic information and data is available online and can be accessed by anyone with a computer connection. According to Knight (2015), linguists who specialize in corpus research have the competence needed to contribute effectively to digital discourse research, because they are able to construct, analyze, and describe language usage within extensive corpora of digital conversation.
In a prior investigation, Sugeha and Nurfarida (2016) compared the collocations of the terms "ibu" and "ibu" in an Indonesian language corpus. The findings revealed that, in terms of collocation, the word "mother" conveys a positive connotation and is closely associated with the concept of family. However, it also tends to be linked with unfavorable circumstances, such as death. One example sentence from the word sketch table in that study, showing the closeness of a mother to her family, is "...in the family. The relationship between father, mother and child must be close and strong. Don't …." Meanwhile, the word mother has a meaning related to love and tends to relate to religious matters such as God and prayer. One example is the sentence "…crying profusely. His heart melted: the Mother of God heard the cry of his heart and….".
Moreover, a study conducted by Farida et al. (2023) examined the usage of female nouns in digital mass media. The findings revealed a wide range of female nouns in mass media, each with a distinct semantic prosody. The term "mother" has collocates that are nouns, adjectives, and verbs, typically associated with women in their reproductive capacity. These include terms such as giving birth, childbirth, breastfeeding, and pregnancy, which are closely linked to the concept of motherhood. Furthermore, the term "mother" has a favorable prosody because it is also used as a form of address or an expression of deference. The term "Putri" likewise has a favorable prosody, although it tends to appear in a narrower range of contexts because of its association with females, noble lineage, and gender classification. Collocates that commonly occur with the term "princess" include "mother," "grandmother," "palace," "kingdom," and "youngest." This earlier research helped the writer understand how Indonesian society discusses the role of women and motherhood, and allowed for a comparison with the portrayal of motherhood in Japanese society.
This study investigates two Japanese words, お母さん (okaa-san 'mother') and ママ (mama 'mama'), which both refer to female parents. The research method is similar to that of the prior research by Sugeha and Nurfarida. The corpus data was taken from the Tsukuba Web Corpus. The purpose of this study is to examine the prevalence of collocates associated with お母さん (okaa-san) and ママ (mama), and to analyze the patterns of discussion surrounding these terms on

METHOD
To compare the collocations of the words お母さん (okaa-san) and ママ (mama), this research uses a corpus linguistics approach. Corpus linguistics rests on the principle that every event that occurs repeatedly is considered to have high significance (Djajasudarma & Citraresmana, 2016). In corpus linguistics, the meaning of a word can arise from its association with other words that often appear together with it. This relationship is defined as collocation by Stubbs (2002).
In this study, the collocates that appear together with the words お母さん (okaa-san) and ママ (mama) are examined to see the pattern of co-occurrence between these words and other words that share similar semantic features. Semantic preferences can help identify the evaluative or attitudinal associations that お母さん (okaa-san) and ママ (mama) have with their collocates in the corpus.
The collocates of the words お母さん (okaa-san) and ママ (mama) are examined using a corpus processing site called NLT (NINJAL-LWP for TWC). NINJAL-LWP (NINJAL-LagoWordProfiler) was developed to process data from the Tsukuba Web Corpus (TWC), which was developed by NINJAL and Tsukuba University (NINJAL & Lago Institute of Language, 2012). The TWC has over 1.138 billion tokens derived from a collection of URLs gathered via the Yahoo! search site API in Japan. The data was obtained from 2013 to 2021. Using TWC makes it possible to observe directly how the words お母さん (okaa-san) and ママ (mama) collocate on the internet. To obtain the significant collocates of the two words, this study replicates the research by Yuliawati (2016) and uses a statistical calculation of the mutual information (MI) score. The MI score is a corpus statistic used to measure the closeness of the relationship between words (Collins, 2019). The thresholds in this study are an MI score of MI ≥ 3 and a frequency of occurrence of f ≥ 5.
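The thresholding step described above can be sketched in a few lines. The sketch below assumes the common textbook form of the MI score, log2(observed / expected), with the expected frequency taken as f(node) × f(collocate) / N; the source does not state which variant NLT computes, so this formula is an assumption. The function names and all counts are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math


def mi_score(pair_freq, node_freq, collocate_freq, corpus_size):
    """Mutual information: log2 of observed over expected co-occurrence.

    Assumes expected = f(node) * f(collocate) / N (a common textbook form).
    """
    expected = node_freq * collocate_freq / corpus_size
    return math.log2(pair_freq / expected)


def significant_collocates(candidates, node_freq, corpus_size,
                           min_mi=3.0, min_freq=5):
    """Keep collocates whose co-occurrence frequency and MI meet the thresholds.

    `candidates` maps a collocate to (pair_freq, collocate_freq).
    """
    kept = {}
    for word, (pair_freq, collocate_freq) in candidates.items():
        if pair_freq < min_freq:
            continue  # fails f >= 5
        score = mi_score(pair_freq, node_freq, collocate_freq, corpus_size)
        if score >= min_mi:  # passes MI >= 3
            kept[word] = score
    return kept


# Hypothetical counts, not taken from the Tsukuba Web Corpus.
candidates = {
    "collocate_a": (8, 32),    # frequent enough and strongly associated
    "collocate_b": (4, 8),     # below the frequency threshold
    "collocate_c": (16, 512),  # frequent but weakly associated
}
result = significant_collocates(candidates, node_freq=16, corpus_size=1024)
```

With these toy numbers only `collocate_a` survives both filters, which illustrates why the two thresholds are applied together: the frequency cut-off removes rare pairs, and the MI cut-off removes pairs that co-occur mainly because the collocate is common.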
After the significant verb and adjective collocates of お母さん (okaa-san) and ママ (mama) were obtained, the 20 significant collocates with the highest MI scores were selected. They were then categorized using software for the automatic annotation of semantic fields in corpora, the UCREL Semantic Analysis System (USAS) developed by Rayson et al. (2004). These categories show how the words お母さん (okaa-san) and ママ (mama) are discussed.
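The selection and grouping step can be illustrated with a short sketch. It assumes the MI scores and the semantic tags are already available as plain dictionaries; in the study the tags come from USAS, whereas here the tag labels (`CAT_1`, `CAT_2`), the collocate names, and the function name are hypothetical placeholders rather than real USAS output.

```python
from collections import defaultdict


def top_collocates_by_category(mi_scores, categories, top_n=20):
    """Take the top_n collocates by MI score and group them by semantic tag."""
    ranked = sorted(mi_scores.items(), key=lambda item: item[1], reverse=True)
    grouped = defaultdict(list)
    for word, _score in ranked[:top_n]:
        # Collocates without a tag are collected under a fallback label.
        grouped[categories.get(word, "UNTAGGED")].append(word)
    return dict(grouped)


# Hypothetical scores and tags, for illustration only.
mi_scores = {"verb_a": 7.2, "verb_b": 6.1, "verb_c": 5.4, "verb_d": 3.3}
tags = {"verb_a": "CAT_1", "verb_b": "CAT_2", "verb_c": "CAT_1"}
groups = top_collocates_by_category(mi_scores, tags, top_n=3)
```

Here `verb_d` is dropped because it falls outside the top three, and the remaining collocates are listed under their categories in descending order of MI score.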
This study uses a mixed method that combines quantitative and qualitative analysis, following the sequential explanatory model (Sugiyono, 2016: 416). In the first stage, a quantitative method was used: the closeness of the collocational relationship was calculated with the MI score statistic on NLT. In the second stage, a qualitative method was used: the significant collocates obtained were classified according to the USAS semantic categories to explore how the words お母さん (okaa-san) and ママ (mama) are discussed by Japanese people on the internet.

RESULTS AND DISCUSSION
Based on data obtained using the NINJAL-LWP corpus processing site, the word お母さん (okaa-san 'mother') appears 55,670 times: 53,381 times (96%) in the written form お母さん and 2,289 times (4%) in the written form おかあさん. Meanwhile, the word ママ (mama 'mama') appears 39,824 times, always in the written form ママ (100%), because no other written form of the word occurs in the Tsukuba Web Corpus. Based on these data, it can be concluded that the word お母さん (okaa-san) is used more often than the word ママ (mama) by Japanese people on the internet, as represented in this corpus.
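As a quick check on the figures above, the shares of the two written forms can be recomputed from the reported counts. The snippet is only an arithmetic illustration; the variable names are arbitrary, and the counts are the ones reported in this section.

```python
# Occurrences of each written form of okaa-san, as reported above.
counts = {"お母さん": 53381, "おかあさん": 2289}

total = sum(counts.values())

# Share of each written form, rounded to the nearest whole percent.
shares = {form: round(100 * n / total) for form, n in counts.items()}
```

The two counts sum to the reported total of 55,670, and the rounded shares match the 96% and 4% given in the text.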
Of the 55,670 occurrences, the word お母さん (okaa-san) co-occurs with verbs, or 動詞 (doushi), 9,035 times and has 1,208 collocates. The collocates were then limited by the thresholds of MI ≥ 3 and frequency ≥ 5, so that 166 collocates were obtained. Furthermore, the 20 collocates with the highest MI scores were grouped based on the USAS categories from UCREL. The following is a grouping of the significant verb collocates of the word お母さん (okaa-san) in USAS.
Meanwhile, the word ママ (mama) co-occurs with verbs, or 動詞 (doushi), 3,703 times and has 741 collocates. After restriction by the thresholds of MI ≥ 3 and frequency ≥ 5, 99 significant collocates were obtained. Furthermore, the 20 significant collocates with the highest MI scores were grouped by USAS category. Table 4 is a grouping of the significant verb collocates of the word ママ (mama) in USAS.
Furthermore, the word ママ (mama) co-occurs with i-adjectives, or 形容詞 (keiyoushi), 322 times, with 71 collocates.

CONCLUSION
In communication on the internet, it was found that the word お母さん (okaa-san) is discussed more in terms of the close relationship between mother and child and appears with a traditional and conservative attitude, while the word ママ (mama) is discussed more in terms of emotion and is associated with acting more freely. The difference in these tendencies proves that

Table 3. Categories of significant na-adjective collocates of the word okaa-san