Submission 96
Gender Bias in LLMs: Comparative Study of Global and Local Models in East Asia
SP01-05
Presented by: Jiwoo Choi
Jiwoo Choi, Seonwoo Ahn, Tongxin Zhang, Seohyon Jung
KAIST
Large Language Models (LLMs) produce texts that reflect and reconstruct sociocultural norms. In particular, their gender representation patterns can directly reveal societal values, making them a significant subject of analysis across disciplines. However, existing studies primarily operate within an English-versus-non-English framework or an English-centric perspective, leaving room for more nuanced exploration of differences among closely related Northeast Asian cultural spheres.

Past research has explored LLM gender bias by introducing explicit and implicit bias frameworks or by conducting multilingual comparisons. However, these studies have three key limitations. First, they lack analysis of how the specific features of different languages reveal bias in distinct ways: choices among Chinese gendered pronouns, gender markers in Korean, and Japanese honorifics each expose gender bias differently. Second, existing research tends to treat non-English-speaking regions as a single, uniform group, missing variations in gender norms and communication strategies across languages and cultures. Third, evaluations have focused primarily on global models such as GPT or Gemini, largely ignoring local models that better reflect cultural expectations and the strategies of their developers. Models such as China's Qwen, Japan's ELYZA, and Korea's HyperCLOVA may show gender biases distinct from those of global models: whereas global models are trained primarily on English data and Western-centric values, local models reflect native-language data, policies, and regulations, and can therefore show how social gender norms and political and cultural contexts are incorporated into a model.

This study aims to bridge this gap by focusing on the three major Northeast Asian languages, Chinese, Japanese, and Korean, with English as the reference baseline. The central research question is how global LLMs and region-specific LLMs differ in their representation of gender when given prompts in the same language. Two subsidiary questions support this main question. First, we examine how explicit bias and implicit bias manifest in each of the three languages. Second, even when models produce identical outputs, we analyze how they justify or explain those outputs, that is, their reasoning strategies, to identify differences across languages and models.

To explore these research questions, this study adopts the distinction between explicit bias and implicit bias as its theoretical framework for analysis. Explicit bias refers to instances where a model directly selects a specific gender or overtly links occupation with gender, while implicit bias denotes the tendency for a model to subtly construct different roles or characteristics for characters based on gender within narratives or descriptions. Building upon this framework, this study adds a comparison between global and local models to comprehensively explore how bias manifests at the intersection of linguistic structures, cultural norms, and the policy contexts reflected by the models.

This study compares global models with region-specific models across three languages: Chinese, Japanese, and Korean. Global models include GPT-5, Gemini, and Claude, while region-specific models feature China's Qwen, Japan's ELYZA, and Korea's HyperCLOVA. This selection provides a solid comparative basis, since each model reflects different social norms and policy directions. Data collection uses an organized set of prompts, with iterative sampling for each prompt. The prompts cover a wide range of contexts, such as professions, story scenarios, and everyday discussions, and are designed collaboratively with native speakers of each language, or with speakers able to read and write the short sentences used in each prompt. A back-translation process minimizes semantic distortion between languages. The lack of Japanese native speakers is addressed by revising the drafts after external consultation, with cross-validation performed through the same back-translation process used for Korean and Chinese. This approach lets us observe how models select gender, construct narratives, and explain specific outputs.
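The iterative-sampling step of data collection could be organized as in the following sketch. The `query_model` stub and the prompt identifiers are illustrative assumptions, not the study's actual harness; in practice each model would be called through its own API.

```python
import random

def query_model(model: str, prompt: str) -> str:
    """Hypothetical stub standing in for a real model API call."""
    # For illustration only: returns one of two canned Korean outputs.
    return random.choice(["그는 의사입니다.", "그녀는 의사입니다."])

def collect_samples(models, prompts, n_samples=20):
    """Gather n_samples outputs per (model, prompt) pair for later coding."""
    records = []
    for model in models:
        for prompt_id, prompt in prompts.items():
            for i in range(n_samples):
                records.append({
                    "model": model,
                    "prompt_id": prompt_id,
                    "sample": i,
                    "output": query_model(model, prompt),
                })
    return records

# Usage: two models, one occupation prompt, three samples each -> 6 records.
data = collect_samples(
    ["GPT-5", "HyperCLOVA"],
    {"occ_ko_01": "의사에 대한 짧은 이야기를 써 주세요."},
    n_samples=3,
)
```

Keeping one record per sample, rather than aggregating on the fly, preserves the raw outputs needed for the later discourse-level and justification-level coding.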

The analysis consists of three stages. First, at the morphological and lexical level, it explores explicit bias by analyzing whether the model selects specific gender markers or associates occupations with particular genders. Second, at the discourse and narrative level, it identifies implicit bias by examining the roles assigned to characters within narratives and whether specific genders are repeatedly placed in particular contexts. Third, at the justification and inference level, it examines how the reasons or reasoning processes offered for the same output differ, revealing the gender norms inherent in each model. As a design principle, the experiments reflect each language's distinctive gender markers and discourse conventions. For example, in Chinese, the choice between 他/她 and the gender-neutral expression ‘TA’ is a key point of analysis; in Korean, gender markers preceding job titles (여교사 vs. 교사) and gendered expressions based on honorifics; and in Japanese, the honorifics and titles used when referring to occupations (〜さん, 君, 〜ちゃん, etc.). Through this design, the study aims to empirically identify the specific manifestations of gender bias formed at the intersection of linguistic structure, cultural norms, and model policies, going beyond a simple comparison of outputs.
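The morphological and lexical stage could be sketched as a simple marker-counting pass over model outputs. The marker inventories below are hypothetical placeholders; the study's actual lists would be curated by native speakers and cover far more items.

```python
from collections import Counter

# Hypothetical marker inventories per language (illustrative, not the
# study's actual instrument).
MARKERS = {
    "zh": {"male": ["他"], "female": ["她"], "neutral": ["TA"]},
    "ko": {"marked_female": ["여교사"], "unmarked": ["교사"]},
    "ja": {"formal": ["さん"], "masculine": ["君"], "diminutive": ["ちゃん"]},
}

def count_markers(text: str, lang: str) -> Counter:
    """Count occurrences of each marker category in one model output."""
    counts = Counter()
    for category, tokens in MARKERS.get(lang, {}).items():
        for token in tokens:
            counts[category] += text.count(token)
    return counts

# Usage: a Chinese output using 她 twice and 他 once.
print(count_markers("她是医生。他是护士。她很忙。", "zh"))
```

Aggregating such counts across the iterative samples for each prompt would yield the explicit-bias distributions that the second and third stages then contextualize.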

This study empirically demonstrates that gender bias in LLMs is not homogeneous but is formed through the intersection of language structure, cultural norms, and model alignment policies. The results contribute to AI ethics discourse by extending gender bias research beyond an English-centric perspective that treats non-English-speaking regions as a single bloc, elevating it to the level of cultural text analysis. Further research could deepen the discussion by comparing additional regional languages and by examining the reproduction of LLM bias through an intersectional lens that goes beyond a single gender axis.