09:30 - 11:10
PS6
Room: Club C
Panel Session 6
Charles 1, William Marx - GPT-2 Generates Biased Texts
Sebastian Block - Cross-Domain Classification Of Political Texts: Introducing A Lean And Versatile Two-Step Workflow
Alona Dolinsky - Multilingualism in Computational Text Analysis Methods—Evidence From A Pre-Registered Survey Experiment.
Hauke Licht - Going cross-lingual: A guide to multilingual text analysis
Multilingualism in Computational Text Analysis Methods—Evidence From A Pre-Registered Survey Experiment.
PS6-3
Presented by: Alona Dolinsky, Martijn Schoonvelde, Mariken A.C.G. van der Velden
Alona Dolinsky 1, Martijn Schoonvelde 1, Mariken A.C.G. van der Velden 2
1 University College Dublin
2 Vrije Universiteit Amsterdam
With social and political science an increasingly international enterprise (Wilson & Knutsen, 2020), a growing number of studies published in leading journals investigate (political) phenomena occurring in countries where English is not the main language. As with other types of comparative research, it is no surprise that studies using computational text analysis methods (CTAM) are increasingly conducted in an international, cross-lingually comparative fashion (e.g., Lind et al., 2019). However, this rapid internationalization is hardly reflected in the development of CTAM. Owing to the considerable head start of English-language computational development (due both to early U.S. dominance in computer technology and to the role of English as the scientific lingua franca), many of the resources, tools, and training opportunities required for cutting-edge CTAM development are available only, or in far better quality, for English-language text analysis. The present study examines the relationship between language and the use of CTAM, drawing on a novel dataset derived from a pre-registered survey of 421 social science researchers. Specifically, we examine how the language of the analysed texts influences researchers’ validity concerns about the use of CTAM. We also examine which considerations deter researchers from studying multilingual text corpora. Our results will inform debates on strengthening the link between CTAM tool development and social science research.