TWON researchers Damian Trilling and Sjoerd Stolwijk presented their research on bias in generative AI text annotations at IC2S2, the International Conference on Computational Social Science, in Norrköping, Sweden. Their poster, titled “Are Generative AI text annotations systematically biased?”, asked whether generative large language models produce biased annotations when processing texts that include identity terms.
Their findings show that generative large language models can perform well in text annotation tasks, but that their annotations are not neutral across all contexts. The models did not treat all identity terms alike: annotation outcomes depended on the specific identity group mentioned in the text. This identity-based bias was systematic and predictable, although its patterns differed depending on the model used.
These results raise important methodological concerns for researchers using generative AI tools for text annotation. If different researchers rely on different models, undetected model-specific biases may lead to contrasting results. At the same time, if many researchers use the same model, their findings may be biased in similar ways. The poster therefore highlights the need for careful evaluation, transparency and critical reflection when applying generative AI in computational social science.
