EACL 2026 in Rabat: New Publications on Simulating Social Media Users

At this year’s European Chapter of the Association for Computational Linguistics (EACL) in Rabat, Morocco, Simon Münker, Nils Schwager and Achim Rettinger presented two new TWON papers on the simulation of social media users with Large Language Models (LLMs).

Their work contributes to a growing research field at the intersection of Natural Language Processing, Computational Social Science and social media analysis. Both publications examine how well LLMs can reproduce communication patterns on social networks and under which conditions such simulations can be considered empirically realistic.

The first paper, “Don’t Trust Generative Agents to Mimic Communication on Social Networks Unless You Benchmarked their Empirical Realism”, formalizes the task of simulating social media users and evaluates it through a case study based on German and English Twitter data. The study shows that many current simulation approaches in Computational Social Science rely on comparatively simple methods whose scientific validity is difficult to justify without systematic benchmarking. The results also reveal a clear language bias: English communication patterns are significantly easier to simulate than German ones.

The second paper, “Towards Simulating Social Media Users with LLMs: Evaluating the Operational Validity of Conditioned Comment Prediction”, co-authored with Alistair Plum, builds on this work and extends the analysis to Luxembourgish. The findings indicate that Luxembourgish is even more challenging to simulate than German. This points to a broader issue for multilingual AI research: the smaller a language’s speaker base and the less data available for it, the more difficult it becomes to model realistic online behavior.

Together, the two publications underline a central challenge for the use of LLMs in social media research. If generative models are used to study online communication, their empirical realism must be carefully validated, especially across different languages and data contexts. The findings make an important contribution to more rigorous, multilingual and methodologically sound approaches in AI-supported Computational Social Science.