pfannschmidt

Out now: Our new demonstrator tool TWONderland

In the past weeks, our TWON researcher Fabio Sartori (KIT) and his colleagues worked on a new demonstrator tool to make the dynamics of Online Social Networks tangible for the broad public. The result is: TWONderland!

In our simulation TWONderland, users take on the job of lead designer of a new Online Social Network. In a playful and interactive way, they explore how, as platform designers, they influence interaction on the platform and how even the tiniest design choices can ripple out to shape behavior, sentiments and relationships between users – and potentially spark fragmentation and fuel polarization.

What makes this demonstrator unique is its step-by-step walkthrough of how Online Social Networks (OSNs) function. The user starts by assigning moods – from aggressive to calm – to fictional platform users. We then visualize how these fictional users are connected to each other on the platform, and how their moods adapt as they are confronted with each other’s posts. In TWONderland, every OSN user participates within a specific sentiment corridor, meaning that they will interact with and adapt to other users as long as their differences in sentiment are not too large. For instance, a very calm user would not immediately interact with somebody who is very aggressive. However, our demonstrator shows that the sentiment on a platform can still shift gradually in positive and negative directions. These network dynamics were modelled based on the Axelrod model (for further information and technical details, please refer to our Deliverable).
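To make the sentiment corridor idea concrete, here is a minimal sketch in Python. It is not the TWONderland implementation: the demonstrator is based on the Axelrod model, whereas this sketch swaps in a simpler bounded-confidence-style pairwise update to illustrate only the corridor rule. The mood scale from -1 (aggressive) to +1 (calm), the function name `update_moods`, and the parameters `corridor` and `rate` are all assumptions made for illustration.

```python
import random


def update_moods(moods, corridor=0.3, rate=0.25, steps=1000, seed=0):
    """Pairwise mood adaptation inside a sentiment corridor.

    moods: list of floats on an assumed scale from -1.0 (aggressive)
    to +1.0 (calm). Two randomly chosen users only influence each
    other if their moods differ by at most `corridor`.
    """
    rng = random.Random(seed)
    moods = list(moods)  # work on a copy, leave the input untouched
    for _ in range(steps):
        i, j = rng.sample(range(len(moods)), 2)
        diff = moods[j] - moods[i]
        if abs(diff) <= corridor:
            # Both users move a little towards each other.
            moods[i] += rate * diff
            moods[j] -= rate * diff
    return moods


if __name__ == "__main__":
    # Neighbouring moods lie inside the corridor, so the extremes can
    # still drift gradually even though they never interact directly.
    print(update_moods([-1.0, -0.8, -0.6, -0.4, -0.2, 0.0, 0.2, 0.4]))
```

In this toy version, a very calm and a very aggressive user never adapt to each other directly, yet a chain of users with neighbouring moods can still pull the overall sentiment in one direction over time.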

After getting an understanding of network dynamics, the user is asked to experiment with alternative platform mechanisms that determine which users (and their moods) influence their own fictional platform user. Depending on the ranking algorithm the user sets, posts with different moods – again, from aggressive to calm – become visible to their fictional character and in turn influence its mood. From this individual level, the demonstrator then moves on to visualizing bigger networks in which many users influence each other based on the chosen platform mechanics. To understand how users influence each other’s moods on OSNs, the user can run comparative simulations and explore how polarization is fueled or reduced through the ranking mechanics alone.
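The effect of a ranking mechanism on a single user can be sketched in a few lines of Python. This is a toy illustration, not the demonstrator's code: the two ranking rules, the mood scale from -1 (aggressive) to +1 (calm), and the names `rank_feed` and `exposed_mood` are hypothetical.

```python
def rank_feed(post_moods, key, k=3):
    """Return the k posts a user gets to see under a given ranking rule."""
    return sorted(post_moods, key=key, reverse=True)[:k]


def exposed_mood(viewer_mood, visible, rate=0.5):
    """Move the viewer's mood part of the way towards the average visible post."""
    if not visible:
        return viewer_mood
    average = sum(visible) / len(visible)
    return viewer_mood + rate * (average - viewer_mood)


# Illustrative post moods, from aggressive (-1) to calm (+1).
posts = [-0.9, -0.6, -0.1, 0.2, 0.5, 0.8]

calm_first = rank_feed(posts, key=lambda mood: mood)
aggressive_first = rank_feed(posts, key=lambda mood: -mood)

if __name__ == "__main__":
    print("calm-first feed:", calm_first, exposed_mood(0.0, calm_first))
    print("aggressive-first feed:", aggressive_first, exposed_mood(0.0, aggressive_first))
```

The same set of posts pushes a neutral user in opposite directions depending only on which posts the ranking rule surfaces, which is the comparison the demonstrator lets users run at network scale.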

New paper by TWON researcher Simon Münker: Fingerprinting LLMs through Survey Item Factor Correlation: A Case Study on Humor Style Questionnaire

We are proud to announce that our researcher Simon Münker published a new paper with the title: Fingerprinting LLMs through Survey Item Factor Correlation: A Case Study on Humor Style Questionnaire. It is published in the Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing and the results will be presented in Shanghai on 5 November.

LLMs increasingly engage with psychological instruments, yet how they represent constructs internally remains poorly understood. Simon Münker introduces a novel approach to “fingerprinting” LLMs through their factor correlation patterns on standardized psychological assessments, with the aim of deepening our understanding of how LLMs represent constructs. Using the Humor Style Questionnaire as a case study, he analyzes how six LLMs represent and correlate humor-related constructs compared with survey participants. His results show that the models exhibit little similarity to human response patterns. In contrast, subsamples of the human participants demonstrate remarkably high internal consistency. Exploratory graph analysis further confirms that no LLM successfully recovers the four constructs of the Humor Style Questionnaire. These findings suggest that despite advances in natural language capabilities, current LLMs represent psychological constructs in fundamentally different ways than humans, calling into question the validity of their use as human simulacra.
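As a rough illustration of what a correlation-based fingerprint could look like, the Python sketch below computes pairwise correlations between factor scores and compares two such patterns. This is one plausible reading of the idea, not the paper's actual pipeline; the function names and the toy scores are invented for illustration and are not data from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def fingerprint(scores):
    """Pairwise factor correlations in a fixed (alphabetical) order.

    scores: dict mapping a factor name to per-respondent scores.
    """
    names = sorted(scores)
    return [pearson(scores[a], scores[b]) for a, b in combinations(names, 2)]


def similarity(fp_a, fp_b):
    """Compare two fingerprints by correlating their entries."""
    return pearson(fp_a, fp_b)


# Toy scores for five respondents (illustrative only, not study data).
toy_scores = {
    "affiliative": [1, 2, 3, 4, 5],
    "self_enhancing": [2, 2, 3, 5, 5],
    "aggressive": [5, 3, 4, 2, 1],
    "self_defeating": [3, 1, 4, 2, 5],
}

if __name__ == "__main__":
    print(fingerprint(toy_scores))
```

Computing one fingerprint per human subsample and one per model would then allow the kind of comparison described above: subsamples that resemble each other closely, and models that do not resemble either.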

It’s a wrap: CitizenLab 2025 in Chemnitz

On 8 October, we hosted another CitizenLab in the Stadthallenpark in Chemnitz, where we got to speak with citizens about our research on Online Social Networks.

We presented our demonstrators MicroTWONY, MacroTWONY, and TWONderland to interested citizens and participants, and had inspiring conversations about the impact of Online Social Networks on society and democracy, as well as possibilities for regulation and ethical design. We are glad to see how many participants enjoyed experimenting with the demonstrators and exploring how digital dynamics become tangible!

In the evening, we joined an interesting event on memory culture in digital spaces at the NSU Documentation Center with TWON researcher Jonas Fegert, journalist Nhi Le and Susanne Siegert from the channel @keineerinnerungskultur, moderated by Benjamin Fischer. The discussion focused on the opportunities social networks offer for democratic education, especially for younger audiences, and on the limitations imposed by platform mechanisms that tend to amplify hate speech and misinformation.

A day full of dialogue, reflection, and future perspectives – thank you to everybody who was part of it, and we’re looking forward to the next CitizenLab!

New publication: Can we use automated approaches to measure the quality of online political discussion?

We’re proud to announce that our consortium members Sjoerd Stolwijk, Damian Trilling (both University of Amsterdam) and Simon Münker (Trier University) contributed to a freshly published paper on measuring the debate quality of online political discussions. The paper was released in the “Communication Methods and Measures” journal by Routledge and is open access.

Our researchers review how debate quality has been measured in communication science, and systematically compare 50 automated metrics against numerous manually coded comments. Based on their experiments, they were able to give clear recommendations for how to (not) measure debate quality in terms of interactivity, diversity, rationality, and (in)civility according to Habermas.

Their results show that transformer models and generative AI (like Llama and GPT models) outperform older methods. Yet performance varies and depends on the concept being measured, as some concepts (e.g. rationality) remain difficult to capture even for human coders. Which measure should be preferred for future empirical applications likely depends on the objective of the study in question. For some genres, languages and communication styles (e.g. satire), it is strongly advised to test the accuracy of automated methods against human interpretation beforehand, even if the methods are widely used. Some approaches and implementations performed so poorly that they are not suitable for studying debate quality.
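Testing an automated method against human interpretation can be as simple as comparing its labels with manually coded ones. The Python sketch below computes an F1 score for a binary label such as "incivil" versus "not incivil". It is a generic illustration rather than the evaluation code used in the paper, and the labels are made up.

```python
def f1_score(human, automated, positive=1):
    """F1 of automated labels against human codes for one positive class."""
    pairs = list(zip(human, automated))
    tp = sum(1 for h, a in pairs if h == positive and a == positive)
    fp = sum(1 for h, a in pairs if h != positive and a == positive)
    fn = sum(1 for h, a in pairs if h == positive and a != positive)
    if tp == 0:
        return 0.0  # no correct positive predictions at all
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only: 1 = comment coded as incivil, 0 = not.
human_codes = [1, 0, 1, 1, 0, 0]
classifier_labels = [1, 0, 0, 1, 1, 0]

if __name__ == "__main__":
    print(round(f1_score(human_codes, classifier_labels), 3))
```

Running such a check on a manually coded sample from the genre or language at hand gives a first indication of whether an automated measure can be trusted there.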

Panel discussion: TWON researcher Jonas Fegert on “Who owns AI? On democratization, control and power relations”

On July 14th, TWON researcher Jonas Fegert (FZI Research Center for Information Technology) was invited as a panelist to the event “Who owns AI? On democratization, control and power relations” hosted by the House for Journalism and the Public Sphere in Berlin. The panel discussion explored how artificial intelligence can be shaped and governed democratically and what social, political and technological conditions are needed to make that possible.

At the heart of the discussion were fundamental questions about power structures in the field of AI. Today, artificial intelligence influences many areas of life, from work and education to everyday decision-making. Yet major developments in this space are often driven by large tech corporations without meaningful input from democratic institutions or the public. The panel reflected on what it could mean to democratize AI, who should have a say in its direction and what roles parliaments, research institutions and civil society can play in this process.

The event offered a valuable opportunity to engage with international experts from philosophy, social science and technology ethics. Many thanks to the organizers for the invitation and the insightful discussion.

Recap: The 1st Workshop on Semantic Generative Agents on the Web (SemGenAge 2025)

June 2nd, 2025 – Portorož, Slovenia | Part of ESWC 2025

The 1st Workshop on Semantic Generative Agents on the Web, held on June 2nd in Portorož, Slovenia, as part of the Extended Semantic Web Conference (ESWC 2025), marked a key milestone in disseminating the goals and findings of the TWON project to the academic community. The event brought together researchers from diverse disciplines to explore how Semantic Web technologies and Large Language Models (LLMs) can be combined to develop intelligent, interpretable, and communicative agents for the web.

Opening Keynote

The workshop opened with a keynote by Matthias Nickles (National University of Ireland, Galway), who presented a comprehensive overview of the history and recent advancements in generative agents within computer science, setting the stage for the diverse presentations to follow.

Paper Presentations

Jan Lorenz (Constructor University) kicked off the presentations with a talk on Filter Bubbles in an Agent-Based Model Where Agents Update Their Worldviews with LLMs. His work replaced abstract numerical opinion spaces with LLM-generated human-like statements to simulate opinion dynamics. The goal was to assess whether filter bubbles would still emerge in this more realistic setting and to examine the practical integration of LLMs into agent-based simulations.

Next, Martin Žust (Jožef Stefan Institute) presented a web-based negotiation agent designed to assist unskilled negotiators in real time. The agent transcribes dialogue, builds dynamic world models, and combines analytical reasoning with human-like intuition to offer context-aware negotiation support. This hybrid approach aims to enhance interpersonal outcomes through collaborative human-AI interaction.

Abdul Sittar (Jožef Stefan Institute) followed with an agent-based simulation of social media engagement during German elections. By incorporating past conversational history, motivational factors, and resource constraints, the model used fine-tuned AI to generate posts and replies, applying sentiment analysis, irony detection, and offensiveness classification. The findings highlighted how historical context shapes AI responses and how behavior shifts under different temporal constraints.

Afternoon Keynote and Talks

In the afternoon keynote, Denisa Reshef Kera (Bar-Ilan University) addressed philosophical perspectives on generative agents, focusing on bias, representation, and agency. She emphasized the role of generative agents in public policy and civic participation, highlighting their potential for enhancing digital society.

Ljubisa Bojic (University of Belgrade) presented an innovative AI-based recommender system designed to reduce echo chambers and polarization. His model incorporates emotional tone, content diversity, and political balance into the recommendation process, improving content exposure without sacrificing accuracy. The approach aligns with ethical AI principles, offering user autonomy through customizable preferences.

Denisa Reshef Kera returned with Avital Dotan to present AI Beyond Rules, Heuristics, and Dreams, introducing the concept of ergative-absolutive AI agents. Drawing on linguistic structures from languages like Basque, they proposed a new way of modeling agency in LLMs—treating them not just as predictors but as performative systems that enact grammar and interaction. Their two-step framework involves analyzing grammatical alignments and creating participatory simulations with diverse agent alignment patterns to encourage adaptive and accountable behavior.

Simon Münker (University of Trier) concluded the paper presentations with twony, a micro-simulation platform that models emotional contagion and discourse dynamics in online social networks. Using fine-tuned BERT models and LLMs to simulate politically engaged personas, twony visualizes emotional cascades in various feed algorithm scenarios—offering a powerful, open-source tool for explaining polarization and online behavior.

Closing Discussion

The workshop concluded with a fishbowl discussion featuring Achim Rettinger, Damian Trilling, Marko Grobelnik, Matthias Nickles, and Denisa Reshef Kera. The panel reflected on the interdisciplinary insights presented throughout the day and discussed future directions for generative agents in real-world applications.

Takeaways

SemGenAge 2025 fostered rich dialogue across fields including semantic web technologies, AI, computational social science, and digital media studies. Discussions emphasized the potential of generative agents in areas such as online discourse moderation, content recommendation, opinion shaping, and consumer behavior analysis.

The workshop’s insights will directly support TWON’s mission: combining empirical observations, simulation, and participatory methods to create evidence-based recommendations for improving social network regulation and enhancing digital citizenship.

For full program details, visit the official workshop page.

Fifth TWON Consortium Meeting in Portorož, Slovenia

From May 30 to June 1, all nine TWON partner institutions gathered in Portorož, Slovenia for the fifth TWON Consortium Meeting. This in-person event offered a key opportunity to assess our progress, align goals, and prepare for the final year of the project.

The meeting began with the general assembly led by consortium leader Damian Trilling, where we reviewed project milestones, discussed ongoing challenges, and set priorities for the months ahead. A central objective was to optimize integration across TWON’s thematic and methodological strands, reinforcing the coherence of our collective efforts. We then focused on planning our large-scale simulations — from sharpening research questions to technical implementation. The day concluded with a consortium dinner.

On Sunday, we had a workshop on design features of a democracy-preserving online social network, as a step towards developing policy recommendations. Later on, we had a session for planning the upcoming Citizen Labs, where we discuss our research with citizens. The day concluded with a plenary wrap-up and an early-career researcher event, where we had the chance to discuss our PhD projects and give each other feedback.

The Portorož meeting not only advanced TWON’s agenda but also reinforced collaboration at a critical stage of the project. With renewed momentum, the consortium is well-prepared for the final project year.

Thank you, Jožef Stefan Institute, for hosting us in beautiful Slovenia!

TWON report: Defining metrics for democratic online discourse

Our researcher Sjoerd Stolwijk recently published a deliverable, proposing a set of metrics to determine the deliberative quality of discussions on social media in general, and TWON in particular.

The report lists five key indicators: (1) exposure to political content, (2) engagement with political content, (3) contributing political content, (4) diversity of exposure and (5) quality of exposure.

The report then explains how and why this set of indicators differs from the typical list of deliberative indicators and proposes viewing deliberation from a summative rather than an additive perspective. In this view, social media do not need to aim at perfect deliberation within one platform; rather, the goal is to contribute to deliberation at a societal scale via the platform.

We propose that social media can contribute especially by offering an avenue for users (citizens, journalists and politicians alike) to be exposed to political debate, but also to engage and participate in that debate. In addition, social media can connect otherwise unconnected users and expose them to ideas they might otherwise have missed. Ideally, these ideas are substantiated with arguments and evidence.

Our researcher evaluates a large set of automatic classifiers to determine the degree to which social media comments meet several deliberative criteria, specifically whether comments are rational, interactive, diverse and civil.

Results show how more modern techniques like fine-tuned transformers and generative large language models have improved our ability to reproduce manual codings automatically, but also that results vary considerably between models.

We then integrate the aims of Chapter 3 with the results of Chapter 4 and translate them to the case of TWON to arrive at the metrics proposed in Chapter 2. This part of the report also adds tests of how well different classifiers determine whether a comment is political or not.

Finally, we look to the future, beyond what is currently feasible for TWON, to explore whether new techniques can help determine the deliberative quality of online social media debates at the more fine-grained level of specific claims, and we show some promising first results.

Download the Deliverable here.

On Regulating Online Social Networks: TWON Policy Brief #2

In January 2025, the TWON consortium developed a second TWON policy brief, on regulating online social networks! The briefing was developed in a comprehensive process with academic input, was then enriched with citizens’ perspectives from the DialoguePerspectives Citizen Lab in Fall 2024 and reviewed multiple times by academics in the consortium. 

The briefing focuses on funding research and the development of public platforms, promoting content diversity through algorithmic design, platform regulation to strengthen interoperability and transparency of platforms, as well as the promotion of media literacy and support of independent journalism.

Download the TWON Policy Brief #2 here

The DSA’s research access: a flawed system

The Digital Services Act (DSA) does not do enough for research access. While its Article 40 imposes a duty on very large online platforms (VLOPs) and very large online search engines (VLOSEs) to allow research access, this is not sufficient.
That is why TWON collaborated with Digits EU and the Digital Law Institute Trier to give the Commission feedback on loopholes in this article. The result is an official statement, which the Commission is now considering.

Their main points of criticism are:

Data access to what? Allow specific requests, as data on minor algorithm changes for specific user groups is currently not accessible. Broaden the definition of systemic risks and create transparency around A/B tests, as they hold great research value.

Data access for whom? Peer review is a standard research process, so peer access to data sets is necessary. Also, allow renewed data access at short notice so researchers can respond to peer review, enable group verifications, and create a clear definition of researcher status to avoid excessively high barriers.

Verification of data? Currently there is no control mechanism to ensure that the data provided by VLOPs and VLOSEs is correct. An obligation to provide correct datasets needs to be implemented.

Find the full statement here.