How the Sausage is Made: LLM-embedding-based Heuristic Hierarchical Clustering For Organizing TeaP2025 Talk & Poster Sessions

Tue—Casino_1.811—Poster2—5513

Presented by: Seung-Goo Kim

Seung-Goo Kim ¹*, Daniela Sammler ¹,²

¹ Research Group Neurocognition of Music and Language, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany, ² Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany

Organizing talks and poster presentations into scientifically coherent sessions is non-trivial at a large scale. Motivated by recent applications of large language models (LLMs) to topic extraction and clustering of scientific literature, we asked whether an LLM could reduce human effort (e.g., ~40 student-assistant (HiWi) hours) at a reasonable cost (e.g., <1 USD) for this very conference, TeaP2025.
We developed a semi-automated procedure that applies iterative hierarchical clustering to abstract-embedding vectors generated by an LLM. The procedure searches for cluster structures that satisfy the physical constraints of the conference (e.g., the maximum number of talks per session, the number of parallel sessions, and the number of lecture rooms) while maximizing the utilitarian utility (i.e., a linear sum of individual utilities) of attendees and presenters, that is, minimizing conflicts between parallel talk sessions and maximizing the coherence of adjacent posters.
The results suggest that an LLM-based algorithm can produce better-than-nothing groundwork. However, the output still requires considerable post-processing by human experts. We conclude that LLM-based tools are not a magic wand but can be helpful to some extent when combined with human expertise.
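
To make the idea of clustering under a session-size constraint concrete, the following is a minimal sketch and not the procedure used for TeaP2025. It assumes average-linkage agglomerative clustering on cosine distances and a single hypothetical constraint, `max_size` (the maximum number of talks per session). The function names and the toy two-dimensional "embeddings" are illustrative assumptions; real abstract embeddings would come from an LLM and have many more dimensions.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity (assumes non-zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def capped_agglomerative(embeddings, max_size):
    """Average-linkage agglomerative clustering that refuses any merge
    producing a cluster larger than max_size (hypothetical capacity)."""
    n = len(embeddings)
    dist = [[cosine_distance(embeddings[i], embeddings[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]  # start with one cluster per abstract
    while True:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            if len(clusters[a]) + len(clusters[b]) > max_size:
                continue  # merge would violate the session-size constraint
            # average pairwise distance between the two clusters
            d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
            d /= len(clusters[a]) * len(clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        if best is None:  # no admissible merge left
            return clusters
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b


# Toy data: two loose "topics" of three abstracts each (made-up vectors).
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0],
       [0.1, 0.9], [1.0, 0.05], [0.05, 1.0]]
sessions = capped_agglomerative(toy, max_size=3)
print(sorted(sorted(s) for s in sessions))
```

A greedy scheme like this only enforces the size cap. Balancing parallel sessions against each other, or ordering posters so that neighbours are similar, would need additional steps, and the human post-processing described above would still apply.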

Keywords: human-machine-interaction, LLM