Publications

Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors

Under review

Our results indicate that aggregation is a confounding factor in the modeling of subjective tasks, and advocate focusing on modeling individuals instead. However, aggregation does not explain the entire gap between ICL and the state of the art, meaning other factors in such tasks also account for the observed phenomena. Finally, by rigorously studying annotator-level labels, we find that it is possible for minority annotators to both better align with LLMs and have their perspectives further amplified.

Recommended citation: Chochlakis, Georgios, Alexandros Potamianos, Kristina Lerman, and Shrikanth Narayanan. "Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors." arXiv preprint arXiv:2410.13776 (2024). https://arxiv.org/abs/2410.13776

The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition

Published in ACII, 2024

In this work, we design experiments and propose measurements to explicitly quantify the consistency of proxies of LLM priors and their pull on the posteriors. We show that LLMs have strong yet inconsistent priors in emotion recognition that ossify their predictions. We also find that the larger the model, the stronger these effects become. Our results suggest that caution is needed when using ICL with larger LLMs for affect-centered tasks outside their pre-training domain and when interpreting ICL results.

Recommended citation: Chochlakis, Georgios, Alexandros Potamianos, Kristina Lerman and Shrikanth Narayanan. “The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition.” (2024). https://arxiv.org/abs/2403.17125

Larger Language Models Don't Care How You Think: Why Chain-of-Thought Prompting Fails in Subjective Tasks

Under review

In this work, we examine whether “enabling” reasoning also retrieves reasoning priors that remain relatively unchanged despite the evidence in the prompt. We find that, surprisingly, CoT indeed suffers from the same posterior collapse as ICL for larger language models.

Recommended citation: Chochlakis, Georgios, Niyantha Maruthu Pandiyan, Kristina Lerman, and Shrikanth Narayanan. "Larger Language Models Don't Care How You Think: Why Chain-of-Thought Prompting Fails in Subjective Tasks." arXiv preprint arXiv:2409.06173 (2024). https://arxiv.org/abs/2409.06173

Tackling missing modalities in audio-visual representation learning using masked autoencoders

Published in Interspeech, 2024

Audio-visual representations leverage information from both modalities to produce joint representations. Such representations have demonstrated their usefulness in a variety of tasks. However, the modalities incorporated in the learned model might not both be present at all times during inference. In this work, we study whether and how we can make existing models, trained under pristine conditions, robust to partial modality loss without retraining them. We propose to use a curriculum-trained Masked AutoEncoder to impute features of missing input segments. We show that fine-tuning classification heads with the imputed features makes the base models robust on multiple downstream tasks, such as emotion recognition and Lombard speech recognition. Among the 12 cases evaluated, our method outperforms strong baselines in 10.

Recommended citation: Chochlakis, Georgios, Chandrashekhar Lavania, Prashant Mathur and Kyu Han. “Tackling missing modalities in audio-visual representation learning using masked autoencoders.” Interspeech 2024. https://www.amazon.science/publications/tackling-missing-modalities-in-audio-visual-representation-learning-using-masked-autoencoders

Socio-Linguistic Characteristics of Coordinated Inauthentic Accounts

Published in ICWSM, 2024

In this work, we utilize heuristics to identify coordinated inauthentic accounts and detect attitudes, concerns and emotions within their social media posts, collectively known as socio-linguistic characteristics.

Recommended citation: Burghardt, Keith, Ashwin Rao, Siyi Guo, Zihao He, Georgios Chochlakis, Sabyasachee Baruah, Andrew Rojecki, Shri Narayanan, and Kristina Lerman. "Socio-Linguistic Characteristics of Coordinated Inauthentic Accounts." arXiv preprint arXiv:2305.11867 (2023). https://arxiv.org/abs/2305.11867

Using Emotion Embeddings to Transfer Knowledge Between Emotions, Languages, and Annotation Formats

Published in ICASSP, 2023

In this work, we study how we can build a single emotion recognition model that can transition between different configurations, i.e., languages, emotions, and annotation formats, by leveraging multilingual models and Demux.

Recommended citation: Chochlakis, Georgios, Gireesh Mahajan, Sabyasachee Baruah, Keith Burghardt, Kristina Lerman, and Shrikanth Narayanan. "Using Emotion Embeddings to Transfer Knowledge between Emotions, Languages, and Annotation Formats." In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5. IEEE, 2023. https://arxiv.org/abs/2211.00171

Leveraging Label Correlations in a Multi-label Setting: A Case Study in Emotion

Published in ICASSP, 2023

First, we develop two modeling approaches to emotion recognition that capture word associations of the emotion words themselves, either by including the emotions in the input or by leveraging Masked Language Modeling (MLM). Second, we integrate pairwise constraints on emotion representations as regularization terms alongside the classification loss of the models.

Recommended citation: Chochlakis, Georgios, Gireesh Mahajan, Sabyasachee Baruah, Keith Burghardt, Kristina Lerman, and Shrikanth Narayanan. "Leveraging label correlations in a multi-label setting: A case study in emotion." In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5. IEEE, 2023. https://arxiv.org/abs/2210.15842

VAuLT: Augmenting the Vision-and-Language Transformer for Sentiment Classification on Social Media

Posted on arXiv, 2022

We propose the Vision-and-Augmented-Language Transformer (VAuLT). VAuLT is an extension of the popular Vision-and-Language Transformer (ViLT), with the key insight being to propagate the output representations of a large language model like BERT to the language input of ViLT.

Recommended citation: Chochlakis, G.; Srinivasan, T.; Thomason, J.; and Narayanan, S. 2022. VAuLT: Augmenting the Vision-and-Language Transformer for Sentiment Classification on Social Media. arXiv preprint arXiv:2208.09021. https://arxiv.org/abs/2208.09021

CLiMB: A Continual Learning Benchmark for Vision-and-Language Tasks

Published in NeurIPS, 2022

Continual learning is a challenging learning setting, but has been underexplored in the vision-and-language domain. We introduce CLiMB🧗, the Continual Learning in Multimodality Benchmark, to enable the development of multimodal models that learn continually.

Recommended citation: Srinivasan, T., Chang, T. Y., Alva, L. L. P., Chochlakis, G., Rostami, M., & Thomason, J. (2022). CLiMB: A Continual Learning Benchmark for Vision-and-Language Tasks. Thirty-sixth Conference on Neural Information Processing Systems. https://arxiv.org/abs/2206.09059

Generating Elevation Surface from a Single RGB Remotely Sensed Image Using Deep Learning

Published in Remote Sensing, 2020

An end-to-end approach that uses Conditional Generative Adversarial Networks to learn a mapping from a single remotely sensed RGB image to an elevation surface.

Recommended citation: Panagiotou E, Chochlakis G, Grammatikopoulos L, Charou E. Generating Elevation Surface from a Single RGB Remotely Sensed Image Using Deep Learning. Remote Sensing. 2020; 12(12):2002. https://www.mdpi.com/2072-4292/12/12/2002/pdf