Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.
About me
This is a page not in the main menu
Published:
This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Worked together between April 2022 and November 2022. Currently at xAI.
Worked together between September 2022 and May 2025. Currently at Harvard University (MSc).
Worked together between February 2023 and June 2023. Currently at Salesforce.
Worked together between April 2024 and May 2025. Currently at fundamental.ai.
Worked together between February 2025 and May 2025. Currently at LinkedIn.
Short description of portfolio item number 1
Short description of portfolio item number 2 
Published in Remote Sensing, 2020
End-to-end approach to construct a remotely sensed image to elevation surface mapping using Conditional Generative Adversarial Networks
Recommended citation: *Panagiotou E, *Chochlakis G, Grammatikopoulos L, Charou E. Generating Elevation Surface from a Single RGB Remotely Sensed Image Using Deep Learning. Remote Sensing. 2020; 12(12):2002. https://www.mdpi.com/2072-4292/12/12/2002/pdf
Posted on arXiv:2102.04379, 2021
Reducing Zero-shot Learning to Few-shot Learning using the generator of a generative Zero-shot Learning approach for end-to-end learning
Recommended citation: Chochlakis G., Georgiou E., Potamianos A. End-to-end Generative Zero-shot Learning via Few-shot Learning. arXiv preprint arXiv:2102.04379, 2021. https://arxiv.org/pdf/2102.04379.pdf
Published in NeurIPS, 2022
Continual learning is a challenging learning setting, but has been underexplored in the vision-and-language domain. We introduce CLiMB🧗, the Continual Learning in Multimodality Benchmark, to enable the development of multimodal models that learn continually.
Recommended citation: Srinivasan, T., Chang, T. Y., Alva, L. L. P., Chochlakis, G., Rostami, M., & Thomason, J. (2022). CLiMB: A Continual Learning Benchmark for Vision-and-Language Tasks. Thirty-sixth Conference on Neural Information Processing Systems. https://arxiv.org/abs/2206.09059
Posted on arXiv, 2022
We propose the Vision-and-Augmented-Language Transformer (VAuLT). VAuLT is an extension of the popular Vision-and-Language Transformer (ViLT), with the key insight being to propagate the output representations of a large language model like BERT to the language input of ViLT.
Recommended citation: Chochlakis, G.; Srinivasan, T.; Thomason, J.; and Narayanan, S. 2022. VAuLT: Augmenting the Vision-and-Language Transformer for Sentiment Classification on Social Media. arXiv preprint arXiv:2208.09021. https://arxiv.org/abs/2208.09021
Published in ICASSP, 2023
First, we develop two modeling approaches to emotion recognition in order to capture word associations of the emotion words themselves, by either including the emotions in the input, or by leveraging Masked Language Modeling (MLM). Second, we integrate pairwise constraints of emotion representations as regularization terms alongside the classification loss of the models.
Recommended citation: Chochlakis, Georgios, Gireesh Mahajan, Sabyasachee Baruah, Keith Burghardt, Kristina Lerman, and Shrikanth Narayanan. "Leveraging label correlations in a multi-label setting: A case study in emotion." In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5. IEEE, 2023. https://arxiv.org/abs/2210.15842
Published in ICASSP, 2023
In this work, we study how we can build a single emotion recognition model that can transition between different configurations, i.e., languages, emotions, and annotation formats, by leveraging multilingual models and Demux.
Recommended citation: Chochlakis, Georgios, Gireesh Mahajan, Sabyasachee Baruah, Keith Burghardt, Kristina Lerman, and Shrikanth Narayanan. "Using Emotion Embeddings to Transfer Knowledge between Emotions, Languages, and Annotation Formats." In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5. IEEE, 2023. https://arxiv.org/abs/2211.00171
Published in ICWSM, 2024
In this work, we utilize heuristics to identify coordinated inauthentic accounts and detect attitudes, concerns and emotions within their social media posts, collectively known as socio-linguistic characteristics.
Recommended citation: Burghardt, Keith, Ashwin Rao, Siyi Guo, Zihao He, Georgios Chochlakis, Sabyasachee Baruah, Andrew Rojecki, Shri Narayanan, and Kristina Lerman. "Socio-Linguistic Characteristics of Coordinated Inauthentic Accounts." arXiv preprint arXiv:2305.11867 (2023). https://arxiv.org/abs/2305.11867
Published in InterSpeech, 2024
Audio-visual representations leverage information from both modalities to produce joint representations. Such representations have demonstrated their usefulness in a variety of tasks. However, both modalities incorporated in the learned model might not necessarily be present all the time during inference. In this work, we study whether and how we can make existing models, trained under pristine conditions, robust to partial modality loss without retraining them. We propose to use a curriculum-trained Masked AutoEncoder to impute features of missing input segments. We show that fine-tuning of classification heads with the imputed features makes the base models robust on multiple downstream tasks like emotion recognition and Lombard speech recognition. Among the 12 cases evaluated, our method outperforms strong baselines in 10 instances.
Recommended citation: Chochlakis, Georgios, Chandrashekhar Lavania, Prashant Mathur and Kyu Han. "Tackling missing modalities in audio-visual representation learning using masked autoencoders." Interspeech 2024. https://www.amazon.science/publications/tackling-missing-modalities-in-audio-visual-representation-learning-using-masked-autoencoders
Published in ACII, 2024
In this work, we design experiments and propose measurements to explicitly quantify the consistency of proxies of LLM priors and their pull on the posteriors. We show that LLMs have strong yet inconsistent priors in emotion recognition that ossify their predictions. We also find that the larger the model, the stronger these effects become. Our results suggest that caution is needed when using ICL with larger LLMs for affect-centered tasks outside their pre-training domain and when interpreting ICL results.
Recommended citation: Chochlakis, Georgios, Alexandros Potamianos, Kristina Lerman and Shrikanth Narayanan. “The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition.” In 2024 12th International Conference on Affective Computing and Intelligent Interaction (ACII). https://arxiv.org/abs/2403.17125
Published in ICASSP, 2025
In this work, we examine whether “enabling” reasoning also retrieves reasoning priors that remain relatively unchanged despite the evidence in the prompt. We find that, surprisingly, CoT indeed suffers from the same posterior collapse as ICL for larger language models.
Recommended citation: Chochlakis, Georgios, Niyantha Maruthu Pandiyan, Kristina Lerman, and Shrikanth Narayanan. "Larger Language Models Don't Care How You Think: Why Chain-of-Thought Prompting Fails in Subjective Tasks." In ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5. IEEE, 2025. https://arxiv.org/abs/2409.06173
Published in Main Proceedings of NAACL, 2025
Our results indicate that aggregation is a confounding factor in the modeling of subjective tasks, and advocate focusing on modeling individuals instead. However, aggregation does not explain the entire gap between ICL and the state of the art, meaning other factors in such tasks also account for the observed phenomena. Finally, by rigorously studying annotator-level labels, we find that it is possible for minority annotators to both better align with LLMs and have their perspectives further amplified.
Recommended citation: Chochlakis, Georgios, Alexandros Potamianos, Kristina Lerman, and Shrikanth Narayanan. "Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors." In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL). https://aclanthology.org/2025.naacl-long.284/#
Published in InterSpeech, 2025
Multimodal deep learning methods have greatly accelerated research in emotion recognition and have become the state of the art. However, in many scenarios, not all modalities are readily available, leading to either failure of traditional algorithms or the need for multiple models. In this work, we advance the state of the art in emotion recognition by proposing a unified, modality-agnostic transformer-based model that is inherently robust to missing modalities. To better exploit the multimodality of the data, we propose to use contrastive learning for modality alignment and masked autoencoding for multimodal reconstruction. Experimental results on the MSP-Podcast corpus show that our unified model achieves state-of-the-art performance, and improves both unimodal and multimodal baselines by 1-5% relative in respective evaluation metrics with the capability to handle missing modalities for two emotion recognition tasks in a more compact model.
Recommended citation: Chochlakis, Georgios, Turab Iqbal, Woo Hyun Kang, Zhaocheng Huang. "Modality-Agnostic Multimodal Emotion Recognition using a Contrastive Masked Autoencoder" Interspeech 2025. https://www.isca-archive.org/interspeech_2025/chochlakis25_interspeech.pdf
Under review
We propose Semantic F1 Scores, novel evaluation metrics for subjective or fuzzy multi-label classification that quantify semantic relatedness between predicted and gold labels. Unlike the conventional F1 metrics that treat semantically related predictions as complete failures, Semantic F1 incorporates a label similarity matrix to compute soft precision-like and recall-like scores, from which the Semantic F1 scores are derived. Unlike existing similarity-based metrics, our novel two-step precision-recall formulation enables the comparison of label sets of arbitrary sizes without discarding labels or forcing matches between dissimilar labels.
Recommended citation: Chochlakis, Georgios, Jackson Trager, Vedant Jhaveri, Nikhil Ravichandran, Alexandros Potamianos, Shrikanth Narayanan. "Semantic F1 Scores: Fair Evaluation Under Fuzzy Class Boundaries." arXiv preprint arXiv:2509.21633 https://arxiv.org/abs/2509.21633
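To make the idea of soft precision-like and recall-like scores concrete, here is a minimal sketch, not the paper's definition. It assumes that each label is credited with its best similarity to any label in the other set; that matching rule, the function name `soft_f1`, and the toy similarity values in `SIM` are all assumptions made for illustration.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical similarity matrix over three emotion labels (values are made up).
SIM: Dict[str, Dict[str, float]] = {
    "joy": {"joy": 1.0, "optimism": 0.5, "anger": 0.0},
    "optimism": {"joy": 0.5, "optimism": 1.0, "anger": 0.0},
    "anger": {"joy": 0.0, "optimism": 0.0, "anger": 1.0},
}


def soft_f1(
    pred: Sequence[str],
    gold: Sequence[str],
    sim: Dict[str, Dict[str, float]],
) -> Tuple[float, float, float]:
    """Return (soft precision, soft recall, soft F1) for one example.

    Assumption of this sketch: a label is scored by its highest
    similarity to any label in the other set.
    """
    if not pred and not gold:
        return 1.0, 1.0, 1.0  # nothing to predict, nothing predicted
    if not pred or not gold:
        return 0.0, 0.0, 0.0  # one side is empty, so no credit
    # Precision-like: how well each predicted label is supported by the gold set.
    precision = sum(max(sim[p][g] for g in gold) for p in pred) / len(pred)
    # Recall-like: how well each gold label is covered by the predictions.
    recall = sum(max(sim[g][p] for p in pred) for g in gold) / len(gold)
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# A related-but-wrong prediction earns partial credit instead of zero.
print(soft_f1(["optimism"], ["joy"], SIM))  # → (0.5, 0.5, 0.5)
```

With an identity similarity matrix, this sketch reduces to the usual per-example precision, recall, and F1, and the label sets being compared may have different sizes.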
To appear in Main Proceedings of EMNLP, 2025
We introduce the Label-in-a-Haystack setting: the query and its label(s) are included in the demonstrations shown to LLMs, which are prompted to predict the label(s) again, while receiving task-specific instructions (e.g., emotion recognition) rather than label copying. We show how failures to copy the label(s) to the output of the LLM are task-relevant and informative. Building on this, we propose the Label-in-a-Haystack Rectification (LiaHR) framework for subjective label correction.
Recommended citation: Chochlakis, Georgios, Peter Wu, Arjun Bedi, Marcus Ma, Kristina Lerman, and Shrikanth Narayanan. "Humans Hallucinate Too: Language Models Identify and Correct Subjective Annotation Errors With Label-in-a-Haystack Prompts." arXiv preprint arXiv:2505.17222 https://arxiv.org/abs/2505.17222
To appear in Main Proceedings of EMNLP, 2025
We investigate how autoregressive LLMs perform multi-label classification, with a focus on subjective tasks, by analyzing the output distributions of the models in each generation step. We find that their predictive behavior reflects the multiple steps in the underlying language modeling required to generate all relevant labels as they tend to suppress all but one label at each step.
Recommended citation: Marcus Ma*, Georgios Chochlakis*, Niyantha Maruthu Pandiyan, Jesse Thomason, and Shrikanth Narayanan. "Large Language Models Do Multi-Label Classification Differently." arXiv preprint arXiv:2505.17510 https://arxiv.org/abs/2505.17510
An ArgumentParser that supports your grid-search needs.
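As a rough illustration of what grid-search support in an argument parser can look like, the sketch below uses only the standard library's `argparse` and `itertools`. It is not this package's API: the helper `expand_grid` and the flags `--lr` and `--batch-size` are hypothetical, and the approach (accept several values per flag, then take the Cartesian product) is an assumption about the general idea.

```python
import argparse
import itertools
from typing import Any, Dict, List, Optional, Sequence


def expand_grid(argv: Optional[Sequence[str]] = None) -> List[Dict[str, Any]]:
    """Parse flags that accept one or more values and return every combination."""
    parser = argparse.ArgumentParser(description="Toy grid-search parser (sketch)")
    # nargs="+" lets each flag take a list of candidate values.
    parser.add_argument("--lr", type=float, nargs="+", default=[1e-3])
    parser.add_argument("--batch-size", type=int, nargs="+", default=[32])
    args = vars(parser.parse_args(argv))

    names = list(args)
    # Cartesian product of the candidate values gives one config per run.
    return [
        dict(zip(names, combo))
        for combo in itertools.product(*(args[name] for name in names))
    ]


if __name__ == "__main__":
    for config in expand_grid(["--lr", "0.1", "0.01", "--batch-size", "16", "32"]):
        print(config)
```

Each returned dictionary is one point of the grid, so a training script could loop over the list and launch one run per configuration.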
The Semantic F1 scores are novel evaluation metrics for subjective or fuzzy multi-label classification that quantify semantic relatedness between predicted and gold labels.
More information about the event here (in Greek).
I was invited to talk to the senior AI leadership of CapitalOne about my research and our future directions, stemming from the collaboration of CapitalOne and USC and my fellowship. Given the sensitive nature of the discussions, I unfortunately cannot share more details [or pictures :(].
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.