About

I am a postdoctoral researcher at the ELLIS Institute Tübingen (co-affiliated with the Max Planck Institute for Intelligent Systems). I am part of the Cooperative Machine Intelligence for People-Aligned Safe Systems (COMPASS) group, led by Sahar Abdelnabi. My research focuses on enhancing trustworthy, safe, and effective reasoning in language models and agentic systems.

I did my Ph.D. in Machine Learning & Natural Language Processing at the UKP Lab at TU Darmstadt, supervised by Prof. Iryna Gurevych. During my Ph.D., I interned at Parameter Lab, where we worked with Naver AI on trustworthy AI.

Before my Ph.D., I worked at the Coleridge Initiative, where we organized the Kaggle competition Show US the Data. I received my master’s degree from the School of Computing at KAIST, where I was a research assistant at the IR&NLP Lab, advised by Professor Sung-Hyon Myaeng.

Education

Ph.D. in Computer Science, TU Darmstadt, 2026 (expected)
M.S. in Computer Science, KAIST, 2021
B.Sc. in Computer Science & Engineering (Summa Cum Laude), University of Malaga, 2017

News

Jul 2026	Co-organizing the PALM Workshop at NeurIPS 2026.
Jun 2026	Received the Top Reviewer Award at the ICML 2026 Workshop on AI for Good (AI4GOOD).
May 2026	New preprint: Models That Know How Evaluations Are Designed Score Safer.
Mar 2026	Started as a postdoctoral researcher at the ELLIS Institute Tübingen (COMPASS group).
Feb 2026	New preprint: From Leaky Thoughts to Private Reasoning: Controlling What LRMs Say to Themselves.
Dec 2025	C-SEO Bench presented at NeurIPS 2025 Datasets & Benchmarks.
Nov 2025	Leaky Thoughts presented at EMNLP 2025.
Jul 2025	Fine-Tuning on Diverse Reasoning Chains presented at ACL 2025.