About
I am a postdoctoral researcher at the ELLIS Institute Tübingen (co-affiliated with the Max Planck Institute for Intelligent Systems). I am part of the Cooperative Machine Intelligence for People-Aligned Safe Systems (COMPASS) group, led by Sahar Abdelnabi. My research focuses on enhancing trustworthy, safe, and effective reasoning in language models and agentic systems.
I completed my Ph.D. in Machine Learning & Natural Language Processing at the UKP Lab at TU Darmstadt, supervised by Prof. Iryna Gurevych. During my Ph.D., I interned at Parameter Lab, where we worked with Naver AI on trustworthy AI.
Before my Ph.D., I worked at the Coleridge Initiative, where we organized the Kaggle competition "Show US the Data". I received my master’s degree from the School of Computing at KAIST, where I was a research assistant at the IR&NLP Lab, advised by Professor Sung-Hyon Myaeng.
Education
- Ph.D. in Computer Science, TU Darmstadt, 2026
- M.S. in Computer Science, KAIST, 2021
- B.Sc. in Computer Science & Engineering (Summa Cum Laude), University of Malaga, 2017
News
| May 2026 | New preprint: Models That Know How Evaluations Are Designed Score Safer. |
| Mar 2026 | Started as a postdoctoral researcher at the ELLIS Institute Tübingen (COMPASS group). |
| Feb 2026 | New preprint: Controllable Reasoning Models are Private Thinkers. |
| Dec 2025 | C-SEO Bench presented at NeurIPS 2025 Datasets & Benchmarks. |
| Nov 2025 | Leaky Thoughts presented at EMNLP 2025. |
| Jul 2025 | Fine-Tuning on Diverse Reasoning Chains presented at ACL 2025. |
