About

I am a postdoctoral researcher at the ELLIS Institute Tübingen (co-affiliated with the Max Planck Institute for Intelligent Systems). I am part of the Cooperative Machine Intelligence for People-Aligned Safe Systems (COMPASS) group, led by Sahar Abdelnabi. My research focuses on enhancing trustworthy, safe, and effective reasoning in language models and agentic systems.

I completed my Ph.D. in Machine Learning & Natural Language Processing at the UKP Lab at TU Darmstadt, supervised by Prof. Iryna Gurevych. During my Ph.D., I interned at Parameter Lab, where we worked with Naver AI on trustworthy AI.

Before my Ph.D., I worked at the Coleridge Initiative, where we organized the Kaggle competition "Show US the Data". I received my master's degree from the School of Computing at KAIST, where I was a research assistant at the IR&NLP Lab, advised by Professor Sung-Hyon Myaeng.

Education

  • Ph.D. in Computer Science, TU Darmstadt, 2026
  • M.S. in Computer Science, KAIST, 2021
  • B.Sc. in Computer Science & Engineering (Summa Cum Laude), University of Malaga, 2017

News

May 2026 - New preprint: Models That Know How Evaluations Are Designed Score Safer.
Mar 2026 - Started as a postdoctoral researcher at the ELLIS Institute Tübingen (COMPASS group).
Feb 2026 - New preprint: Controllable Reasoning Models are Private Thinkers.
Dec 2025 - C-SEO Bench presented at NeurIPS 2025 Datasets & Benchmarks.
Nov 2025 - Leaky Thoughts presented at EMNLP 2025.
Jul 2025 - Fine-Tuning on Diverse Reasoning Chains presented at ACL 2025.

Selected Publications

Katharina Deckenbach*, Haritz Puerto*, Jonas Geiping, Sahar Abdelnabi
arXiv preprint 2026
AI Evaluations, AI Safety
Haritz Puerto, Martin Gubri, Tommaso Green, Seong Joon Oh, Sangdoo Yun
NeurIPS D&B 2025
Trustworthy AI, AI Evaluations
Tommaso Green, Martin Gubri, Haritz Puerto, Seong Joon Oh, Sangdoo Yun
EMNLP 2025
Reasoning, Trustworthy AI
Haritz Puerto, Tilek Chubakov, Xiaodan Zhu, Harish Tayyar Madabushi, Iryna Gurevych
ACL 2025
Reasoning
Haritz Puerto, Martin Gubri, Sangdoo Yun, Seong Joon Oh
Findings of NAACL 2025
Trustworthy AI, AI Evaluations
Haritz Puerto, Martin Tutek, Somak Aditya, Xiaodan Zhu, Iryna Gurevych
EMNLP 2024
Reasoning

→ Full publication list