Bálint Gyevnár

Safeguarding scientific integrity in the age of AI co-scientists

Postdoctoral Research Associate at Carnegie Mellon University

My primary research area is the science of AI co-scientists: how they amplify existing biases in research production, create epistemic blind spots, and homogenise approaches across disciplines. I work with Atoosa Kasirzadeh and Nihar Shah as a member of the Institute for Complex Social Dynamics led by Kevin Zollman.

During my PhD, I worked on explainable multi-agent reinforcement learning, which I like to describe as giving interacting AI agents the ability to explain themselves. I researched how we can explain complex emergent behaviour in multi-agent systems (MAS) using counterfactual reasoning. I was supervised by Stefano Albrecht, Shay Cohen, and Chris Lucas.

If you are curious about any of the above topics, then don’t hesitate to reach out through the various channels at the bottom of this page! I am currently based in Pittsburgh, PA.

(My name is pronounced BAH-lint [baːlint])


news

Sep 15, 2025 I have started as a postdoc at Carnegie Mellon University at the Institute for Complex Social Dynamics, working with Atoosa Kasirzadeh and Nihar Shah.
Aug 10, 2025 I spent a week visiting the Center for Humans and Machines, led by Iyad Rahwan, at the Max Planck Institute for Human Development in Berlin.
Jul 30, 2025 I attended the 2025 Human-aligned AI Summer School in Prague, organised by the Alignment of Complex Systems Research and the Center for Theoretical Study at Charles University.
Jun 18, 2025 I attended the 2025 Bridging Responsible AI Divides (BRAID) Gathering in Manchester.
Jun 11, 2025 I attended RLDM 2025, the Multi-disciplinary Conference on Reinforcement Learning and Decision Making, in Dublin, where I presented a poster on our Objective Metrics for Explainable RL paper.
Jun 07, 2025 I gave a talk and presented a poster at the 9th Center for Human-Compatible AI Workshop on “AI Safety for Everyone”.
May 26, 2025 New preprint: Integrating Counterfactual Simulations with Language Models for Explaining Multi-Agent Behaviour.

selected publications

  1. AI Safety for Everyone
    Balint Gyevnar* and Atoosa Kasirzadeh*
    Nature Machine Intelligence, Apr 2025
  2. People Attribute Purpose to Autonomous Vehicles When Explaining Their Behavior: Insights from Cognitive Science for Explainable AI
    In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, Apr 2025
  3. Causal Explanations for Sequential Decision-Making in Multi-Agent Systems
    In Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, Auckland, New Zealand, May 2024
  4. Bridging the Transparency Gap: What Can Explainable AI Learn From the AI Act?
    Balint Gyevnar, Nick Ferguson, and Burkhard Schafer
    In Proceedings of the 26th European Conference on Artificial Intelligence, Sep 2023