Bálint Gyevnár

Safeguarding scientific integrity in the age of AI scientists

Postdoctoral Research Associate at Carnegie Mellon University

I am interested in the metascience of AI in science and the science in AI. My research uses computational methods (e.g. algorithms) and empirical methods (e.g. cognitive experiments) to understand how AI can be used to safely and sustainably enhance human scientific discovery. I am working with Atoosa Kasirzadeh and Nihar Shah at the intersection of machine learning, cognitive science, and the philosophy of science. During my PhD, I worked on explainable multi-agent reinforcement learning under the supervision of Stefano Albrecht, Shay Cohen, and Chris Lucas at the University of Edinburgh, Scotland.

I am currently most curious about three questions:

  • How do we protect scientific integrity in the age of agentic AI scientists? What are the methodological pitfalls of scientific AI systems, and how do we create controlled computational tools with statistical guarantees to prevent them?
  • How does the use of AI affect the epistemics of science? What are the dangers of delegating scientific thinking to AI systems? How does AI change the high-level conceptual and low-level methodological steps of science? Are we in danger of illusions of understanding?
  • How do scientific communities change as a result of AI adoption? What are the effects of AI-accelerated scientific discovery on research communities’ interests? What are scientists’ beliefs and desires for AI in science? How is the scientific community of knowledge affected?

If you are curious about any of the above topics, then don’t hesitate to reach out through the various channels at the bottom of this page! I am currently based in Pittsburgh, PA, USA.

(My name is pronounced BAH-lint [baːlint])

news

Jun 07, 2026 I attended the 10th Center for Human-Compatible AI Workshop, where I discussed our work on failure modes of automated R&D systems.
Jun 04, 2026 Our new preprint on AI Epistemic Risks: Emerging Mechanisms & Evidence involving a broad range of scientists, including Yoshua Bengio, is now available.
Mar 21, 2026 Our new preprint Bridging the Gap in the Responsible AI Divides with Atoosa Kasirzadeh is available on arXiv.
Mar 12, 2026 I gave an invited talk at IVADO in Montréal on “Human and AI Solution Paths in Formalizing Expert Mathematics” at the Workshop on Social Reasoning and the Ecology of Thought.
Mar 01, 2026 I attended the 2nd Conference of The International Association for Safe & Ethical AI (IASEAI 2026) at the UNESCO House, Paris, where I chaired the track on moral competence in AI.
Dec 19, 2025 Our paper Integrating Counterfactual Simulations with Language Models for Explaining Multi-Agent Behaviour was accepted to AAMAS 2026.
Dec 06, 2025 I attended the FAR.AI Alignment Workshop in San Diego just before NeurIPS, presenting my poster titled We Need a Rigorous Metascience of Artificial Intelligence.

selected publications

  1. Bridging the Gap in the Responsible AI Divides
    Bálint Gyevnár and Atoosa Kasirzadeh
    In arXiv:2603.14495, Mar 2026
  2. Integrating Counterfactual Simulations with Language Models for Explaining Multi-Agent Behaviour
    In Proceedings of the 25th International Conference on Autonomous Agents and Multiagent Systems, Paphos, Cyprus, May 2026
  3. AI Safety for Everyone
    Bálint Gyevnár* and Atoosa Kasirzadeh*
    Nature Machine Intelligence, Apr 2025
  4. People Attribute Purpose to Autonomous Vehicles When Explaining Their Behavior: Insights from Cognitive Science for Explainable AI
    In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, Apr 2025