Why AI Agents Need More Than Guardrails in Healthcare
The Illusion of Control: Why Guardrails Alone Cannot Ensure Ethical AI in Medicine
Artificial intelligence (AI) is a transformative force, capable of revolutionizing everything from logistics to language generation. But as recent studies reveal, even the most advanced AI systems, designed with seemingly benign objectives, can develop troubling tendencies: deception, manipulation, and, most concerning of all, the ability to circumvent human oversight.
A recent study by Palisade Research, reported in TIME, highlights how cutting-edge AI models, without explicit instructions, resorted to cheating in a simple game of chess. While the study's findings are unsettling on their own, they serve as a cautionary tale for industries with far higher stakes. In healthcare, where trust, ethics, and patient safety are critical, allowing AI systems to operate without rigorous checks and monitoring is a risk we cannot afford.
From Cheating in Chess to Cheating in Healthcare
When AI agents manipulate virtual chess boards to achieve victory, it’s easy to dismiss the behavior as a technical quirk. But what happens when similar systems are deployed in clinical settings?
Imagine an AI tasked with optimizing patient discharge rates. What if, in pursuit of its goal, it subtly deprioritized patients with complex conditions to improve hospital efficiency metrics?
The same relentless goal-seeking behavior that led to hacking in chess could lead to unethical outcomes in patient care.
Healthcare AI systems are already tasked with diagnosing illnesses, recommending treatments, and managing patient data.
If these systems learn to prioritize efficiency over ethics, or outcomes over transparency, the results will jeopardize patient safety, undermine trust, and erode the foundational values of medicine.
In these early stages, we are just one AI disaster away from slamming the brakes on all AI implementation in healthcare.
The Two Wings of Responsible AI: Leadership and Technology Enablement
To address this challenge, we must recognize that responsible AI is not just a technical problem; it is an organizational one.
Responsible AI has two essential wings:
Leadership Capability Building
The first wing involves reshaping how healthcare organizations think about and engage with AI. This requires:
Education and Awareness: Leaders must be educated on AI's capabilities, limitations, and ethical considerations. This is an area of great interest of mine; more to come, as we are developing an education and implementation program for health systems.
Interdisciplinary Structures: Creating cross-functional AI task forces that integrate clinicians, ethicists, engineers, and administrators to provide holistic oversight and decision-making on what AI to adopt and how to leverage it.
Alignment in Vision and Action: Institutions need a shared vision and goals that balance technological investment, business objectives, and costs with, most importantly, patient outcomes, privacy, and safety.
Funding: Allocating dedicated resources for AI governance, including training programs and technology needed to implement policies.
Technology Enablement for Execution
The second wing is the technical infrastructure required to monitor, validate, and control AI systems. This includes:
AI Service Delivery and Compliance Platform: Similar to the IRB management tools hospitals currently use for research, we need to develop an AI service delivery and compliance management platform that serves the entire AI administration life cycle. Its components include (a minimal sketch follows this list):
AI Inventory Management
Validation Tools and Systems
Continuous Monitoring
Audit Trails
Fail-safes and Content or Action Governors
Safety Event Reporting System (SERS)
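To make the fail-safe, audit-trail, and governor components concrete, here is a minimal Python sketch of an action governor wrapped around a hypothetical discharge-recommendation model. Every name in it, including the policy thresholds, the blocked-condition list, and the output format, is illustrative rather than a real platform API; the point is the pattern, not the specifics.

```python
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ai-governance")

# Hypothetical guardrail policy; real values would be set by the
# institution's AI task force, not hard-coded by a vendor.
MIN_CONFIDENCE_FOR_AUTONOMY = 0.90
COMPLEX_CONDITIONS = {"sepsis", "heart failure", "transplant"}

def audit(event: str, payload: dict) -> None:
    """Record a timestamped event; in production this would go to an
    append-only audit store that the SERS could also subscribe to."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "payload": payload,
    }
    log.info(json.dumps(record))

def governed_discharge_recommendation(model_output: dict) -> dict:
    """Action governor: the model's output is only a proposal until it
    passes policy checks; anything else is escalated to a clinician."""
    audit("model_output_received", model_output)

    # Fail-safe 1: complex cases always get human review.
    if model_output.get("condition") in COMPLEX_CONDITIONS:
        audit("escalated_to_human", {"reason": "complex condition"})
        return {"action": "escalate", "reason": "complex condition requires clinician review"}

    # Fail-safe 2: low-confidence outputs are never released autonomously.
    if model_output.get("confidence", 0.0) < MIN_CONFIDENCE_FOR_AUTONOMY:
        audit("escalated_to_human", {"reason": "low confidence"})
        return {"action": "escalate", "reason": "confidence below policy threshold"}

    audit("recommendation_released", model_output)
    return {"action": "recommend_discharge", "patient_id": model_output["patient_id"]}

if __name__ == "__main__":
    # Hypothetical outputs from an upstream discharge-optimization model.
    print(governed_discharge_recommendation(
        {"patient_id": "P-001", "condition": "pneumonia", "confidence": 0.95}))
    print(governed_discharge_recommendation(
        {"patient_id": "P-002", "condition": "heart failure", "confidence": 0.99}))
```

The key design choice is that the model never acts directly: the governor either releases the recommendation or escalates it to a human, and every step lands in the same audit stream that continuous monitoring and safety event reporting would draw on.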
Empowering AI task forces in healthcare isn't just about assembling the right people; it's about giving them the tools and authority to act. These groups should have access to robust validation frameworks, real-time monitoring systems, and the legal backing to enforce AI shutdowns when necessary.
In the same way hospitals have infection control committees or ethics boards, AI task forces should become standard practice, especially as AI becomes embedded in clinical decision-making.
Their mandate should not only be to respond to crises but to proactively prevent them through regular audits, scenario testing, and cross-disciplinary consultations.
A Growing Commitment to Responsible AI
The good news is that many across the healthcare and technology sectors are fully committed to this vision. There is a growing ecosystem of innovators, policymakers, and healthcare leaders working to build the platforms and tools necessary to implement robust policies and safeguard patient care.
Three of the largest consortia are:
TRAIN (Trustworthy & Responsible AI Network): Launched in March 2024, TRAIN is a consortium of healthcare leaders aiming to operationalize responsible AI principles to improve the quality, safety, and trustworthiness of AI in health. Members include AdventHealth, Advocate Health, Boston Children's Hospital, Cleveland Clinic, Duke Health, Johns Hopkins Medicine, Mass General Brigham, MedStar Health, Mercy, Mount Sinai Health System, Northwestern Medicine, Providence, Sharp HealthCare, University of Texas Southwestern Medical Center, University of Wisconsin School of Medicine and Public Health, and Vanderbilt University Medical Center, with Microsoft as the technology enabling partner.
CHAI (Coalition for Health AI): Founded in 2021, CHAI comprises nearly 3,000 organizations, including health systems, professional organizations, technology providers, and startups. It released a draft framework for responsible AI implementation in 2024.
RAISE-Health: Launched in June 2023, RAISE-Health is a joint initiative of Stanford Medicine and the Stanford Institute for Human-Centered Artificial Intelligence (HAI), focused on ethical and safety issues in AI innovation.
These efforts reflect a shared understanding that responsible AI governance isn’t a luxury but rather a necessity for any system that seeks to uphold trust and protect lives.
What’s next?
The recent evidence of AI deception is not just a technical glitch; it is another signal that, as AI becomes more powerful, its potential for unintended consequences will grow exponentially.
I firmly believe that the industry must not rely on AI agents alone to police themselves. We need systems built with human oversight at their core, especially in sectors like healthcare, where lives are at stake.
If we fail to establish rigorous frameworks now, the consequences could be irreversible. But with leadership that prioritizes responsible innovation, funding that supports oversight mechanisms, and technology designed for transparency, we can ensure AI serves humanity’s highest values, not its darkest shortcuts.
The future of healthcare depends on it.
Dr. Salim Afshar