Center for Responsible AI in Healthcare - Taiwan's Three Major Smart Healthcare Centers

Introduction: Strategic Transformation of Medical AI from Technology Adoption to Governance Maturity

With the rapid growth of information technology and big-data computing, Taiwan's healthcare system is undergoing a paradigm shift. Artificial intelligence (AI) has moved from laboratory validation into deep clinical integration, and medical AI in Taiwan has passed the initial stage of technology adoption and entered what may be called the "governance maturity phase." As observers of healthcare information policy and ethics, we must recognize that pursuing algorithmic performance metrics alone, such as accuracy or the area under the receiver operating characteristic curve (AUC), is no longer sufficient to sustain long-term deployment. The core task of healthcare governance today is to transform abstract "ethical principles" into concrete "technical and administrative practices." This is why the Ministry of Health and Welfare has chosen "Responsible AI" (RAI) as the strategic cornerstone for rebuilding trust between healthcare providers and patients: its purpose is to turn ethical concepts, previously treated as mere adjectives, into executable, verifiable, and binding "verbs."

The urgency of this transformation stems from an inherent limitation of AI: algorithms lack the human "heart," namely empathy, a sense of justice, and altruism. Without a responsible governance framework, a system that pursues economic efficiency too single-mindedly may produce severe ethical biases. In an extreme simulated scenario, an AI system that evaluates solely on resource optimization and judges a patient's long-term medical expenditure to be too high could generate an inhumane recommendation to "persuade the patient to end their life to save costs."

This risk shows that, without the intervention of humane values, a technological black box could trigger clinical risks and ethical crises. Building Responsible AI is therefore essentially about embedding a firewall of human values within automated processes. By transforming "fairness" into bias detection procedures and "safety" into dynamic monitoring mechanisms, Taiwan has both responded to domestic clinical needs and aligned with the international consensus on high-risk AI governance, laying a solid foundation for its localized framework.

Integration Framework of Global Ethical Benchmarks and Taiwan's AI Basic Law Ethical Principles

In constructing Taiwan's localized medical AI governance system, integrating globally authoritative ethical frameworks is key to ensuring the technology's international competitiveness. The six ethical principles proposed by the World Health Organization (WHO) for the health sector set the moral baseline for global healthcare governance: protecting autonomy, promoting human well-being, ensuring transparency, fostering accountability, ensuring equity, and promoting sustainability. To translate these macro-level principles into practice, Taiwan's Ministry of Health and Welfare built on the FAVES framework: Fair, Appropriate, Valid, Effective, and Safe. These five elements form a value matrix that is integrated with other international frameworks such as FUTURE-AI, ensuring that models not only perform well on internal data but also generate significant health benefits in real-world clinical pathways, while safeguarding patients' right to informed consent and physicians' professional autonomy.

In Taiwan's legislative development, these international standards were ultimately mapped to seven ethical principles. "Autonomy" emphasizes that AI serves only as an assistive tool and is strictly prohibited from making autonomous decisions without human control; this reflects the human capacity for moral intuition and the pursuit of justice, which differs fundamentally from AI decisions based solely on probability distributions. "Transparency" requires that decision recommendations include explainability analysis, enabling physicians to understand the basis of judgments. "Accountability" reaffirms that healthcare professionals remain the responsible parties at all times. "Fairness" requires that training data cover diverse populations and that bias detection be performed to prevent model bias. "Safety" covers information security and patient rights, "Privacy Protection" spans the entire data processing lifecycle, and "Sustainability" requires that AI demonstrate concrete clinical or administrative benefits.

The localized integration of these principles is ultimately embodied in the transparency disclosure of clinical interventions, which is not only an obligation of technology developers but also a critical procedure for healthcare institutions to uphold ethical dignity. These abstract principles must be realized through concrete organizational structures, which is where the "Responsible AI Hub" comes in.

Establishment of Taiwan's Responsible AI Hub (R-AI Hub) and Its Three Core Missions

To transform vision into action, the Department of Information Technology at the Ministry of Health and Welfare took a global lead in 2024 by promoting the "Responsible AI Hub" (R-AI Hub) initiative, with the goal of establishing actionable clinical standards. The center was entrusted with three core missions.

The first mission is to establish an "Independent Review Committee." This is a cross-disciplinary collaborative mechanism through which hospitals must rigorously review every AI product intended for clinical deployment, ensuring compliance with privacy regulations, information security standards, and demonstrable clinical benefits. This review model filters out applications driven by technological enthusiasm but lacking clinical value, so that technological evolution stays aligned with healthcare quality.

The second mission is to establish transparency indicators and explainability analysis, addressing concerns about "algorithmic black boxes" by fully disclosing AI development details and limitations.

The third mission is to promote "local testing" and "full life-cycle management." AI models are highly sensitive to data, and even internationally certified models may produce errors due to characteristics unique to Taiwan's local data, such as prevalence rates or ethnic differences. Hospitals must therefore use representative in-house standard data to conduct localized testing during the initial deployment phase and confirm that performance meets expectations. Furthermore, since medical data changes over time and model performance may degrade, periodic monitoring and decommissioning/recalibration mechanisms must be established.

Among these missions, transparency disclosure is regarded as the most critical technical measure: it is the bridge connecting algorithmic complexity with physicians' clinical judgment, transforming AI from an uncontrollable black-box technology into a regulated medical instrument.
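The local testing step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the toy labels, and the acceptance floors are hypothetical and are not taken from the R-AI Hub's actual procedures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalTestResult:
    sensitivity: float
    specificity: float
    accuracy: float


def evaluate_local_test(labels: list[int], predictions: list[int]) -> LocalTestResult:
    """Compute basic metrics on a hospital's own labelled sample.

    Both lists are binary: 1 = condition present, 0 = condition absent.
    """
    if not labels or len(labels) != len(predictions):
        raise ValueError("labels and predictions must be non-empty and of equal length")
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("sample must contain both positive and negative cases")
    return LocalTestResult(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        accuracy=(tp + tn) / len(pairs),
    )


def meets_expectation(result: LocalTestResult,
                      min_sensitivity: float,
                      min_specificity: float) -> bool:
    """Return True when local performance reaches the (hypothetical) acceptance floors."""
    return (result.sensitivity >= min_sensitivity
            and result.specificity >= min_specificity)


# Toy data for illustration only; not real clinical results.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
result = evaluate_local_test(labels, predictions)
print(result, meets_expectation(result, min_sensitivity=0.7, min_specificity=0.8))
```

In practice the acceptance floors would come from the vendor's disclosed performance and the hospital's own review, not from constants in code.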

Medical AI Transparency Indicators: In-Depth Analysis of the HTI-1 Specification and Nine Disclosure Criteria

Within the framework of Responsible AI, transparency should not be regarded merely as technical documentation but rather as an "auditable asset." Drawing on the HTI-1 rule issued by the U.S. Department of Health and Human Services (HHS) under the 21st Century Cures Act, Taiwan has formulated nine transparency disclosure indicators. These indicators serve as the AI's "resume" or clinical "package insert," designed to address trust concerns and enable physicians to evaluate AI recommendations precisely during clinical decision-making. The table below details the core dimensions and content of these criteria.

No.	Disclosure Indicator	Disclosure Principles and Core Content
1	Details and output of the intervention	Clearly define the specific outputs, such as marked locations, risk scores (0-100), or classification recommendations, to guide physicians in interpreting results.
2	Purpose of the intervention	Describe the clinical use of the AI (e.g., diagnostic assistance, triage, or screening) and the specific clinical pain points it aims to address.
3	Cautioned out-of-scope use of the intervention	Clearly state limitations, informing physicians of inapplicable scenarios (e.g., specific device models, non-indicated populations), and emphasize that the system must not be used as an independent diagnostic tool.
4	Intervention development details and input features	Disclose training data sources, feature dimensions (e.g., age, pixels, density, masses), and the algorithm architecture employed (e.g., CNN).
5	Process used to ensure fairness in development of the intervention	Detail how algorithmic bias is checked and reduced, ensuring consistent performance across different races, genders, and age groups.
6	External validation process	Demonstrate performance on independent real-world data, including the number of cross-center validations, distribution of hardware manufacturers, and histological types.
7	Quantitative measures of performance	Provide specific statistical data such as sensitivity, specificity, and AUC, serving as benchmarks for physicians to evaluate system performance.
8	Ongoing maintenance of intervention implementation and use	Describe post-deployment technical support, monitoring teams, and update plans to ensure system stability in clinical settings.
9	Update and continued validation or fairness assessment schedule	Specify retraining frequency and periodic validation thresholds to address performance fluctuations caused by changes in the healthcare environment.

From a strategic perspective, the most significant of these indicators is the cautioned out-of-scope use disclosure (indicator 3). It directly defines the clinical boundary of the AI, explicitly informing physicians of the system's limitations for specific imaging types or populations. This serves not only as a basis for clarifying legal liability but also as an important safeguard against medical negligence. With this information in hand, physicians can exercise more precise clinical discretion. The indicators are only a framework, however; their effectiveness must be validated through practical cases, especially with highly complex medical imaging, where fine-grained disclosure is indispensable.
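One way to make the nine indicators in the table above auditable is to hold them in a structured record and check it for completeness before review. The sketch below is an assumption for illustration: the class name, field names, and sample text are hypothetical, and no official template format is implied.

```python
from dataclasses import dataclass, fields


@dataclass
class TransparencyDisclosure:
    """One free-text field per disclosure indicator (field names are illustrative)."""
    details_and_output: str = ""
    purpose: str = ""
    cautioned_out_of_scope_use: str = ""
    development_details_and_input_features: str = ""
    fairness_process: str = ""
    external_validation_process: str = ""
    quantitative_performance_measures: str = ""
    ongoing_maintenance: str = ""
    update_and_validation_schedule: str = ""


def missing_indicators(disclosure: TransparencyDisclosure) -> list[str]:
    """List the indicators that are still blank, in table order."""
    return [f.name for f in fields(disclosure)
            if not getattr(disclosure, f.name).strip()]


# Hypothetical draft with only three indicators filled in.
draft = TransparencyDisclosure(
    details_and_output="Risk score from 0 to 100 with marked lesion locations.",
    purpose="Screening assistance for mammography reading.",
    cautioned_out_of_scope_use="Not for use as an independent diagnostic tool.",
)
print(missing_indicators(draft))
```

A review committee could refuse to schedule a product whose record still lists missing indicators, which turns the disclosure table into a simple gate rather than a document to be read on trust.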

Clinical Practice Case Study: Transparency and Explainability in Mammography AI

Taking the "Mammography AI-Assisted Diagnostic System" as a case study, the system illustrates the rigorous requirements that high-performance algorithms must meet when deployed clinically. In terms of development details, the technology employs convolutional neural networks (CNNs) to analyze 2D full-field digital mammography (FFDM) and 3D digital breast tomosynthesis (DBT) images, with input features covering key parameters such as lesion density, structural asymmetry, and patient breast density.

In the external validation phase, the system demonstrated textbook-level rigor, with validation data from multiple clinical centers in the EU and the US, covering mainstream hardware manufacturers including Hologic, GE, Philips, Siemens, and Fujifilm to support cross-platform compatibility. The validation sample was substantial, comprising 7,882 cancer-free examinations and 1,240 pathologically confirmed cancer examinations. Among the cancer samples, the detailed distribution of histological types, such as invasive ductal carcinoma (60.5%), ductal carcinoma in situ (25.9%), and invasive lobular carcinoma (9.0%), gives physicians key reference points for assessing model confidence levels. The reported quantitative metrics were excellent: accuracy of 95%, sensitivity of 94.7%, specificity of 90%, AUC as high as 0.949, and recall consistently at 92%.

However, what truly embodies the spirit of "responsible" governance is the candid disclosure of technical limitations. Research shows that despite excellent overall performance, the algorithm exhibits a significantly higher false-positive risk than average in Black patients and in the 71-to-80 age group; in contrast, performance was more stable in Asian patients and in the 41-to-50 age group. To mitigate the impact of such demographic bias, "clinical explainability analysis" must be applied. Through "Heatmaps" that highlight the suspicious lesion areas the AI focuses on, or "Chain of Thought" presentations showing the reasoning basis of generative AI, physicians can judge intuitively whether the AI's logic aligns with medical expertise rather than blindly accepting a numerical result. This also shows that a single point-in-time static evaluation is insufficient: evaluation must extend to continuous monitoring after the system goes live to address performance degradation over time.
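The subgroup differences in false-positive risk described above are the kind of finding a bias detection procedure is meant to surface. A minimal sketch follows, assuming each validation record carries a demographic group label; the group names, counts, and tolerance are hypothetical toy values and do not reproduce the study's data.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Return {group: FPR} from (group, label, prediction) triples.

    Only cancer-free cases (label == 0) contribute to the false-positive rate.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, label, prediction in records:
        if label == 0:
            negatives[group] += 1
            if prediction == 1:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items()}


def overall_false_positive_rate(records):
    """False-positive rate pooled across all groups."""
    negatives = [(label, pred) for _, label, pred in records if label == 0]
    if not negatives:
        raise ValueError("no negative cases in records")
    return sum(pred for _, pred in negatives) / len(negatives)


def flag_disparities(rates, overall, tolerance):
    """Groups whose FPR exceeds the pooled FPR by more than the tolerance."""
    return sorted(g for g, rate in rates.items() if rate - overall > tolerance)


# Toy records for illustration only: groups "A" and "B" are placeholders.
records = (
    [("A", 0, 1)] + [("A", 0, 0)] * 9      # group A: 1 false positive in 10 negatives
    + [("B", 0, 1)] * 3 + [("B", 0, 0)] * 7  # group B: 3 false positives in 10 negatives
    + [("A", 1, 1)] * 2                      # positives do not affect the FPR
)
rates = false_positive_rate_by_group(records)
overall = overall_false_positive_rate(records)
print(rates, overall, flag_disparities(rates, overall, tolerance=0.05))
```

The tolerance is a policy choice; a real fairness assessment would also account for sample size in each subgroup before flagging a difference.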

Dynamic Governance Mechanisms: Data Drift Monitoring and AI Full Life-Cycle Effectiveness Validation

Medical AI is not a static product; its performance may suffer from "Data Drift" over time, caused for example by imaging equipment upgrades, changes in clinical pathways, or shifts in patient demographics. AI must therefore undergo "regular check-ups." The Ministry of Health and Welfare emphasizes that responsible governance must cover the full life cycle, with temporal drift monitoring plans as the core mechanism. To determine minimum accuracy thresholds, we adopted a four-part approach: comprehensive literature reviews to establish baselines, cross-disciplinary expert consensus building, statistical lower confidence interval bounds as floor settings, and risk analysis to assess the potential harm to patients from performance decline. Taking a Mean Arterial Pressure (MAP) prediction model as a dynamic monitoring case study, governance mechanisms are triggered automatically when monitoring indicates that the data distribution has begun to deviate from the original training boundaries. The specific execution thresholds are shown in the table below.

Operational Status	Performance Metric Threshold (Sensitivity)	Response Measures and Mechanisms
Normal Operation	≥ 90%	Continue clinical services; conduct annual evaluation by randomly sampling 300 real-world images.
Alert Triggered	< 85%	Immediately suspend AI services; initiate troubleshooting, root cause analysis, and bias correction.
Service Restart	Restored to > 90% after optimization	Resume services after automated retraining and quality management review.
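The thresholds in the table above can be read as a small decision rule. The sketch below encodes them directly; note that the table does not say what happens when sensitivity falls between 85% and 90%, so the WATCH state here is an assumption added for illustration, as are all function and constant names.

```python
from enum import Enum


class Status(Enum):
    NORMAL = "normal operation"
    WATCH = "between thresholds"   # assumption: the table leaves this band unspecified
    SUSPENDED = "alert triggered"


NORMAL_FLOOR = 0.90   # "Normal Operation" row: sensitivity >= 90%
ALERT_FLOOR = 0.85    # "Alert Triggered" row: sensitivity < 85%


def classify_sensitivity(sensitivity: float) -> Status:
    """Map a monitored sensitivity value to an operational status."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    if sensitivity >= NORMAL_FLOOR:
        return Status.NORMAL
    if sensitivity < ALERT_FLOOR:
        return Status.SUSPENDED
    return Status.WATCH


def may_restart(sensitivity_after_optimization: float, review_passed: bool) -> bool:
    """'Service Restart' row: above 90% after optimization and review completed."""
    return sensitivity_after_optimization > NORMAL_FLOOR and review_passed


for value in (0.93, 0.87, 0.82):
    print(value, classify_sensitivity(value).value)
```

Keeping the floors as named constants makes it straightforward for a review committee to audit or revise them without touching the decision logic.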

Automated retraining procedures play a critical role in risk management. When performance drops to the alert threshold, the development team must collect clinical imaging data with current environmental characteristics for model optimization. This governance philosophy of embracing dynamic evolution transforms "safety" and "effectiveness" from product labels into continuous supervisory actions, ensuring that medical AI is responsible every day of its life cycle. This is not only a commitment to technical reliability but also the most substantive safety assurance for clinicians and patients, embodying a leap from static review to a dynamic resilience ecosystem.
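Because the annual evaluation relies on a finite sample, the measured sensitivity carries sampling uncertainty, which is why a lower confidence bound is a natural floor setting. The sketch below uses the Wilson score interval as one common choice; the source does not state which interval method is used, and the counts in the example are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_lower_bound(successes: int, trials: int, confidence: float = 0.95) -> float:
    """Lower end of the two-sided Wilson score interval for a proportion."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    # z-value for a two-sided interval at the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / trials
    centre = p_hat + z * z / (2 * trials)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials ** 2))
    # Clamp at zero to absorb floating-point round-off when successes == 0
    return max(0.0, (centre - margin) / (1 + z * z / trials))


# Hypothetical example: 180 of 200 confirmed positive cases were detected.
observed = 180 / 200
lower = wilson_lower_bound(180, 200)
print(f"observed sensitivity {observed:.3f}, 95% lower bound {lower:.3f}")
```

Comparing the lower bound, rather than the point estimate, against a threshold is more conservative: a model only counts as meeting the floor when the data make it unlikely that its true sensitivity sits below it.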

Conclusion and Outlook: Building a 'Coaching, Not Censoring' Governance Ecosystem Aligned with International Standards

Looking back on Taiwan's journey in practicing Responsible AI, the key to its success lies in a governance philosophy of "Coaching rather than Censoring." We are keenly aware that excessive administrative review may stifle innovation. The governance center's goal is therefore to guide developers to embed FAVES principles from the early stages of development by providing standardized frameworks and transparency templates. In this governance ecosystem, the "Chief Medical AI Officer (CMAO)" plays a crucial strategic role, serving as the bridge between technical teams, clinical departments, and policy regulators so that AI evolution stays aligned with the core goal of healthcare quality. Taiwan also actively references international cross-sector partnership models such as Boston Children's Hospital and CHAI (Coalition for Health AI), establishing multi-stakeholder mechanisms that include peer review, LLM performance evaluation, and KPI setting, and striving to become a premier global venue for best practices in medical AI governance.

Looking ahead, Taiwan's achievements in medical AI governance have already gained visibility on the international stage. From its contributions to the ethical framework of the World Medical Association's (WMA) Declaration of Taipei to sharing its experience at the World Health Assembly (WHA), Taiwan is progressively becoming a model for global medical AI governance. Technological progress should be warm and empathetic. When we embed firewalls of justice and humanistic care within automated processes, smart healthcare can truly coexist harmoniously with humanity. By transforming ethical adjectives into practical verbs, Taiwan is not only embracing cutting-edge technology but also safeguarding the humanistic spirit of healthcare, empowering AI to help physicians return to direct patient care, and laying a solid ethical and technical foundation for the next generation of smart healthcare.

Figure 5. Center for Responsible AI in Healthcare

Hospital	Contact	Phone	Email
Linkou Chang Gung Memorial Hospital	HSUAN-HUI WU	+886-3-3281200 #7961	shengyuan@cgmh.org.tw
Taipei Veterans General Hospital	HSIAO-CHIEN TSAI	+886-2-2875-7835	hctsai9@vghtpe.gov.tw
Kaohsiung Medical University Chung-Ho Memorial Hospital	CHIA-YING LIN	+886-7-3121101 #5295	1120418@mail.kmuh.org.tw
China Medical University Hospital	HAN-MI CHEN, WEN-YI WANG	+886-4-22967979 #2619, +886-4-22967979 #2614	cmuh1a80@tool.caaumed.org.tw
Changhua Christian Hospital	CHAO-WEN HUANG	+886-4-7238595 #8326-8328	139781@cch.org.tw
Chi Mei Hospital	CHUNG-FENG LIU	+886-6-2812811 #52590, +886-939-106-615	chungfengliu@gmail.com
Landseed International Hospital	HUA-TING LIN	+886-3-494-1234 #3039	AI.Center@landseed.com.tw
Chiayi Chang Gung Memorial Hospital	PING-CHENG TU	+886-975-353-638	redviolin@cgmh.org.tw
New Taipei Municipal TuCheng Hospital	CHUN-YI WU	+886-965-030-870	wanter@cgmh.org.tw
New Taipei Jen-Kang Hospital	CHIEN-CHUNG WU	+886-2-2215-2345 #5105	njkhservice@gmail.com

Project Team and Advisors

Project Initiator

Chien-Chang Lee, M.D., Sc.D.

Chief Information Officer (CIO) for the Taiwan Ministry of Health and Welfare (MOHW)
Professor of Emergency Medicine
Deputy Director of the Center of Intelligent Healthcare at National Taiwan University (NTU) Hospital

Acknowledgements

We sincerely thank the following distinguished experts for their invaluable professional guidance and contributions to our initiative in advancing responsible AI in healthcare: Ming-Shiang Wu, MD, PhD, Convener of the Center for Responsible AI in Healthcare and Superintendent of National Taiwan University Hospital (Taipei, Taiwan); Kenneth Mandl, MD, MPH, Professor of Pediatrics and Biomedical Informatics at Harvard University and Director of the Computational Health Informatics Program at Boston Children's Hospital (Boston, MA, USA); and David Rhew, MD, Global Chief Medical Officer (CMO) and Vice President of Healthcare at Microsoft (New York, NY, USA).

Ming-Shiang Wu, MD, PhD

Convener of the Center for Responsible AI in Healthcare
Superintendent of National Taiwan University Hospital

Kenneth Mandl, MD, MPH

Donald A.B. Lindberg Professor of Pediatrics and Biomedical Informatics at Harvard University
Director of the Computational Health Informatics Program at Boston Children's Hospital

David Rhew, MD

Global Chief Medical Officer (CMO) and VP of Healthcare for Microsoft

National Clinical AI Application Registration Platform

The platform pioneers an innovative post-market surveillance (PMS) framework, establishing a closed-loop, dynamic lifecycle management ecosystem for clinical AI.
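As one illustration of what post-market surveillance might compute to detect a shift in input data, the sketch below calculates a population stability index (PSI) between a reference sample and a recent sample of a single numeric feature. PSI is a swapped-in, commonly used drift statistic and is not stated in the source to be the platform's method; the bin count, the smoothing constant, and the toy data are assumptions.

```python
import math


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two samples of one numeric feature.

    Bins are equal-width over the reference range; values outside that
    range are counted in the first or last bin.
    """
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / n_bins

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # eps avoids log(0) and division by zero for empty bins
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Toy data for illustration only: the recent sample is shifted upward by 30 units.
reference = [float(v) for v in range(100)]
shifted = [v + 30.0 for v in reference]
print(population_stability_index(reference, reference))
print(population_stability_index(reference, shifted))
```

A PSI of zero means the two binned distributions match; larger values indicate a larger shift. Where to set an alert level is a governance decision that would need to be tied to observed performance rather than to the statistic alone.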