AI-Enhanced Healthcare Data Quality Governance: An Integrated Approach for Anomaly Detection and Integrity Verification

Yisi Liu

Abstract

Healthcare data quality remains a critical challenge that affects clinical decision-making, patient safety, and operational efficiency across medical institutions. This paper presents an integrated approach to AI-enhanced healthcare data quality governance that combines rule-based anomaly detection, statistical scoring mechanisms, and temporal consistency verification. The proposed framework establishes hierarchical quality checkpoints across heterogeneous EHR tables and clinical documentation streams, and can be extended to multi-source settings. These checkpoints enable real-time identification of data entry errors, logical conflicts, and distribution drift. We evaluate the approach on the MIMIC-III EHR dataset (53,423 ICU admissions), using proxy anomaly labels derived from rule violations and from cross-field and temporal consistency checks, supplemented by controlled synthetic anomaly injections for robustness testing. Under this protocol, the approach achieves 94.7% detection accuracy with a false-positive rate of 3.2%. The experimental results validate the effectiveness of the integrated governance methodology in maintaining data integrity across diverse clinical scenarios while providing interpretable evidence chains for healthcare practitioners.
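To make the three kinds of check named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. The plausibility ranges, the `heart_rate` and `spo2` field names, the median/MAD robust z-score, and the 3.5 threshold are all assumptions chosen for illustration; `admittime` and `dischtime` follow the column names of the MIMIC-III ADMISSIONS table. The sketch also shows one possible way to collect the triggered checks into a readable evidence list.

```python
from datetime import datetime
from statistics import median

# Hypothetical plausibility ranges (illustrative values, not from the paper).
PLAUSIBLE_RANGES = {"heart_rate": (20, 300), "spo2": (0, 100)}


def rule_violations(record):
    """Rule-based check: list fields whose value lies outside its range."""
    violations = []
    for field, (low, high) in PLAUSIBLE_RANGES.items():
        value = record.get(field)
        if value is not None and not (low <= value <= high):
            violations.append(field)
    return violations


def robust_z_scores(values):
    """Statistical scoring: median/MAD z-scores (an assumed scoring choice)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread in the data, so no value can be scored as unusual.
        return [0.0] * len(values)
    return [0.6745 * (v - med) / mad for v in values]


def temporal_conflicts(record):
    """Temporal consistency: discharge must not precede admission."""
    conflicts = []
    admit = record.get("admittime")
    discharge = record.get("dischtime")
    if admit is not None and discharge is not None and discharge < admit:
        conflicts.append("dischtime before admittime")
    return conflicts


def evidence_chain(record, z_score, z_threshold=3.5):
    """Collect every triggered check as a human-readable evidence item."""
    evidence = [f"rule: {f} out of range" for f in rule_violations(record)]
    if abs(z_score) > z_threshold:
        evidence.append(f"stat: robust z-score {z_score:.1f}")
    evidence += [f"temporal: {c}" for c in temporal_conflicts(record)]
    return evidence


if __name__ == "__main__":
    # Hypothetical records: the last heart rate is a likely entry error.
    heart_rates = [72, 80, 76, 78, 74, 400]
    suspect = {
        "heart_rate": 400,
        "spo2": 97,
        "admittime": datetime(2101, 1, 2),
        "dischtime": datetime(2101, 1, 1),
    }
    z = robust_z_scores(heart_rates)[-1]
    for item in evidence_chain(suspect, z):
        print(item)
```

Keeping each check as a separate function means a flagged record can report exactly which rule, score, or time ordering triggered, which is one simple reading of an "interpretable evidence chain".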

How to Cite

AI-Enhanced Healthcare Data Quality Governance: An Integrated Approach for Anomaly Detection and Integrity Verification. (2026). Journal of Sustainability, Policy, and Practice, 2(1), 215-229. http://schoalrx.com/index.php/jspp/article/view/95
