Abstract
The rapid integration of AI into high-stakes decision-making has outpaced traditional mechanisms for human oversight and accountability, leaving leaders without clear guidance on how to use algorithmic systems responsibly. To address this gap, we conducted a comparative qualitative study of four landmark AI deployments: the UK A-Level grading algorithm used during the COVID-19 pandemic, Amazon’s automated hiring tool, the COMPAS recidivism risk score in the U.S. criminal justice system, and the Dutch SyRI welfare-fraud detection system. Drawing on 61 publicly available government reports, internal memos, and media articles, we applied a two-phase grounded-theory coding process in NVivo, producing a 32-item codebook and achieving substantial inter-coder reliability. We then quantified thematic occurrences across 110 coded segments and conducted chi-square tests to assess whether themes were applied consistently across cases. Our analysis yielded four actionable principles: (1) Intentionality: leaders must consciously elect to involve AI rather than default to automation; (2) Interpretability: systems should provide accessible explanations that support bias detection and decision justification; (3) Moral Authorship: human actors must explicitly claim ultimate responsibility for outcomes; and (4) Justice: delegation structures must be designed to prevent the perpetuation of existing inequities. Together, these principles form a reproducible analytical roadmap and offer practical guidance for accountable AI governance in high-stakes contexts.
Keywords