This is part 2 of a two-part series sharing Op-Eds I wrote for my Law 432.D course titled “Accountable Computer Systems.” This blog will likely go up on the course website in the near future but as I am hoping to speak to and reference things I have written for a presentations coming up, I am sharing here, first. This blog discusses the hot topic of ‘humans in the loop’ for automated decision-making systems [ADM]. As you will see from this Op-Ed, I am quite critical of our current Canadian Government self-regulatory regime’s treatment of this concept.
As a side note, there’s a fantastic new resource called TAG (Tracking Automated Government) that I would suggest those researching this space add to their bookmarks. I found it on X/Twitter through Professor Jennifer Raso’s post. For those that are also more new to the space or coming from it through immigraiton, Jennifer Raso’s research on automated decision-making, particularly in the context of administrative law and frontline decision-makers is exceptional. We are leaning on her research as we develop our own work in the immigration space.
Without further ado, here is the Op-Ed.
The DADM’s Noticeable Silence: Clarifying the Human Role in the Canadian Government’s Hybrid Decision-Making Systems[i]
Who are the humans involved in hybrid automated decision-making (“ADM”)? Are they placed into the system (or loop) to provide justification for the machine’s decisions? Are they there to assume legal liability? Or are they merely there to ensure humans still have a job to do?
Effectively regulating hybrid ADM systems requires an understanding of the various roles played by the humans in the loop and clarity as to the policymaker’s intentions when placing them there. This is the argument made by Rebecca Crootof et al. in their article, “Humans in the Loop” recently published in the Vanderbilt Law Review.[ii]
In this Op-Ed, I discuss the nine roles that humans play in hybrid decision-making loops as identified by Crootof et al. I then turn to my central focus, reviewing Canada’s Directive on Automated Decision-Making (“DADM”)[iii] for its discussion of human intervention and humans in the loop to suggest that Canada’s main Government self-regulatory AI governance tool not only falls short, but supports an approach of silence towards the role of humans in Government ADMs.
What is a Hybrid Decision-Making System? What is a Human in the Loop?
A hybrid decision-making system is one where machine and human actors interact to render a decision.[iv]
The oft-used regulatory definition of humans in the loop is “an individual who is involved in a single, particular decision made in conjunction with an algorithm.[v] Hybrid systems are purportedly differentiable from “human off the loop” systems, where the processes are entirely automated and humans have no ability to intervene in the decision.[vi]
Crootof et al. challenges the regulatory definition and understanding, labelling it as misleading as its “focus on individual decision-making obscures the role of humans everywhere in ADMs.”[vii] They suggest instead that machines themselves cannot exist or operate independent from humans and therefore that regulators must take a broader definition and framework for what constitutes a system’s tasks.[viii] Their definition concludes that each human in the loop, embedded in an organization, constitutes a “human in the loop of complex socio-technical systems for regulators to target.”[ix]
In discussing the law of the loop, Crootof et al. expresses the numerous ways in which the law requires, encourages, discourages, and even prohibits humans in the loop. [x]
Crootof et al. then labels the MABA-MABA (Men Are Better At, Machines Are Better At) trap,[xi] a common policymaker position that erroneously assumes the best of both worlds in the division of roles between humans and machines, without consideration how they can also amplify each other’s weaknesses.[xii] Crootof et al. finds that the myopic MABA-MABA “obscures the larger, more important regulatory question animating calls to retain human involvement in decision-making.”
As Crootof et al. summarizes:
“Namely, what do we want humans in the loop to do? If we don’t know what the human is intended to do, it’s impossible to assess whether a human is improving a system’s performance or whether regulation has accomplished its goals by adding a human”[xiii]
Crootof et al.’s Nine Roles for Humans in the Loop and Recommendations for Policymakers
Crootof sets out nine, non-exhaustive but illustrative roles for humans in the loop. These roles are: (1) corrective; (2) resilience; (3) justificatory; (4) dignitary; (5) accountability; (6) Stand-In; (7) Friction; (8) Warm-Body; and (9) Interface.[xiv] For ease of summary, they have been briefly described in a table attached as an appendix to this Op-Ed.
Crootof et al. discusses how these nine roles are not mutually exclusive and indeed humans can play many of them at the same time.[xv]
One of Crootof et al.’s three main recommendations is that policymakers should be intentional and clear about what roles the humans in the loop serve.[xvi] In another recommendation they suggest that the context matters with respect to the role’s complexity, the aims of regulators, and the ability to regulate ADMs only when those complex roles are known.[xvii]
Applying this to the EU Artificial Intelligence Act (as it then was[xviii]) [“EU AI Act”], Crootof et al. is critical of how the Act separates the human roles of providers and users, leaving nobody responsible for the human-machine system as a whole.[xix] Crootof et al. ultimately highlights a core challenge of the EU AI Act and other laws – how to “verify and validate that the human is accomplishing the desired goals” especially in light of the EU AI Act’s vague goals.
Having briefly summarized Crootof et al.’s position, the remainder of this Op-Ed ties together a key Canadian regulatory framework, the DADM’s, silence around this question of the human role that Crootof et al. raises.
The Missing Humans in the Loop in the Directive on Automated Decision-Making and Algorithmic Impact Assessment Process
Directive on Automated Decision-Making
Canada’s DADM and its companion tool, the Algorithmic Impact Assessment (“AIA”), are soft-law[xx] policies aimed at ensuring that “automated decision-making systems are deployed in a manner that reduces risks to clients, federal institutions and Canadian Society and leads to more efficient, accurate, and interpretable decision made pursuant to Canadian law.”[xxi]
One of the areas addressed in both the DADM and AIA is that of human intervention in Canadian Government ADMs. The DADM states:[xxii]
Ensuring human intervention
6.3.11
Ensuring that the automated decision system allows for human intervention, when appropriate, as prescribed in Appendix C.
6.3.12
Obtaining the appropriate level of approvals prior to the production of an automated decision system, as prescribed in Appendix C.
Per Appendix C of the DADM, the requirement for a human in the loop depends on the self-assessed impact level scoring system to the AIA by the agency itself. For level 1 and 2 (low and moderate impact)[xxiii] projects, there is no requirement for a human in the loop, let alone any explanation of the human intervention points (see table below extracted from the DADM).
I would argue that to avoid explaining further about human intervention, which would then engage explaining the role of the humans in making the decision, it is easier for the agency to self-assess (score) a project as one of low to moderate impact. The AIA creates limited barriers nor a non-arms length review mechanism to prevent an agency strategically self-scoring a project below the high impact threshold.[xxiv]
Looking at the published AIAs themselves, this concern of the agency being able to avoid discussing the human in the loop appears to play out in practice.[xxv] Of the fifteen published AIAs, fourteen of them are self-declared as moderate impact with only one declared as little-to-no impact. Yet, these AIAs are situated in high-impact areas such as mental health benefits, access to information, and immigration.[xxvi] Each of the AIAs contain the same standard language terminology that a human in the loop is not required.[xxvii]
In the AIA for the Advanced Analytics Triage of Overseas Temporary Resident Visa Applications, for example, IRCC further rationalizes that “All applications are subject to review by an IRCC officer for admissibility and final decision on the application.”[xxviii] This seems to engage that a human officer plays a corrective role, but this not explicitly spelled out. Indeed, it is open to contestation from critics who see the Officer role as more as a rubber-stamp (dignitary) role subject to the influence of automation bias.[xxix]
Recommendation: Requiring Policymakers to Disclose and Discuss the Role of the Humans in the Loop
While I have fundamental concerns with the DADM itself lacking any regulatory teeth, lacking the input of public stakeholders through a comment and due process challenge period,[xxx] and driven by efficiency interests,[xxxi] I will set aside those concerns for a tangible recommendation for the current DADM and AIA process.[xxxii]
I would suggest that beyond the question around impact, in all cases of hybrid systems where a human will be involved in ADMs, there needs to be a detailed explanation provided by the policymaker of what roles these humans will play. While I am not naïve to the fact that policymakers will not proactively admit to engaging a “warm body” or “stand-in” human in the loop, it at least starts a shared dialogue and puts some onus on the policymaker to both consider proving, but also disproving a particular role that it may be assigning.
The specific recommendation I have is to require as part of an AIA, a detailed human capital/resources plan that requires the Government agency to identify and explain the roles of the humans in the entire ADM lifecycle, from initiation to completion.
This idea also seems consistent with best practices in our key neighbouring jurisdiction, the United States. On 28 March 2024, a U.S. Presidential Memorandum aimed at Federal Agencies titled “Advancing Governance, Innovation, and Risk Management for Agency Use of Artificial Intelligence aimed at U.S. Federal Agencies was released. The Presidential Memorandum devotes significant attention to defining and assigning specific roles for humans in the AI governance structure.[xxxiii]
This level of granularity and specificity is in sharp contrast and direct counterpoint to the lack of discussion of the roles of humans found in the DADM and AIA.[xxxiv]
Conclusion: The DADM’s Flexibility Supports Incorporating Changes
Ultimately, proactive disclosure and a more fulsome discussion of human roles will increase transparency and an added layer of accountability for ADMs that regulators can hold Government policymakers to, and at worst encourage them to proactively discuss.
Luckily, and as a positive note to end this Op-Ed, the DADM is not a static document. It undergoes regular reviews every two-years.[xxxv]
The malleability of soft law and the openness of the Treasury Board Secretariat to engage in dialogue with stakeholders,[xxxvi] allows for a policy gap such as the one I have identified to be addressed quicker than if it were found in legislative concerns.
While the present silence around the role of humans in the DADM is disconcerting, it is by no means to late to add human creativity and criticalness, in a corrective role, to this policy-design loop.
Appendix: Crootof’s Nine Roles
Role | Definition |
A. Corrective Roles | “Improve system performance” (Crootof et al. at 473) for the purposes of accuracy (Crootof et al. at 475)
Three types: 1) Error Correction – fix algorithmic errors (Crootof et al. at 475-476) 2) Situational Correction – “improve system’s outputs by tailoring algorithm’s recommendation based on population-level data to individual circumstances” (Crootof et al. at 467); deploy principles of equity, when particularities of human experience outrun our (or in this case, machine’s ability) to make general rules. (Crootof et al. at 477) 3) Bias Correction – “identify and counteract algorithmic biases” (Crootof et al. at 477) |
B. Resilience[xxxvii] Roles | “act as a failure mode or alternatively stop the system from working under an emergency” (Crootof et al. at 473)
“Acting as a backstop when automated system malfunctions or breaks down” (Crootof et al. at 478) |
C. Justificatory Roles | “increase system’s legitimacy by providing reasoning for decisions” (Crootof et al. at 473)
“make a decision appear more legitimate, regardless of whether or not they provide accurate or salubrious justifications” (Crootof et al. at 479) |
D. Dignitary Roles | “protect the dignity of the humans affected by the decision” (Crootof et al. at 473)
Combat and avoid objectification to alleviate dignitary concerns (Crootof et al. at 481) |
E. Accountability Roles | “allocate liability of censure” (Crootof et al. at 473-474)
“ensure someone is legally liable, morally responsible, or otherwise accountable for a system’s decisions.” (Crootof et al. at 482) |
F. Stand-In Roles | “act as proof that something has been done or stand in for humans and human values” (Crootof et al. at 474)
“demonstrating that, just in case of something wrong with automation, something has been done” (Crootof et al. at 484) |
G. Friction Roles | “slow down the pace of automated decision-making” (Crootof et al. at 474)
“make them more idiosyncratic and less interoperable” (Crootof et al. at 484)
|
H. “Warm Body” Roles | “preserve human jobs” (Crootof et al. at 474)
augmented rather than artificial intelligence – importance of keeping humans involved in processes (Crootof et al. at 486) “Prioritizes the worth of individual humans in the loop, rather than the humans on which the algorithmic system acts” (Crootof et al. at 486) |
I. Interface Roles | “link the systems to human users” (Crootof et al. at 474)
“helping users interact with an algorithmic system” (Crootof et al. at 487)
“Making system more intuitive for the results more palatable” (Crootof et al. at 487) |
References
[i] The author would like to thank the helpful reviewers for their feedback and comments. This final Op-Ed version was revised for written expression and clarity, to add a more explicit and concrete recommendation for how to add humans into the loop into the Canadian Algorithmic Impact Assessment (“AIA”) regime, and to discuss further AIA literature on both Canadian governance and global best practices. Due to word count limitations, much of this additional discussion occurs in the footnotes. Furthermore, with time constraints, I was unable to go fully into one reviewer’s questions to look at other jurisdictions for their AIAs but have noted this as a future academic research topic. I do briefly reference a U.S. AI Governance policy that was released on 28 March 2024 (a day before the paper’s deadline) as a point of reference for my Op-Ed’s core recommendation.
[ii] Rebecca Crootof, Margot Kaminski & W. Price, “Humans in the Loop” (2023) 76:2 Vanderbilt Law Review 429 Accessed online: <https://cdn.vanderbilt.edu/vu-wordpress-0/wp-content/uploads/sites/278/2023/03/19121604/Humans-in-the-Loop.pdf>. See: Crootof et al. at 487 for summary of this argument.
[iii] Treasury Board Secretariat, Directive on Automated Decision-Making (Ottawa: Treasury Board Secretariat, 2019) Accessed online: <https://www.tbs-sct.canada.ca/pol/doc-eng.aspx?id=32592> (Last modified: 25 April 2023)
[iv] Therese Enarsson, Lena Enqvist, and Markus Naarttijärvi (2022) “Approaching the human in the loop – legal perspectives on hybrid human/algorithmic decision-making in three contexts”, Information and Communications Technology Law, 31:1, 123-153, DOI: 10.1080/13600834.2021.1958860 at 126; Crootof et al. at 434.
[v] Crootof et al. at 440. They discuss that this definition captures systems where individuals apply discretion in their use algorithmic systems to reach a particular decision, where individual and algorithmic systems pass off or perform tasks in concert, and arguably even when a system requires a human review of an automated decision.
[vi] Crootof et al. at 441. This is just one example of the contrasting approaches in defining humans in the loop, Enarsson et al. highlights that in a human in the loop system there is capability for human intervention in every decision cycle of the system, and in a human off the loop system, humans intervene through design and monitoring of the system. They also define a third type, HIC systems or meta-autonomy, where human exercise control and oversight to implement an otherwise fully automated system. Consider also, ‘human intervention’ and ‘humans in the loop’ are the terms utilized by the Government’s DADM but not defined as such.
[vii] Crootof et al. at 445.
[viii] Crootof et al. at 444.
[ix] Crootof et al at 444.
[x] Crootof et al. at 446-459. They provide examples of where the same interest such as regulatory arbitrage could be fostered in some cases (e.g. allowing a human-reviewed medical software program to escape higher regulatory requirements), but in other circumstances discouraged (for example, for National Security Agency minimizing human oversight of surveilled materials to avoid it classifying as a search.
[xi] Crootof et al. at 437.
[xii] Crootof et al. at 467.
[xiii] Crootof et al. at 473.
[xiv] Crootof et al. at 473-474.
[xv] Crootof et al. at 473.
[xvi] Crootof et al. at 487.
[xvii] Crootof et al. at 492.
[xviii] The final EU AI Act at article 14 appears to contain the same discussion of human oversight in article 14 as was discussed and critiqued by Crootof including the separation of user and provider human roles. See: EU AI Act, Accessed online: <https://artificialintelligenceact.eu/article/14/>
[xix] Crootof et al. at 503-504.
[xx] “The term soft law is used to denote agreements, principles and declarations that are not legally binding.” See: European Center for Constitutional and Human Rights, “Definition: Hard law/soft law”, Accessed online: <https://www.ecchr.eu/en/glossary/hard-law-soft-law/>
[xxi] Specifically, the DADM mandates that a Canadian Government agency complete and release the final results of an AIA prior to producing any automated decision system. These AIAs also have to be reviewed and updated as the functionality and scope of the ADM system changes.
[xxii] DADM at Appendix C.
[xxiii] DADM at Appendix B.
[xxiv] In response to one of my reviewers asking about whether there is an agency whose role is/should be to control declarations that do not play their part, in Canada the Treasury Board Secretariat (“TBS”) is the agency that administers the AI. In theory, it is supposed to play this role. However, I have obtained several copies of email correspondence through Access to Information requests pertaining to AIA approval where reviews and feedback occurred merely over the course of a few exchanged emails.
[xxv]In response to a reviewer’s question, my very argument is that the AIA’s structure allows entities to avoid having to discuss the role of humans in the loop as to date they have all self-assessed projects as medium-risk at the highest level, therefore allowing projects to not even require a human in the loop, let alone discuss the roles of the humans in the loop. The major negative impact I see in the silence on the roles is both a lack of transparency and accountability, but the possibility that humans are viewed simply as a justificatory safeguard for ADMs without further interrogation. I agree with Crootof et al. that there could be other roles not contemplated in the list of nine, but that this list does provide a starting point for debate and discussion between the policymaker drafting the AIA and the TBS/peer reviewer/reviewing public.
[xxvi] Interesting to contrast the EU AI Act, “Annex 3 – Annex III: High-Risk AI Systems Referred to in Article 6(2)” Accessed online: <https://artificialintelligenceact.eu/annex/3/> classifies migration management as high-risk at 7(d).
[xxvii] See: Canada, Treasury Board Secretariat, Algorithmic Impact Assessment tool (Ottawa: modified 25 April 2023). Published AIAs can be accessed here: <https://search.open.canada.ca/opendata/?sort=metadata_modified+desc&search_text=Algorithmic+Impact+Assessment&page=1>
[xxviii] Government of Canada, Open Government, “Algorithmic Impact Assessment – Advanced Analytics Triage of Overseas Temporary Resident Visa Applications – AIA for the Advanced Analytics Triage of Overseas Temporary Resident Visa Applications (English).” 21 January 2022. Accessed online: <https://open.canada.ca/data/dataset/6cba99b1-ea2c-4f8a-b954-3843ecd3a7f0/resource/9f4dea84-e7ca-47ae-8b14-0fe2becfe6db/download/trv-aia-en-may-2022.pdf> (Last Modified: 29 November 2023)
[xxix] Crootof et al. at 482. It is noted that only recently has there been the acknowledgment of automation bias in a publicly posted document. See: Government of Canada, Open Government, “International Experience Canada Work Permit Eligibility Model – Peer Review of the International Experience Canada Work Permit Eligibility Model – Executive Summary (English)“ 9 February 2024. Accessed online: <https://open.canada.ca/data/dataset/b4a417f7-5040-4328-9863-bb8bbb8568c3/resource/e8f6ca6b-c2ca-42c7-97d8-394fe7663382/download/peer-review-of-the-international-experience-canada-work-permit-eligibility-model-executive-summa.pdf>
[xxx] This is a best practice identified by Reisman et al. in “Algorithmic Impact Assessments: A Practical Framework for Public Agency Accountability” AI Now, April 2018 at 9-10 Accessed online: <https://ainowinstitute.org/publication/algorithmic-impact-assessments-report-2>. The DADM and AIA currently does not have a public comment process.
[xxxi] Crootof et al at 454-455.
[xxxii] Per the DADM at 1.3, there is a mandated two-year review period. The DADM completed its third review in April 2023.
[xxxiii]Shalanda D. Young, Memorandum for the Heads of Executive Department and Agencies, “Subject: Advancing Governance, Innovation, and Risk management for Agency Use of Artificial Intelligence” Executive Office of the President of the United States – Office of Management and Budget, 28 March 2024 at 4-5, 18, 26. Accessed online: <https://www.whitehouse.gov/wp-content/uploads/2024/03/M-24-10-Advancing-Governance-Innovation-and-Risk-Management-for-Agency-Use-of-Artificial-Intelligence.pdf>
[xxxiv] The Presidential Memorandum appears more broadly aimed at the use of AI and AI Governance by U.S. agencies and specifically as part of an ADM approval processes. It does raise the question of whether discussion of humans in the loop or AI Governance should be built into the DADM and each AI or instead be a standalone policy that applies to all Government AI projects. My suggestion would be for both a broader directive and requirements for AIAs to specifically discuss compliance with reference to the broader directive.
[xxxv] The review period was initially set in the DADM at six months but recently expanded to two-years. I would argue that with the AIA still containing many gaps, a six-month review might still be beneficial. See: DADM 3rd Review Summary, One Pager, Fall 2022, Accessed online: <https://wiki.gccollab.ca/images/2/28/DADM_3rd_Review_Summary_%28one-pager%29_%28EN%29.pdf?utm_source=Nutshell&utm_campaign=Next_Steps_on_TBSs_Consultation_on_the_Directive_on_Automated_Decision_Making&utm_medium=email&utm_content=Council_Debrief__July_2021>
[xxxvi] I have been in frequent online dialogue with Director of Data and Artificial Intelligence, Treasury Board of Canada Secretariat, Benoit Deshaies, who has often responded to both my public tweets and private messages with questions about the AIA’s process. See e.g.: a recent Twitter/X exchange https://twitter.com/MrDeshaies/status/1769703302730596832
[xxxvii] Crootof et al. at 478 defines this as “complex system to withstand failure by minimizing the harms from bad outcomes.”