Peering into the 'Double Black Box' of National Security and AI


Ashley Deeks’s “The Double Black Box” appears at an opportune time. National security agencies are racing to find ways to integrate machine learning (ML) and other forms of artificial intelligence (AI) into their work, most of which is conducted in classified settings. In this context, Deeks seeks to answer the critical question: “[H]ow do we hold the executive branch accountable for its use of AI in national security settings?”

The book amply demonstrates Deeks’s central thesis: that the opaque nature of AI combined with the secrecy endemic to national security programs creates a “double black box.” The result, Deeks explains, is that the traditional actors that are meant to serve as checks on classified executive branch policymaking—such as Congress, the courts, executive branch lawyers, and inspectors general—will have an increasingly hard time doing so. She pulls together a useful overview of several known uses of AI and ML tools in national security activities, highlighting two instances in which public pushback led to a change in agency plans. This is followed by a synopsis of the most common critiques of AI: the use of biased data that leads to incorrect outcomes; the difficulty determining whom to hold accountable when AI-based decisions turn out to be wrong; the tendency of human decision-makers to defer to automated outcomes; and the lack of transparency about how these systems produce decisions or results.

While others have made the point that national security secrecy and AI opacity create serious challenges for oversight, Deeks contributes significantly to our understanding of the issue by delving into how exactly the double black box impedes accountability in practice. Informed by her deep knowledge of the workings of the executive branch, she details how AI will increasingly hamper government lawyers’ ability to evaluate the legality of AI-powered operations. For example, lawyers will struggle to trace the data used to train national security algorithms, and even more so when the training is undertaken by private contractors. The national security AI double black box also undermines Congress’s ability to serve as a check on the executive branch. Legislators, for example, will not necessarily know what causes AI-powered weapons systems to fail, the degree of autonomy incorporated into these systems, the role of autonomous systems in causing collateral damage, or whether intelligence officials produced a key intelligence report using AI tools. As a result, Deeks explains, Congress may not even know which national security incidents to investigate further. Moreover, as Deeks points out, Congress may have little interest in playing its oversight role given the prevalent bipartisan consensus that AI is key to winning a new cold war with China. Using the example of autonomous cyber operations, Deeks demonstrates effectively how easily these tools could lead states to engage in unintended hostile cyber acts resulting in an armed conflict, leaving Congress on its back foot.

In Part II of the book, Deeks pivots to addressing the oversight challenges created by the double black box. She focuses primarily on Congress, proposing a framework statute that would structure and restrict the executive branch’s use of high-risk national security AI systems; impose reporting requirements akin to those that currently exist for covert action and risky offensive cyber operations; establish a notification requirement for certain national security AI decisions, similar to the one mandated by the War Powers Resolution; prohibit AI in nuclear command and control; and establish a Joint Committee on AI.

Reporting to Congress on national security AI use would be an improvement over the status quo. But important questions remain about how the war powers and covert action models could be adapted to cover AI systems. These systems are trained on large volumes of data and themselves generate a multitude of insights and outcomes. Should Congress be privy to any or all of this information, or an inventory of it, to conduct meaningful oversight? Would legislators need access to case studies or simulations of how these systems work? Do members of Congress and their staffs have the expertise to evaluate this information even if it were provided? Moreover, these notification models presume a discrete set of identifiable actions. In contrast, AI systems, such as those used to support command and control on the battlefield, may run continuously in the background, and Congress will have to establish criteria for the types of AI-facilitated decisions or incidents about which it should be notified.

Deeks makes an important call for “radical transparency” by the executive branch about its intended use of national security AI, citing the Defense Department’s directive on autonomy in weapons systems as an example. The military should, in her view, explain to the public why it is using AI in decision-making, how it will ensure that its use of predictive algorithms is consistent with international law, and, potentially, how it tests data, avoids training algorithms on biased data, and trains users to avoid automation bias. Depending on the level of detail included in such explanations, they could either be boilerplate or provide valuable information that advances the public’s understanding. Given national security agencies’ penchant for secrecy, I am not optimistic. These recommendations could also be extended to the intelligence community’s use of AI, which includes social media monitoring and facial recognition programs run by the FBI and the Department of Homeland Security that have direct consequences for Americans’ constitutional rights.

High-level transparency of the type that Deeks has identified could be coupled with more specific information, especially for national security AI that affects the rights of people in the United States and Americans overseas. Earlier in the book, Deeks notes that Congress has on occasion required the executive branch to release national security information, such as by mandating declassification review of decisions of the Foreign Intelligence Surveillance Court (FISC) that include novel or significant interpretations of the law. Through these declassified FISC opinions (parts of which are typically redacted), the public has gained an understanding of how certain surveillance programs operate and whether the government has complied with applicable rules and court orders. We have thus been in a much better position to assess the legality of some surveillance efforts. The fact that the government has been able to release this information about programs that were long hidden from public view suggests that it can find ways to provide more information to the public about national security AI than is commonly assumed. Requiring agencies to put together more detailed disclosures would also strengthen the hand of internal overseers, who would almost certainly have to be involved in developing the disclosures.

An additional transparency pathway would build on the October 2024 National Security Memorandum (NSM) that requires agencies to create annual inventories of national security AI systems that have a “high impact” on rights and safety. There is no provision for these to be made public, but the agencies could create (or Congress could mandate) declassification review and/or the publication of unclassified summaries. Domestically focused agencies, such as Homeland Security and the FBI, already provide similar disclosures about their use of some AI systems, both through the inventories required by the Office of Management and Budget rules for non-national security systems and through privacy impact assessments. The October 2024 NSM also requires AI oversight personnel to submit annual reports about their work to the heads of their agencies, with unclassified versions made available to the public to the “greatest extent practicable.” Agencies should be encouraged to release the maximum amount of information via these reports, rather than defaulting to secrecy.

In the first part of the book, Deeks highlights the role of agencies’ internal oversight mechanisms, such as civil rights and privacy offices. These sources of potential oversight could be shored up and strengthened. The October 2024 NSM places significant AI oversight responsibilities on privacy and civil liberties offices, but historically they have not had the access and resources needed to fulfill their critical role.

Elevating the stature of these internal checks through congressional mandates for the offices (in addition to the officers whose positions are mandated by statute) and increased funding (potentially tied to an agency’s AI spending) may help them play a more robust role. Chief AI officers—who have now been appointed at all major national security agencies—provide another potential avenue to oversee AI. These officers are dual-hatted. They are charged with promoting AI use within the agency and ensuring that these tools are well tested and protective of rights and privacy. These functions could be separated, although that might risk siloing oversight functions from other operational activities in ways that sideline the former. Given that the current administration is making sweeping staff and budget cuts to oversight personnel, however, it is imperative that Congress mandate minimum staffing levels and ensure these offices have staff dedicated to evaluating AI, including for bias, to ensure that mission-oriented duties do not result in other values getting short shrift. Congress could also create direct reporting lines so that oversight personnel are answerable to the relevant committees that oversee the agencies.

Another avenue for transparency and oversight suggested by Deeks is the creation of an expert oversight board, potentially modeled on the Privacy and Civil Liberties Oversight Board (PCLOB). This type of oversight—within the executive branch, but independent, and with access to classified information—is critical. The PCLOB has already taken some steps to review programs that involve AI (for example, it recently issued a report on the Transportation Security Administration’s use of facial recognition technologies, which incorporates AI), but the board’s mission and resources would need to be expanded significantly for it to serve as a check across the range of uses of national security AI.

Deeks posits that the FISC would require information about the use of AI systems when the government is seeking authorization for foreign intelligence surveillance of individuals under Title I of the Foreign Intelligence Surveillance Act (FISA). The FISC could certainly serve as something of a check, but it is not clear that it would, as Deeks suggests, require information on “the data on which [AI systems] were trained, the parameters they contain, and their error rates.” Congress could, however, mandate such disclosures as part of the application process. FISC supervision could be extended beyond individualized Title I surveillance orders to other programs under FISA, most notably Section 702, which allows the government to conduct warrantless surveillance targeting foreigners overseas—including the foreign person or entity’s communications with Americans in the United States. No person-specific FISC orders are required for this surveillance, which in 2024 targeted the communications of nearly 300,000 people. However, the FISC’s periodic reviews of the ground rules for Section 702 surveillance provide an opportunity for the court also to evaluate and regulate the use of AI in the program. To increase transparency, Congress could require declassification review of key FISC opinions relating to the use of AI in surveillance.

As Deeks points out, “the only actor that has imposed substantive rules or procedures on the Executive’s development and use of national security AI is the Executive itself.” This reader wanted to know what Deeks thinks of these efforts. For example, the Defense Department’s 2023 directive on autonomous and semi-autonomous systems reiterates the department’s policy that commanders and operators must “exercise appropriate levels of human judgment over the use of force” but provides no guidance on what counts as “appropriate” or who gets to define it. Nor does it address who would be held accountable (or how accountability would be determined) if these systems cause damage that violates the laws of war.

Deeks also explores “nontraditional checks” on executive branch national security decisions, such as technology companies, foreign allies, and state and local governments. With respect to tech companies, it is hard to see how they are likely to constrain government action, given that, as Deeks points out, their involvement “exacerbate[s]” the double black box problem and they “could be careless or untruthful about the training data they use, misrepresent the efficacy and reliability of their systems, unintentionally embed biases in their systems, or resist sharing their data or algorithms with the government.” AI companies’ full-throated push for the integration of AI as a national security imperative, and their frequent warnings that regulation would hamper innovation, make it unlikely that they will serve as a check on national security agencies.

Finally, Deeks explains that international efforts to mitigate the double black box are limited because there is little consensus on what rules, if any, are appropriate. Deeks is probably right to be skeptical of the possibility of transnational agreement on national security AI. The United States has been firmly opposed to international efforts to prohibit fully autonomous weapons systems, and even countries that support prohibitions have offered only vague commitments.

But establishing these types of norms often takes decades of painstaking negotiations and pressure. It took 18 years after the adoption of the Universal Declaration of Human Rights for the UN General Assembly to approve the draft text of the International Covenant on Civil and Political Rights, and another 10 years for a sufficient number of states to ratify it for the treaty to enter into force. The hard work of building pressure on governments to commit to AI guardrails and fostering dialogue is happening now. The civil society campaign against “killer robots,” the convening of high-level summits on AI safety, and the UN secretary-general’s informal consultations with states on lethal autonomous weapons—all of these contribute to the slow and halting progress toward international standards.

“The Double Black Box” provides both a useful metaphor for thinking about national security AI and a timely overview of the current known uses of this technology. Deeks’s mastery of the topic shows in her fluency in discussing the internal dynamics affecting AI policy and in her focus on solutions that build mostly on existing mechanisms. Given the enormous risks that national security AI presents, though, even bolder reforms—which may be unattainable in the near term—deserve consideration as well.
