Data Rights and Licensing Agreements

Data rights are a central point of contention in licensing agreements between the DOD and private sector vendors providing large models. The DOD stores and processes massive amounts of data, which are organized in highly disparate ways across the organization. Further, much of this information is Sensitive But Unclassified (SBU) or Classified. Major vendors offering tools like Large Language Models often require access to client data to further fine-tune their models; in the DOD, however, stricter data controls prevent models trained on proprietary or classified data from being licensed back out. This makes it significantly harder to identify adequate licensing and pricing agreements between the DOD and commercial vendors, and if the DOD rushes acquisition, it may find itself locked into contract agreements that become inefficient or obsolete over time and thereby serve neither the DOD nor its vendors in the long run.

One Generative AI technology that has been applied to the issue of data rights is Synthetic Data, although it is still early in development. The critical challenge that startups offering synthetic data products are working to address is meeting the DOD’s top priorities for data security, which include Privacy and Fidelity. For Privacy, the primary concern is to filter out any data points that by implication expose the original dataset (for example, a synthetic dataset listing government contracting business owners by net worth, with 2-3 data points in excess of $10 billion, implying just a handful of actual people in the real dataset). For Fidelity, replicating the original dataset’s distribution, correlation, and, for time series data, interarrival time is critical to ensuring the synthetic data is not too ‘clean’ or too imbalanced, and thereby reflects the characteristics of the original dataset as accurately as possible.
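To make the Privacy and Fidelity criteria above more concrete, the following is a minimal Python sketch of how such checks might look. It is illustrative only and is not drawn from any vendor's product or DOD practice: the function names, the use of a Kolmogorov-Smirnov statistic and Pearson correlation as fidelity measures, and the small-group suppression rule (with its `min_group_size` parameter) are all assumptions chosen for the example.

```python
from bisect import bisect_right
from math import sqrt


def ks_statistic(real, synth):
    """Fidelity (distribution): largest gap between the two empirical CDFs.

    0.0 means the samples have identical empirical distributions;
    1.0 means they do not overlap at all.
    """
    real_sorted, synth_sorted = sorted(real), sorted(synth)
    gap = 0.0
    for x in real_sorted + synth_sorted:
        cdf_real = bisect_right(real_sorted, x) / len(real_sorted)
        cdf_synth = bisect_right(synth_sorted, x) / len(synth_sorted)
        gap = max(gap, abs(cdf_real - cdf_synth))
    return gap


def pearson(xs, ys):
    """Fidelity (correlation): Pearson correlation between two columns.

    Compute this on the real data and on the synthetic data, then compare.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def interarrival_times(timestamps):
    """Fidelity (time series): gaps between consecutive events."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def suppress_small_tail(values, threshold, min_group_size):
    """Privacy (hypothetical rule): drop values above `threshold` when fewer
    than `min_group_size` records exceed it, since so small a group could
    point back to identifiable individuals in the real data.
    """
    tail = [v for v in values if v > threshold]
    if 0 < len(tail) < min_group_size:
        return [v for v in values if v <= threshold]
    return list(values)
```

In the net worth example, `suppress_small_tail(net_worths, 10e9, 5)` would remove the two or three records above $10 billion; the cutoff of five is an arbitrary illustration, not a recommended standard. For time series data, `ks_statistic` could be applied to the outputs of `interarrival_times` for the real and synthetic event streams to compare their timing patterns.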
For any licensing and pricing agreement, however, contracting with the DOD will involve balancing data security with pricing adjustments for any model training that does occur. Teams will also need to forecast how pricing for a handful of pilot-phase users translates to scaled implementation with potentially thousands of users. We recommend that the DOD prioritize refining Gen-AI contracting practices that are likely to stand the test of time and serve both parties in the long run, a process that will take dedicated funding, labor, and time to develop effectively.

Balancing Optimization and Resiliency Through Red Teaming

While AI tools present a compelling case for addressing inefficiencies across the military related to maintenance, over-relying on technology to monitor and optimize these functions can detract from the force’s broad operations, specifically its ability to manage inherently unpredictable events that profoundly impact, and can even shut down, other logistical functions in wartime settings. This is particularly true for resupply chains and strategic planning. As Chris Daehnick, former US Air Force Colonel and a former Associate Partner at McKinsey, notes:

“War is inherently wasteful, and over-optimizing reduces resilience and creates vulnerabilities in a military’s operational systems.”

Daehnick further adds:

“If a military builds its operations to be highly efficient and dependent on AI, it’s likely to be a fragile system that breaks easily; it can’t cope with what you don’t expect from past experience, similar to how an overly centralized command and control system can cause paralysis and break down under stress. Taking out slack from any system weakens the system’s ability to withstand future shocks.”

Generative AI Adoption in the US Military