Ground-Up Builds, Fine-Tuning, and Integrating Small Business

While the DOD’s efforts to broaden the contractor base in the early stages of AI implementation are necessary, in the long run, having large numbers of contractors working on disconnected and disparate models could undermine the effectiveness of Generative AI in serving the DOD overall. The largest “foundation models” offered by major developers, though they face noteworthy near-term challenges,24 generally have broader functionality, allowing for fine-tuning into a wider range of applications. Developing new Gen-AI models from the ground up, while potentially offering comparable or even greater accuracy than foundation models for niche use cases,25 can require greater investments of money and time than fine-tuning generalist models that already demonstrate sufficient accuracy and performance. Therefore, contracting myriad nascent or lesser-resourced developers without considering the contextual performance requirements of the team risks allocating excessive time and budgetary resources to produce tools with less adaptability across the organization than fine-tuned foundation models would offer.

However, this does not mean that the DOD cannot or should not extend its longstanding and successful history of incorporating small and medium-sized businesses to Gen-AI programs. Over time, as the technology matures and use cases become standardized, the role of these businesses will involve a combination of ground-up builds, where specific tool requirements exceed the functionality of foundation models, and support for the implementation of foundation models through services such as technology stack assessments, use-case discovery, fine-tuning, or workforce training.
While we do not provide a recommendation on how to prioritize and balance this combination, we do advise that teams in the DOD take sufficient time, before contracting, to determine the most effective approach based on their unique needs.

Transition to Acquiring at Scale

DIU and CDAO offer effective centralized bodies to coordinate programs like Thunderforge across the joint-force, but the DOD will need to maintain this cohesion over the long run in order to allocate budget and resources effectively. In the years following the announcement of JADC2, for example, each service branch launched battle network and digital modernization initiatives of its own, including the Air Force’s Advanced Battle Management System, the Army’s Project Convergence, and the Navy’s Project Overmatch. Concerns began to emerge that these new initiatives were duplicative of JADC2,26 which was meant to serve as the centralized strategy for these technologies across the joint-force. Therefore, while DIU’s prototype contracts under Thunderforge represent a strong step for Gen-AI integration, as the technology’s rollout spreads across the DOD, it will be prudent to anticipate and monitor for fragmentation in efforts across the joint-force. If this is not managed, operational and technological redundancies could emerge in the long run, leading to cost and performance inefficiencies as well as operational delays in integrating and fielding the technology.

From our conversations with members of the Gen-AI and defense community, we consistently heard that the amount of DOD budgeting for Gen-AI was less of a concern than the allocation of such resources.
Specifically, we received feedback that developing and fine-tuning Gen-AI tools internally where possible, and further, assessing the impact of any given tool on the organization (e.g., determining whether a $5m tool supports only one team, while a tool of equal cost could be scaled across multiple branches), would drive more value for the DOD in the current moment than focusing solely on vendor acquisition. This signals a key tradeoff that the DOD must weigh: acquiring tools outright from vendors and relying on them for end-to-end integration, or front-loading internal investment toward due diligence efforts that can be used to inform such acquisitions.

In summary, by expediting the acquisition of AI tools in the short run, the DOD risks acquiring tools prematurely and under-utilizing them in the long term. To mitigate this risk when implementing LLMs and AI agents, DOD teams will need to undergo an assessment process before purchasing the tools outright. We therefore see a need to establish best practices for such a process, which the DOD can deploy either in whole or in part. Doing so across the organization can ensure that investments of time, labor, and budget are allocated efficiently, avoiding critical pitfalls during the DOD’s long-term efforts to integrate Gen-AI. Below, we explain our framework and its various components.

Generative AI Adoption in the US Military