30,000-foot level, such modes of interaction are broadly referred to as “humans in the loop” processes. But, as we will explain, there are many ways to design a workflow in which a human operator collaborates with a chatbot to support a customer. Specifically, we distinguish between the following six workflow configurations, all of which have a human in the loop: configurations 1 and 2 are often referred to as AI-enabled auditing, configurations 3 and 4 are also known as AI-assisted workflows, and configurations 5 and 6 correspond to fully automated workflows.

1. Human operator response with offline auditing by chatbot
2. Human operator response with real-time auditing by chatbot
3. Chatbot recommendations with human operator deciding and responding
4. Chatbot preparation with human operator responding
5. Chatbot response with real-time auditing by human operator
6. Chatbot response with offline auditing by human operator

The decisions along the five chatbot design dimensions and the six workflow configurations provide executives with a blueprint for planning their efforts to reimagine customer support. Many other decisions, including privacy and security considerations as well as system integration, will also need to be made but are beyond the scope of this paper.

Part I: The Five Dimensions of Chatbot Design

Dimension 1: Focused vs. Broad Knowledge Base

How much does the bot know? A chatbot with a broad knowledge base can be created using frontier models such as OpenAI’s GPT-4 (OpenAI et al. 2024) or Google’s Gemini (Gemini Team et al. 2024). Thanks to extensive training data, these chatbots can answer a wide range of questions without any form of special training and, more recently, can ingest text, documents, audio, and video. As powerful as these frontier models might be, they lack depth in specialized areas, especially on topics that depend heavily on specific terms and operating policies.
A high-profile example is Air Canada’s chatbot, which was reported to have offered discounted airfares to a customer requiring an urgent flight due to a death in the family. Though many airlines offer such bereavement tickets, Air Canada at that time did not (Melnick 2024). Because the Air Canada chatbot was trained on the broad body of knowledge typical for a frontier model (including the policies of many other airlines) rather than being focused exclusively on Air Canada’s policies, the result was a frustrated customer and bad publicity.

Technical considerations. From a technical perspective, using an existing model like GPT-4, which has been trained on extensive public data, is the simplest way to start using LLMs. Depending on the nature of the task, some hyperparameter tuning, such as picking a lower (more deterministic) or higher (more creative) temperature value, might be helpful. For instance, when computing the dosage of a specific

Reimagining Customer Service Journeys with LLMs: A Framework for Chatbot Design and Workflow Integration