Configuration 1: Human operator response with offline auditing by chatbot

In this first configuration, all of the work of responding to the customer is carried out by the human operator. In our example, the hospital operates a call center, or allows patients to contact their care team directly, so that they can ask questions such as "When is the last meal I am allowed to eat?", "How long will I have to rest after surgery?", or "Can I have coffee the morning of my surgery?". The human operator provides the answer directly.

Note that this configuration still allows for some use of AI. For example, if the organization wants to evaluate the accuracy of the medical advice its human operators provide, it could periodically audit the chats. In such an offline audit, the hospital could use an LLM to identify possible mistakes made by the human operators and bring them to the attention of management. Such offline auditing by an LLM is not unique to configuration 1 and can be applied to any configuration. In short, generative AI is used here to improve the accuracy of customer support, thereby reducing defects and enhancing learning about common questions and mistakes. This defect reduction and additional learning should translate into higher efficiency and better customer satisfaction in the long run.

Configuration 2: Human operator response with real-time auditing by chatbot

The second configuration is the same as the first, except that the AI oversight of the human operator happens in real time during the customer support session rather than as an offline audit.
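Such real-time oversight could be wired up roughly as follows. This is a minimal sketch, not a production design: the `check_reply` callable, the `RealTimeAuditor` class, and the stub checker are all hypothetical stand-ins, with `check_reply` representing the LLM call that reviews each operator reply as it is sent.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical stand-in for an LLM call: given the patient's question and the
# operator's reply, return a correction string, or None if the reply looks fine.
CheckFn = Callable[[str, str], Optional[str]]

@dataclass
class RealTimeAuditor:
    check_reply: CheckFn                      # injected LLM-backed checker
    alerts: list = field(default_factory=list)

    def on_operator_reply(self, question: str, reply: str) -> Optional[str]:
        """Audit one operator reply while the conversation is still ongoing."""
        issue = self.check_reply(question, reply)
        if issue is not None:
            # Surface the alert to the operator immediately, not after the call.
            self.alerts.append({"question": question, "reply": reply, "issue": issue})
        return issue

# Illustrative stub checker (a real deployment would call an LLM here).
def stub_check(question: str, reply: str) -> Optional[str]:
    if "drive" in question.lower() and "cannot drive" not in reply.lower():
        return "Remind the patient she cannot drive right after surgery."
    return None

auditor = RealTimeAuditor(check_reply=stub_check)
issue = auditor.on_operator_reply(
    "Can I drive home after my surgery?",
    "Yes, you should be fine.",
)
```

Because the checker is injected, the same auditing loop serves configuration 1 as well: run it over stored transcripts after the fact instead of live messages, and route the collected `alerts` to management rather than back to the operator.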
Returning to our example, if a human operator provides incorrect or incomplete information to the patient (e.g., the operator gives the wrong co-pay information or fails to alert the patient that she will not be able to operate a vehicle right after the surgery), an LLM listening to the call or following the chat could instantly pick up the mistake and alert the human operator while the conversation with the patient is still ongoing. Although this is a technically more complex implementation than many of the other configurations, it still grants the human agent the most autonomy outside of configuration 1.

Just like the first configuration, the immediate efficiency gains are relatively low: customer support is still carried out by human operators. However, because the feedback now arrives in real time, it leads to fewer defects and faster learning. These first two configurations are most relevant for transactional tasks that require relatively little cognitive work or active problem solving by the customer support person. They may nonetheless be preferred for high-risk or high-compliance tasks, as they leave most of the autonomy with the human operator. Next, we will look at the role LLMs can play in helping with more challenging support requests.

Configuration 3: Chatbot recommendations with human operator deciding and responding

Returning to our scenario, consider a patient calling the hospital with a medical problem. The hospital or the healthcare network might have hundreds or even thousands of providers. Each provider differs in their

Reimagining Customer Service Journeys with LLMs: A Framework for Chatbot Design and Workflow Integration