Appendix A: First Version of Vision Statement Prompt

Please note that we always set temperature to 0 for every request.

Pretend that you are a teacher that has to grade student assignments. The students are asked to write a compelling vision for a company. You have to grade them in three different categories. The three categories are "concreteness", "sentiment" and "coherence". For concreteness, sentiment and coherence use a scale from 0-100. Concreteness measures how concrete the words in the sentence are. Coherence measures whether the student provided a logical statement that is coherent and a complete sentence. Lastly, sentiment uses a scale from 0-100 where 0 is very negative for sentences full of hate, 50 is neutral and 100 is very positive for sentences full of love. Sentences that have some hate range from 1-33, neutral will be 34-66 and some love from 67-100. Consider the examples below:

Vision: Jump. Smile. Child. Orange.
Concreteness: 100
Sentiment: 65
Coherence: 0
--
Vision: Liberty. Entity. Protocol. Thing.
Concreteness: 0
Sentiment: 64
Coherence: 0
--
Vision: We love to help consumers
Concreteness: 20
Sentiment: 62
Coherence: 100
--
Vision: Trying to do better every day
Concreteness: 23
Sentiment: 60
Coherence: 75
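The prompt above fixes both a few-shot layout (a "Vision:" line followed by three labelled scores, with "--" between examples) and a 0-100 scale for each category. As a rough illustration only, the following Python sketch shows one way such a prompt could be assembled and a reply parsed. It is not the authors' implementation. The names `build_prompt`, `parse_scores`, `sentiment_band`, `grade`, and the `complete` callable are hypothetical, and appending the student's text as a final "Vision:" line is an assumption, since the appendix does not show how the student input is attached.

```python
import re
from typing import Callable, Dict, List, Tuple

# Few-shot examples copied from the prompt:
# (vision, concreteness, sentiment, coherence)
EXAMPLES: List[Tuple[str, int, int, int]] = [
    ("Jump. Smile. Child. Orange.", 100, 65, 0),
    ("Liberty. Entity. Protocol. Thing.", 0, 64, 0),
    ("We love to help consumers", 20, 62, 100),
    ("Trying to do better every day", 23, 60, 75),
]

CATEGORIES = ("concreteness", "sentiment", "coherence")
SCORE_RE = re.compile(r"(Concreteness|Sentiment|Coherence):\s*(\d{1,3})")


def format_examples(examples: List[Tuple[str, int, int, int]]) -> str:
    """Render the examples in the 'Vision / three scores / --' layout."""
    blocks = [
        f"Vision: {v}\nConcreteness: {c}\nSentiment: {s}\nCoherence: {k}"
        for v, c, s, k in examples
    ]
    return "\n--\n".join(blocks)


def build_prompt(instructions: str, student_vision: str) -> str:
    """Assumption: the student's text is appended as one more 'Vision:' line."""
    return (
        f"{instructions}\n\n{format_examples(EXAMPLES)}\n--\n"
        f"Vision: {student_vision}\n"
    )


def parse_scores(reply: str) -> Dict[str, int]:
    """Extract the three scores and check they lie on the 0-100 scale."""
    scores = {name.lower(): int(value) for name, value in SCORE_RE.findall(reply)}
    missing = [c for c in CATEGORIES if c not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    for name, value in scores.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} out of range: {value}")
    return scores


def sentiment_band(score: int) -> str:
    """Map a sentiment score to the ranges named in the prompt.

    Treating 0 as its own 'very negative' band is this sketch's reading.
    """
    if score == 0:
        return "very negative"
    if score <= 33:
        return "some hate"
    if score <= 66:
        return "neutral"
    return "some love"


def grade(
    student_vision: str,
    instructions: str,
    complete: Callable[[str, float], str],
) -> Dict[str, int]:
    """`complete(prompt, temperature)` is a hypothetical stand-in for a model call."""
    reply = complete(build_prompt(instructions, student_vision), 0.0)  # temperature 0
    return parse_scores(reply)
```

Passing the model call in as a plain callable keeps the sketch independent of any particular client library, and the range check in `parse_scores` guards against replies that drift off the requested scale.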

Beyond Multiple Choice: The Role of Large Language Models in Educational Simulations