There are certain type of tasks that requires to generate variates of outputs for the same or very similar prompts. Example of these task is content generation. In this article I’ve looked for a way to measure the diversity of the generated texts and tested few ideas how can we improve it.
Let’s use the example task to generate text for a Birthday card, and you want to have a set of these.
prompt = "Write a happy birthday message"
By definition a Large Language Model’s task to predict the next most probable word there’s no surprise the generated output for the same input will be very similar. (Of course you can adjust this slightly with parameters like temperature, and repetition penalty).
Measure
So how to measure the similarity of the output. One idea is to use BLEU score in particular SacreBLEU. Here ve’re looking for lower score which should correlate with more diversity in the generated output.
Test
Here are few examples and scores for the task where the same prompt used to generate the Birthday message. (Mistral Instruct 7B Q8 model has been used)
System prompt: “You are a content writer”
Prompt: “Write a happy birthday message”
Reference: “Happy Birthday! 🎉🎂 Wishing you a wonderful day filled with joy, laughter, and all your favorite things. May this special day be the beginning of another amazing year, and may all your dreams come true. Enjoy your special day to the fullest!”
Out of 100 generation this highest sacreBLEU score was 100, which means the generated test was exact match. The lowest sacreBLEU score was 54.44 and the generated text:
“Happy Birthday! 🎉🎂 Wishing you a wonderful day filled with love, joy, and all your favorite things. May this year be even better than the last and may all your dreams come true. Cheers to another year of adventure, growth, and making amazing memories! Enjoy your special day to the fullest!”
How to improve?
If we want to see more variety in the generated texts we should alter the prompt. I’ve tested 3 different alterations:
- Add noise
Using Faker to add a name to the prompt to whom we generate the message - Ask the LLM to create an alternative similar prompt
System: “You are a language expert and you follow the exact instruction. Just respond with the result”
Prompt: “Create a variation of this sentence: {original prompt}” - Instruct the model to use styles
A list of writing styles has been generated with GPT3.5 and randomly used in the generation
writing_styles = [ “Descriptive”, “Persuasive”, “Narrative”, “Expository”, “Academic”, “Creative”, “Technical”, “Journalistic”, “Satirical”, “Stream of Consciousness”, “Epistolary”, “Poetic”, “Episodic”, “Didactic”, “Confessional” ]prompt = f'{orig_prompt} in {random_style} style'
Results
Using styles hist resulted the lowest scores and low volatility – measured by standard deviation of scores within an experiment – of these scores amongst the alternation methods. (Lower score is the better.)

Volatility (standard deviation) of regular generation: 9.24312457580621
Volatility (standard deviation) of generation with noise: 16.63831620145039
Volatility (standard deviation) of prompt alternation: 25.506756730292953
Volatility (standard deviation) of styles: 5.262860755807771
Reference: “Happy Birthday! 🎉🎂 Wishing you a wonderful day filled with love, joy, and all your favorite things. May this special day be the beginning of another amazing year, and may all your dreams come true. Enjoy your special day to the fullest!”
Highest score with styles: 21.49
Style: Creative
Generated text: “Happy Birthday to the one who fills our lives with joy and laughter! Today we celebrate you, and all the amazing things you bring to the world. May your special day be as wonderful as you are, and may all your dreams come true. Cheers to another year of adventure, growth, and endless possibilities. Happy Birthday!”
Lowest score with styles: 1.12
Style: Expository
Generated text: “A Happy Birthday Message in an Expository Style\n\nToday is a special day, as we celebrate the birth of someone who has made a significant impact on our lives. This person may be a friend, family member, or even a stranger whose actions have touched us in some way. In this expository style message, we will explore the importance of birthdays and how they can bring joy and happiness to our lives.\n\nBirthdays are a time for celebration and reflection. They remind us to take a moment to appreciate the people who have made a difference in our lives and to reflect on the memories we have shared with them. Birthdays also serve as a reminder that life is precious and that we should make the most of” (generation has been limited at 150 tokens)
Second lowest score has slightly better data:
Second lowest score with styles: 1.17
Style: Episodic
Generated text: “Happy Birthday to you! Today, we’re going to take a trip down memory lane and relive some of the best moments of your life.\n\nFirst, let’s go back to when you were just a little one, running around with your friends and chasing butterflies in the park. You were so full of energy and curiosity, always eager to learn new things and explore the world around you.\n\nAs you grew older, you discovered your passion for music and started playing the guitar. You spent countless hours practicing and perfecting your craft, and before you knew it, you were performing on stage in front of a packed crowd!\n\nThen there was that time when you went on a backpacking trip”
Conclusion
By eyeballing the data SacreBLEU score showes correlation wiuth the diversity of the generated text. Hinting styles to the model successfully reduces this similarity score, though the styles themselves might be resonsible for some hallucinations. Next step is to test with more specific original prompt.
Notes:
temp=0.2 has been used for the generations
temp=0.9 has been used for prompt alternations