How do you produce emotions? | Voice Air Knowledge Base

The model is sensitive to the context surrounding each utterance: it interprets a sentence by how it relates to the text before and after it. This wider view lets it intonate longer passages properly, carrying a single train of thought across multiple sentences with a unifying emotional pattern.

Here are a few tips for producing emotions:

Context is key for generating specific emotions. For example, input text that is funny or contains laughter is more likely to produce a happy-sounding output. The same applies to anger, sadness, and other emotions: set the context in the text itself.

Punctuation and voice settings influence how the output is delivered.

Add emphasis by putting the relevant words/phrases in quotation marks.

For speech generated using a cloned voice, the speaking style of the samples you upload for cloning is replicated in the output. So, if the uploaded samples are monotone, the model will struggle to produce expressive output.
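The context and emphasis tips above can be applied when preparing input text. The following is a minimal sketch in plain Python, not part of any Voice Air API: the helper names `add_emphasis` and `build_input_text` are hypothetical, and the sketch only builds the text string you would then paste or send for generation. It assumes that a context sentence placed before the line will itself be spoken, so you may need to trim it from the resulting audio.

```python
import re
from typing import List, Optional


def add_emphasis(text: str, phrases: List[str]) -> str:
    """Wrap the first occurrence of each phrase in double quotation marks."""
    for phrase in phrases:
        # Match the phrase as a whole word, skipping occurrences already quoted.
        pattern = r'(?<!")\b' + re.escape(phrase) + r'\b(?!")'
        text = re.sub(pattern, lambda m: '"' + m.group(0) + '"', text, count=1)
    return text


def build_input_text(line: str, context: str = "",
                     emphasis: Optional[List[str]] = None) -> str:
    """Prefix a line with emotional context and quote the words to stress.

    Hypothetical helper: the context sentence is ordinary text, so it is
    assumed to be read aloud together with the line.
    """
    line = add_emphasis(line, emphasis or [])
    context = context.strip()
    return f"{context} {line}" if context else line


if __name__ == "__main__":
    print(build_input_text(
        "I can't believe you did this!",
        context="She slammed the door, furious.",
        emphasis=["this"],
    ))
```

Keeping the context as plain text, rather than as a separate setting, follows the first tip: the emotion is suggested only through the surrounding words, so the result is still not guaranteed.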

These are the best tips available for producing emotions, but they do not guarantee the result. We will introduce features that allow emotions to be controlled within the text.