Before any text interpretation strategy can be applied, the text must go through a series of preprocessing steps. The following phases are particularly important:
- Text cleaning: The text is stripped of any elements that could distort subsequent analyses (for example, whitespace at the beginning and at the end of the message)
- Character verification: The text is checked for characters that are equivalent to others (for example, distinct code points that represent the same character) and that could therefore invalidate subsequent analysis
- Text normalization: Uppercase characters are converted to lowercase so that a word is interpreted in the same way whether or not it is capitalized (this approach is not always optimal, since capitalization can sometimes carry discriminative value)
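
The three phases above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used here: the function names `has_equivalent_characters` and `preprocess` are hypothetical, and treating "equivalent characters" as Unicode compatibility equivalence (handled with NFKC normalization from the standard `unicodedata` module) is an assumption about what the verification step covers.

```python
import unicodedata


def has_equivalent_characters(text: str) -> bool:
    """Return True if the text changes under NFKC normalization.

    Assumption: "equivalent characters" means Unicode compatibility
    equivalents such as full-width letters, ligatures, or a base
    letter followed by a combining accent.
    """
    return unicodedata.normalize("NFKC", text) != text


def preprocess(text: str, lowercase: bool = True) -> str:
    """Apply cleaning, character unification, and case normalization."""
    # Text cleaning: remove leading and trailing whitespace.
    cleaned = text.strip()

    # Character verification: map equivalent characters to one form.
    unified = unicodedata.normalize("NFKC", cleaned)

    # Text normalization: lowercase, unless case is worth keeping.
    return unified.lower() if lowercase else unified


print(preprocess("  Ｈello World  "))  # → hello world
```

The `lowercase` flag reflects the caveat in the last bullet: when capitalization is informative, passing `lowercase=False` keeps the original case while still applying the other two steps.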