This stage is important because it reveals biases in your training data, how users perceive your product, and other potential pain points. Test the model in the most realistic scenarios and conditions you can. Suppose you are developing a voice assistant. How will it work:
- On a noisy street when the wind is blowing?
- When a child is crying in the background?
- When some music is playing?
- When the user is not a native speaker or is getting emotional or drunk? Well, those may be the users who need your assistance the most!
- In all of these cases together?
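One way to approximate such conditions before collecting real field recordings is to mix background noise into clean test utterances at controlled signal-to-noise ratios (SNR) and evaluate the model on each variant. The following is a minimal sketch using NumPy, not a prescribed method: the function names `mix_at_snr` and `measured_snr_db`, the synthetic sine wave standing in for speech, and the random noise standing in for street recordings are all illustrative assumptions.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so that the mix has the requested SNR in dB."""
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Repeat or trim the noise so that it covers the whole utterance.
    noise = np.resize(noise, clean.shape)
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    if clean_power == 0 or noise_power == 0:
        raise ValueError("clean and noise must both be non-silent")
    # SNR(dB) = 10 * log10(clean_power / noise_power), solved for the noise power.
    target_noise_power = clean_power / (10 ** (snr_db / 10))
    scale = np.sqrt(target_noise_power / noise_power)
    return clean + scale * noise


def measured_snr_db(clean, mixed):
    """Compute the SNR of a mix, given the clean signal it was built from."""
    residual = mixed - clean
    return 10 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


# Hypothetical stand-ins: replace with real utterances and real noise recordings.
rng = np.random.default_rng(0)
sample_rate = 16_000
t = np.arange(sample_rate) / sample_rate
clean = 0.5 * np.sin(2 * np.pi * 220 * t)   # placeholder for a speech clip
street = rng.normal(size=sample_rate // 2)  # placeholder for street noise

# Build progressively harder variants of the same test utterance.
test_set = {snr: mix_at_snr(clean, street, snr) for snr in (20, 10, 0, -5)}
```

Running the model on each entry of `test_set` and plotting accuracy against SNR shows how quickly quality degrades as conditions worsen. Synthetic mixing is only a proxy: real recordings from a street, a nursery, or a party will still surface problems that additive noise does not.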
If your solution is security-related, how well does it hold up under an active attack by an adversary? How easy is it to unlock your Touch ID with the help of an orange, or to fool your face recognition by presenting it with a photo?
With all those observations in hand, you then go back and update your dataset and models accordingly.