The Naive Bayes technique in text classification

Naive Bayes is a supervised, probabilistic classification algorithm based on Bayes' theorem. You might wonder why it is called naive. The name comes from its assumption that all features are independent of one another given the class. In real-world data, this independence rarely holds. For example, when we try to detect whether an email is spam, we look for keywords associated with spam, such as Lottery, Award, and so on. We extract these features from the email and, if spam-related features are present, the email is classified as spam. The algorithm treats each keyword as separate, independent evidence, even though such words often appear together in practice.
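To make the spam example concrete, here is a minimal sketch of a word-count Naive Bayes classifier in plain Python. The training emails, the labels "spam" and "ham", and the function names train and predict are all hypothetical, invented for illustration. The sketch also assumes add-one (Laplace) smoothing, which the text above does not mention, so that a word unseen in one class does not force a zero probability.

```python
import math
from collections import Counter

# Hypothetical toy training data, invented for illustration only.
TRAIN = [
    ("win lottery award now", "spam"),
    ("claim your lottery prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached", "ham"),
]


def train(examples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(text, class_counts, word_counts, vocab):
    """Return the class with the highest log-posterior score."""
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Log prior: fraction of training documents in this class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Naive assumption: each word contributes independently,
            # so the per-word log-likelihoods are simply added.
            # Add-one smoothing (an assumption of this sketch).
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
print(predict("lottery award", *model))
print(predict("monday meeting report", *model))
```

Working with sums of logarithms rather than products of probabilities is a common design choice here, since multiplying many small probabilities can underflow to zero on longer emails.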