Executing TF-IDF in Python

The following code computes TF-IDF weights for a small corpus using scikit-learn's TfidfVectorizer:

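from sklearn.feature_extraction.text import TfidfVectorizer

# A small corpus of four short documents
corpus = ['First document', 'Second document', 'Third document', 'First and second document']

# Learn the vocabulary and IDF weights, then build the document-term matrix
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)

# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
print(vectorizer.get_feature_names_out())
print(X.shape)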

The output is as follows:

print(X.toarray())

We get the following output:
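To see where the values in this matrix come from, the following is a minimal sketch that rebuilds the weights by hand and compares them with the vectorizer's result. It assumes the default TfidfVectorizer settings (raw term counts, smoothed IDF computed as ln((1 + n) / (1 + df)) + 1, and L2 normalization of each row). The names counts, df, idf, weighted, manual, and reference are illustrative and do not appear in the original example.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = ['First document', 'Second document', 'Third document',
          'First and second document']

# Raw term counts; CountVectorizer uses the same default tokenization
# and vocabulary ordering as TfidfVectorizer
counts = CountVectorizer().fit_transform(corpus).toarray()

n_docs = counts.shape[0]

# Document frequency: number of documents that contain each term
df = (counts > 0).sum(axis=0)

# Smoothed inverse document frequency (assumes smooth_idf=True, the default)
idf = np.log((1 + n_docs) / (1 + df)) + 1

# Weight the counts by IDF, then scale each row to unit L2 length
weighted = counts * idf
manual = weighted / np.linalg.norm(weighted, axis=1, keepdims=True)

# Compare against the matrix produced by TfidfVectorizer itself
reference = TfidfVectorizer().fit_transform(corpus).toarray()
print(np.allclose(manual, reference))
```

If the assumptions above hold, the comparison should report that the two matrices agree. A term such as "document", which occurs in every document, receives the lowest IDF, while rarer terms such as "and" or "third" are weighted more heavily.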