from sklearn.metrics.pairwise import cosine_similarity
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer

import numpy as np
import pandas as pd
import seaborn as sns

Let's define some sample documents:

sample_docs = ["This is a line in one document",
               "This is another line in another document",
               "Yet another line in third document",
               "This is also a line which is same as present in first document"]

Vectorize the documents