San Francisco Crime Classification

  • Final Submission

Import Packages

! pip install mpu --quiet
import warnings
warnings.filterwarnings('ignore')

import json
import os
import pickle
import pandas as pd
import numpy as np

from mpu import haversine_distance
from tqdm import tqdm

from sklearn.preprocessing import StandardScaler
from sklearn.manifold import TSNE
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import SelectKBest
from sklearn.feature_extraction.text import (
    CountVectorizer,
    TfidfVectorizer
)

Data Reading

  • In the featurization step, I fit the transformers on the training data and only transformed the test data with them.

  • I did not fit on the test data again, because that would cause data leakage.

  • For more details, please see my Jovian notebook.
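
The fit-on-train, transform-on-test pattern described above can be sketched as follows. This is only a minimal illustration, not the notebook's actual featurization code: the arrays `X_train` and `X_test` are hypothetical toy values, and `StandardScaler` stands in for whichever transformers were really used.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for one numeric feature column.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
X_test = np.array([[2.5], [10.0]])

scaler = StandardScaler()

# fit_transform learns the mean and standard deviation from the training rows only.
X_train_scaled = scaler.fit_transform(X_train)

# transform reuses the training statistics; the scaler is NOT refit on test rows,
# so nothing about the test distribution leaks into the features.
X_test_scaled = scaler.transform(X_test)

# The stored mean is the training mean, unaffected by the test rows.
train_mean = float(scaler.mean_[0])
```

Calling `fit` or `fit_transform` on `X_test` instead would recompute the statistics from the test rows, which is the leakage the bullets above warn against.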