

WEB SCRAPING ITC LUXURY HOTEL DATA FROM TRIPADVISOR AND PERFORMING EXPLORATORY DATA ANALYSIS ON SCRAPED DATA


Web scraping is the process of collecting and parsing raw data from the web so that it can later be used for purposes such as analysis. Python is a great language for scraping data.
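To make "collecting and parsing" concrete, here is a minimal sketch of the parsing half. It uses only Python's standard library and works on a small inline HTML string, so it runs without any network access. The markup, the `HeadingParser` class and the `extract_heading` helper are all hypothetical and are not Tripadvisor's real page structure or the code used later in this project.

```python
from html.parser import HTMLParser

# Hypothetical markup: the tags, id and class below are made up for
# illustration and do not reflect Tripadvisor's real page structure.
SAMPLE_HTML = """
<html><body>
  <h1 id="HEADING">Example Grand Hotel</h1>
  <span class="rating">4.5</span>
</body></html>
"""


class HeadingParser(HTMLParser):
    """Collects the text inside the first <h1> element."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self.heading = None

    def handle_starttag(self, tag, attrs):
        # Only track the first <h1>; ignore any later ones.
        if tag == "h1" and self.heading is None:
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1:
            self.heading = data.strip()


def extract_heading(html):
    """Return the text of the first <h1> in html, or None if absent."""
    parser = HeadingParser()
    parser.feed(html)
    return parser.heading


hotel_name = extract_heading(SAMPLE_HTML)
print(hotel_name)
```

In a real scraper the HTML string would come from an HTTP response rather than a constant, and a dedicated parsing library is usually more convenient than a hand-written `HTMLParser` subclass.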

In this project, I have scraped data about the ITC luxury collection hotels from www.tripadvisor.in.

Tripadvisor is a popular website for finding hotels, restaurants, sightseeing spots, and almost anything else needed for a good trip. You can also browse hundreds of millions of traveller reviews and opinions for a particular hotel, restaurant, or tourist spot, or look for recommendations based on other people's experiences.

The ITC luxury brand includes 15 hotels across different parts of India. I have created a list of links, one for each hotel's page, to scrape the information from.

Here is an outline of the steps we will follow:

  1. Importing libraries
  2. Scrape the basic information available on the website for each hotel and store it in a CSV file.
    • Hotel Name
    • Hotel Rank in that particular region
    • Hotel Address
    • Hotel Rating
    • Languages spoken
    • Hotel Style
    • Restaurants nearby
    • Attractions nearby
    • Price range
    • Number of rooms
  3. Scrape the 1,000 most recent reviews for each hotel and store them in a CSV file.
    • Review title
    • Date of stay
    • Review text
    • Review rating
  4. Data Cleaning
  5. Exploratory Data Analysis
  6. Summary of Insights
  7. References
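Steps 2 and 3 of the outline both end by storing scraped records in a CSV file. The sketch below shows one way that storage step could look, using Python's built-in `csv` module. The column names are an assumption derived from the field list in step 2, `write_hotels_csv` is a hypothetical helper, and the single row is a placeholder rather than real scraped data. It writes to an in-memory buffer so it can run anywhere; a real run would pass an open file instead.

```python
import csv
import io

# Column names mirror the fields listed in step 2 of the outline;
# the exact header spellings are an assumption.
HOTEL_FIELDS = [
    "hotel_name", "rank", "address", "rating", "languages_spoken",
    "hotel_style", "restaurants_nearby", "attractions_nearby",
    "price_range", "number_of_rooms",
]


def write_hotels_csv(rows, out):
    """Write a list of hotel dicts to a file-like object as CSV.

    Fields missing from a row are written as empty strings, which is
    convenient when a page does not show every piece of information.
    """
    writer = csv.DictWriter(out, fieldnames=HOTEL_FIELDS, restval="")
    writer.writeheader()
    writer.writerows(rows)


# Placeholder record for illustration only, not real scraped data.
sample_rows = [{"hotel_name": "Example Grand Hotel", "rating": "4.5"}]

buffer = io.StringIO()
write_hotels_csv(sample_rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

The same pattern would apply to the reviews file in step 3, with the review title, date of stay, review text and review rating as the columns.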

Importing libraries