Learn practical skills, build real-world projects, and advance your career

Analyzing 22-year Housing Prices in the United Kingdom

How to run the code

This tutorial is an executable Jupyter notebook hosted on Jovian. You can run this tutorial and experiment with the code examples in a couple of ways: using free online resources (recommended) or on your computer.

Option 1: Running using free online resources (1-click, recommended)

The easiest way to start executing the code is to click the Run button at the top of this page and select Run on Binder. You can also select "Run on Colab" or "Run on Kaggle", but you'll need to create an account on Google Colab or Kaggle to use these platforms.

Option 2: Running on your computer locally

To run the code on your computer locally, you'll need to set up Python, download the notebook and install the required libraries. We recommend using the Conda distribution of Python. Click the Run button at the top of this page, select the Run Locally option, and follow the instructions.

Jupyter Notebooks: This tutorial is a Jupyter notebook - a document made of cells. Each cell can contain code written in Python or explanations in plain English. You can execute code cells and view the results, e.g., numbers, messages, graphs, tables, files, etc., instantly within the notebook. Jupyter is a powerful platform for experimentation and analysis. Don't be afraid to mess around with the code & break things - you'll learn a lot by encountering and fixing errors. You can use the "Kernel > Restart & Clear Output" menu option to clear all outputs and start again from the top.

Introduction

When the pandemic happened unexpectedly, the UK housing prices were soaring. Some said increasing housing prices could be the "cheap money" floated into the housing market, as governments printed money to tackle the public health crisis in the short term, whereas others said more demands than supplies for nice houses as many companies were implementing 'Working From Home' policies. The price behaviour interested me.

The real estate economy plays a vital role in a country. My father runs a real estate developing company. He told me how the real estate industry affects China economy. For example, the government would involve legal regulations and taxation, either they support it or disencourage it. More construction projects mean more job supplies. The upper stream in the industry would be manufacturers for building materials, construction companies, etc, whereas the down stream are marketing companies, sales companies etc. My initial intention was to see how housing markets can reveal economic growth, but the main focus of this project is data visualization regarding the UK housing market 12-year data analysis.

Data selection

I spared efforts into the dataset selection on Kaggle, a website for data science practitioners to explore open datasets and publish notebooks. I summarized what I thought about datasets I found into a tweet thread. Link

alt

Having gone through the selection, I picked up 'UK Housing Price Paid data, from 1995 to 2017'.

alt

Metholodgy

The project is defined as Exploratory Data Analysis and my expectation is to understand the dataset and visualize them for practicing skills in Pandas, Matplotlib, Seaborn, Plotly, Forulim.

22 million records were challenging and gave me enough opportunities to stimulate problems in the real world.

Expected outcomes

While building a data science portflio, I'm preparing myself for advanced data science projects . However, I expected to see how to handle a massive dataset as well as the memory management.

alt

Installing packages for EDA