Measures of Central Tendency | Statistics for Data Science

This tutorial is a part of the Zero to Data Analyst Bootcamp by Jovian

Statistics is the discipline of using mathematics to understand data. We use measures of central tendency to summarize, discover and share useful information about data, primarily for gaining insight and making better decisions. The topics covered today will help you answer the following types of questions:

  • How much will you earn after graduating from university?
  • How hot is it going to be in the summer this year?
  • Should your investment portfolio be focused or diversified?
  • Which movie should you watch this weekend?
  • How many registered users will your website have one year from now?

This tutorial covers the following topics:

  • Average / arithmetic mean
  • Median, percentiles, quartiles and range
  • Mode and frequency tables
  • Variance and standard deviation
  • Growth rate and geometric mean

How to Run the Code

The best way to learn the material is to execute the code and experiment with it yourself. This tutorial is an executable Jupyter notebook. You can run this tutorial and experiment with the code examples in a couple of ways: using free online resources (recommended) or on your computer.

Option 1: Running using free online resources (1-click, recommended)

The easiest way to start executing the code is to click the Run button at the top of this page and select Run on Binder. You can also select "Run on Colab" or "Run on Kaggle", but you'll need to create an account on Google Colab or Kaggle to use these platforms.

Option 2: Running on your computer locally

To run the code on your computer locally, you'll need to set up Python, download the notebook and install the required libraries. We recommend using the Conda distribution of Python. Click the Run button at the top of this page, select the Run Locally option, and follow the instructions.

Average / Arithmetic Mean

The average or arithmetic mean of a set of numbers is the sum of the numbers divided by how many numbers there are. It's the simplest way to summarize a set of numbers with a single value.

$$\textrm{Average} = \frac{\textrm{Sum of values}}{\textrm{No. of values}}$$

$$\mu = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

The symbol $\mu$ (pronounced "myu") is often used to denote the mean. We can define a function mean to implement this formula.

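def mean(nums):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    # Add up all the values
    total = 0
    for num in nums:
        total += num
    # Divide the sum by the number of values
    return total / len(nums)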
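
As a quick check, the function can be tried on a small list and compared against `statistics.mean` from Python's standard library. The sketch below repeats the definition so it runs on its own; the temperature values are made up for illustration and are not taken from any dataset in this tutorial.

```python
import statistics

def mean(nums):
    total = 0
    for num in nums:
        total += num
    return total / len(nums)

# Hypothetical sample: daily high temperatures in degrees Celsius
temperatures = [30, 32, 35, 31, 32]

# Sum is 160 and there are 5 values, so the mean is 160 / 5
result = mean(temperatures)
print(result)  # 32.0

# The standard library's statistics.mean should give the same value
print(statistics.mean(temperatures) == result)  # True
```

Note that calling mean with an empty list raises a ZeroDivisionError, since the sum is divided by a length of zero.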