
CHOCOLATE BAR RECIPE TREND ANALYSIS
(2006-2020)



Here we study chocolate bar ingredient trends, company preferences, and ratings. We mostly use NumPy and Pandas to compute the results, and Matplotlib and Seaborn to plot graphs. The dataset used in this project is taken from kaggle.com and covers 66 chocolate bar companies, with columns such as 'company', 'company_location', 'country_of_bean_origin', 'review_date', 'rating', and 'cocoa_percent', plus information on common ingredients and tastes.



According to the Flavors of Cacao rating scale:


4.0 - 5.0  =  Outstanding

3.5 - 3.9  =  Highly Recommended

3.0 - 3.49  =  Recommended

2.0 - 2.9  =  Disappointing

1.0 - 1.9  =  Unpleasant
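
The scale above can be expressed as a small lookup function. This is only a sketch: the function name `rating_category` is hypothetical, and because the published bands leave small gaps (for example between 3.49 and 3.5), it assumes each band starts at its lower bound and runs up to the start of the next one.

```python
def rating_category(rating: float) -> str:
    """Map a numeric rating to its Flavors of Cacao category label.

    Assumption: each band begins at its lower bound and extends up to
    the lower bound of the next band, so a value such as 2.95 falls
    under "Disappointing".
    """
    if not 1.0 <= rating <= 5.0:
        raise ValueError(f"rating must be between 1.0 and 5.0, got {rating}")
    if rating >= 4.0:
        return "Outstanding"
    if rating >= 3.5:
        return "Highly Recommended"
    if rating >= 3.0:
        return "Recommended"
    if rating >= 2.0:
        return "Disappointing"
    return "Unpleasant"


print(rating_category(3.25))
```

A function like this could later be passed to `.apply()` on the 'rating' column to label each chocolate bar, if a categorical view of the ratings is wanted.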


Table of contents

  • Downloading the Dataset
    • Python’s OS module
    • Pandas "read_csv" method
  • Data Preparation and Cleaning
    • .columns, .sample(), .shape, .sum()
    • .unique(), .apply(), lambda
    • .value_counts(), .info(), .describe()
  • Exploratory Analysis and Visualization
    • Company And Ingredients
    • Tastes
    • Percentage of Cocoa and Variation Over Years
    • Rating and Cocoa Percent
    • Correlation between different columns
  • Questions and Answers
    • How does the presence of cocoa butter and lecithin affect ratings in the latest three years (2018-2020)?
    • How much cocoa is actually preferred by top companies?
    • From which countries do top companies import cocoa beans?
    • What must have been the recipe of the top rated chocolate in the year 2019?
    • What tastes are in top rated chocolates during 2016-2020?
    • Which major chocolate regions have companies that generally make it to the Top 50?
  • Inferences and Conclusion
  • References and Future Work

Downloading the Dataset

  • Let's download the dataset from kaggle.com into a local folder for analysis, or fetch it with the opendatasets Python library.
  • The data is in CSV format.
  • The original dataset will later be uploaded to the Jovian files section along with this project, using the command:
    jovian.commit(project=project_name, files=['chocolate-bar-2020/chocolate.csv'])
  • If working on a local computer, we create a local Conda environment and install the required libraries by running these commands in a terminal, for example:

conda create -n zerotopandas -y python=3.8

conda activate zerotopandas

pip install jovian jupyter numpy pandas matplotlib seaborn --upgrade

We can now start the Jupyter notebook using the command:

jupyter notebook
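
Once the notebook is running, a quick check like the following can confirm that the libraries from the pip command are importable in the active environment. It is a minimal sketch using only the standard library; the helper name `missing_packages` and the `REQUIRED` list are illustrative, and `jupyter` is left out because it is launched from the command line rather than imported.

```python
import importlib.util

# Import names of the libraries installed by the pip command above.
REQUIRED = ["jovian", "numpy", "pandas", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("All required libraries are available.")
```

If anything is reported as missing, the usual cause is that the notebook was started from a different environment than the one in which the libraries were installed.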

!pip install opendatasets --upgrade --quiet
dataset_url = 'https://www.kaggle.com/soroushghaderi/chocolate-bar-2020'