Scraping Formula 1 Race Results Using Python


Formula One (also known as Formula 1 or F1) is the highest class of international racing for open-wheel single-seater formula racing cars sanctioned by the Fédération Internationale de l'Automobile (FIA). The World Drivers' Championship, which became the FIA Formula One World Championship in 1981, has been one of the premier forms of racing around the world since its inaugural season in 1950. The word formula in the name refers to the set of rules to which all participants' cars must conform. A Formula One season consists of a series of races, known as Grands Prix, which take place worldwide on both purpose-built circuits and closed public roads.

The page https://www.formula1.com/en/results.html/drivers.html provides details of race results. In this project we'll retrieve the results of every race, year by year, using web scraping: the process of extracting information from a website in an automated fashion using code. We'll use the Python libraries requests and BeautifulSoup to scrape data from this page.

Here's the project outline and the steps we'll follow:

  1. Download the web page using the requests library
  2. Parse the HTML source with BeautifulSoup
  3. Write functions to scrape data such as the race location, race date, winner's name, car, number of laps, race finish time, etc.
  4. Store the scraped data in a dictionary
  5. Create a DataFrame from the dictionary using Pandas
  6. Save the data to a CSV file

By the end of the project we'll create a CSV file in the following format:

GrandPrix_location, Race_date, winner, car, No_of_laps, Race_finish_time, Year
Brazil, 07 Nov 2010, Sebastian Vettel, RBR Renault, 71, 1:33:11.803, 2010
Abu Dhabi, 14 Nov 2010, Sebastian Vettel, RBR Renault, 55, 1:39:36.837, 2010
...
...
!pip install jovian beautifulsoup4 requests --upgrade --quiet
import jovian
import requests
from bs4 import BeautifulSoup

# requests is used to download the web page and BeautifulSoup to parse its HTML

def get_race_results(year):
    """Download and parse the race results page for a given year."""
    url = 'https://www.formula1.com/en/results.html/{}/races.html'.format(year)
    response = requests.get(url)
    # check that the page was downloaded successfully
    if response.status_code != 200:
        raise Exception('Failed to load page {}'.format(url))
    # parse the HTML with BeautifulSoup
    doc = BeautifulSoup(response.text, 'html.parser')
    return doc
year = input('Enter year ')
Enter year 2023
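With the parsed document in hand, the remaining steps of the outline are to scrape the race details, collect them in a dictionary, build a Pandas DataFrame and save it to CSV. Below is a minimal sketch of those steps; the helper name parse_race_results, the table class name 'resultsarchive-table' and the assumed column order are illustrative assumptions about the page's HTML structure, and may need adjusting after inspecting the page in a browser.

import pandas as pd

def parse_race_results(doc, year):
    """Extract race details from the parsed results page into a dictionary of lists.

    NOTE: the table class and cell indices below are assumptions about the
    page layout and may need to be updated if Formula1.com changes its markup.
    """
    results = {
        'GrandPrix_location': [], 'Race_date': [], 'winner': [],
        'car': [], 'No_of_laps': [], 'Race_finish_time': [], 'Year': []
    }
    table = doc.find('table', class_='resultsarchive-table')
    if table is None:
        raise Exception('Results table not found - page layout may have changed')
    # skip the header row and walk through each race row
    for row in table.find_all('tr')[1:]:
        cells = [c.get_text(strip=True) for c in row.find_all('td')]
        if len(cells) < 7:
            continue
        # assumed cell order: (padding), Grand Prix, Date, Winner, Car, Laps, Time
        results['GrandPrix_location'].append(cells[1])
        results['Race_date'].append(cells[2])
        results['winner'].append(cells[3])
        results['car'].append(cells[4])
        results['No_of_laps'].append(cells[5])
        results['Race_finish_time'].append(cells[6])
        results['Year'].append(year)
    return results

doc = get_race_results(year)
race_results = parse_race_results(doc, year)
df = pd.DataFrame(race_results)
df.to_csv('race_results_{}.csv'.format(year), index=False)

For the year 2023 entered above, this would produce a file named race_results_2023.csv with the columns shown in the expected output earlier.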