Web scraping is the process of extracting and parsing data from websites in an automated fashion using a computer program. It’s a useful technique for creating datasets for research and learning.
This workshop will be held on Thursday, April 15, at 10 PM IST (4:30 PM GMT). It will be streamed on YouTube, and the recording will be available to watch later.
In this workshop, we’ll use Python and its ecosystem of libraries to scrape information from a website and save it as a dataset in one or more CSV files.
Here are the steps we’ll follow to build a web scraping project from scratch:
Pick a website and identify the information to be scraped into a CSV file
Use the requests library to download web pages from the site programmatically
Use Beautiful Soup to parse and extract information from web pages
Create well-formatted CSV file(s) with the extracted information
Document and share your work online in the form of a Jupyter notebook or blog post
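As a rough preview, the download, parse, and CSV steps above might fit together as in the sketch below. It is a minimal illustration rather than the workshop's actual code: the function names, the `item` class, the `title`/`url` columns, and the sample HTML are all assumptions made up for this example, and a real site will need its own selectors.

```python
import csv

import requests
from bs4 import BeautifulSoup


def download_page(url):
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def parse_items(html):
    """Return one dict per <div class="item"> block (hypothetical markup)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.find_all("div", class_="item"):
        title = item.find("h2")
        link = item.find("a")
        rows.append({
            "title": title.get_text(strip=True) if title else "",
            "url": link.get("href", "") if link else "",
        })
    return rows


def write_csv(rows, path):
    """Write the extracted rows to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["title", "url"])
        writer.writeheader()
        writer.writerows(rows)


# Made-up sample page so the sketch runs without network access.
SAMPLE_HTML = """
<div class="item"><h2>First</h2><a href="/first">Read</a></div>
<div class="item"><h2>Second</h2><a href="/second">Read</a></div>
"""

if __name__ == "__main__":
    # For a real site, replace SAMPLE_HTML with download_page(<page URL>).
    rows = parse_items(SAMPLE_HTML)
    write_csv(rows, "items.csv")
```

Keeping the download, parse, and write steps in separate functions makes it possible to test the parsing logic on saved HTML before sending any requests to a live site.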