To scrape job postings from Indeed using Python, you can use either Beautiful Soup or Scrapy. Beautiful Soup, paired with requests, is a good fit for small scripts that parse a handful of pages, while Scrapy is a full crawling framework better suited to larger and more complex projects.
Here is an example of how to scrape job postings from Indeed using Beautiful Soup:
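A minimal sketch of such a script, following the steps in the explanation. The search URL, the User-Agent header, and the helper names `fetch_html` and `extract_job_titles` are assumptions made for illustration; the `'row'` class and the `data-tn-element="jobTitle"` attribute come from the explanation and may not match Indeed's current markup. The parsing step is run on a small hand-written sample so it can be tried without a network request.

```python
from typing import List

import requests
from bs4 import BeautifulSoup

# Assumed search URL; the query parameters are illustrative only.
SEARCH_URL = "https://www.indeed.com/jobs?q=python+developer&l=Remote"


def fetch_html(url: str) -> str:
    """Fetch a page with requests and return its HTML text."""
    response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    response.raise_for_status()
    return response.text


def extract_job_titles(html: str) -> List[str]:
    """Collect the title attribute of every job-title link in the page."""
    soup = BeautifulSoup(html, "html.parser")
    jobs = []
    # Each posting is assumed to sit in a div whose class list includes 'row'.
    for row in soup.find_all("div", class_="row"):
        # Job-title links are assumed to carry data-tn-element="jobTitle".
        for link in row.find_all("a", attrs={"data-tn-element": "jobTitle"}):
            title = link.get("title")
            if title:
                jobs.append(title)
    return jobs


# Hand-written sample markup (hypothetical) so the parser can be tried offline.
SAMPLE_HTML = """
<div class="row result">
  <a data-tn-element="jobTitle" title="Python Developer" href="/job/1">Python Developer</a>
</div>
<div class="row result">
  <a data-tn-element="jobTitle" title="Data Engineer" href="/job/2">Data Engineer</a>
</div>
"""

print(extract_job_titles(SAMPLE_HTML))
```

To run it against a live page, pass the result of `fetch_html(SEARCH_URL)` to `extract_job_titles` instead of the sample string.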
Explanation:
- We use the `requests` module to fetch the HTML content from the URL.
- We create an empty list called `jobs` to store the extracted job postings.
- We use the `find_all` method on the Beautiful Soup object to find all the `div` tags with a `class` value of `'row'`.
- For each `div` tag, we use the `find_all` method to find all the `a` tags with a `data-tn-element` attribute value of `'jobTitle'`.
- We loop over the `a` tags and append the value of their `title` attribute to the `jobs` list.
- We print the `jobs` list to the console.

Here is an example of how to scrape job postings from Indeed using Scrapy:
Explanation:
- We define a spider class that subclasses `scrapy.Spider`.
- We define a `parse` method that will be called to handle the HTTP response from each URL visited by the spider.
- We use an XPath selector to select all the `div` elements containing job postings.
- For each posting, we create a dictionary where the key is `'job_title'` and the value is the job title extracted using another XPath selector.
- We `yield` the job dictionary to Scrapy, which will handle the output for us.