In this guide, we'll dive into how to scrape Indeed job listings without getting blocked. The first time I tried to extract job data from this website, it was tricky. I thought a simple requests.get() would do the trick, but within minutes I was staring at a CAPTCHA wall. That’s when I realized I needed a proper web scraper with proxy rotation and headers baked in to scrape job listing data.
Now that I have figured out how to build a reliable Indeed scraper, I'll show you how to extract data from Indeed, including job titles, company names, locations, and links. The best tool for the job is ScrapingBee's API. It's an excellent solution for those who want an easy setup without needing extra tools.
We'll explore how our API removes barriers such as IP bans and CAPTCHAs, handles JavaScript rendering seamlessly, and makes pagination straightforward, allowing you to scrape Indeed job postings successfully.
Quick Answer (TL;DR)
ScrapingBee is a cloud service for web scraping, and its Indeed Scraping API lets you extract job postings efficiently with just one API call. All you have to do is run the correct code.
Here's a complete working example that allows you to scrape Indeed job postings:
from bs4 import BeautifulSoup
import requests
import csv

def fetch_indeed_jobs(api_key, query, location, pages=2):
    """Fetch raw HTML for several pages of Indeed search results via ScrapingBee."""
    results = []
    for page in range(pages):
        start = page * 10  # Indeed paginates in steps of 10 results
        search_url = f"https://www.indeed.com/jobs?q={query}&l={location}&start={start}"
        response = requests.get(
            'https://app.scrapingbee.com/api/v1/',
            params={
                'api_key': api_key,
                'url': search_url,
                'country_code': 'us'
            },
            timeout=20
        )
        if response.ok:
            results.append(response.text)
    return results

def parse_job_results(html_pages):
    """Extract job title, company, and location from each job card."""
    all_jobs = []
    for html in html_pages:
        soup = BeautifulSoup(html, 'html.parser')
        cards = soup.select('div.job_seen_beacon')
        for card in cards:
            title = card.select_one('h2.jobTitle span')
            company = card.select_one('span.companyName')
            location = card.select_one('div.companyLocation')
            job = {
                'title': title.get_text(strip=True) if title else 'N/A',
                'company': company.get_text(strip=True) if company else 'N/A',
                'location': location.get_text(strip=True) if location else 'N/A'
            }
            all_jobs.append(job)
    return all_jobs

def export_to_csv(job_list, filename='indeed_jobs.csv'):
    """Write the scraped jobs to a CSV file."""
    if not job_list:
        print("No jobs to export.")
        return
    keys = job_list[0].keys()
    with open(filename, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=keys)
        writer.writeheader()
        writer.writerows(job_list)

# Example usage
API_KEY = 'YOUR_API_KEY'
pages_html = fetch_indeed_jobs(API_KEY, 'software+engineer', 'new+york')
jobs = parse_job_results(pages_html)
export_to_csv(jobs)
Make sure to configure the parameters, such as the search query, location, and country code. Then, replace YOUR_API_KEY with your actual API key, and you're ready to extract data.
The provided code should help you build an efficient job scraper without requiring additional configuration of residential proxies and JavaScript rendering, as these are already working in the background.
However, if this quick answer has left you a bit puzzled, don't worry, I'll explain how this data extraction works step-by-step below.
How to Scrape Indeed Job Data with ScrapingBee
Let's walk through every step of the working Python example that uses our API to extract Indeed job listings. We'll scrape and parse the core job listing data most job seekers need: job titles matching your search, paired with company names and locations.
But before we begin, you can take a quick look at how data extraction with ScrapingBee actually works.
Set Up Your Environment
You won't receive any job postings if your environment isn't set up correctly, so make sure you follow these steps thoroughly.
Download and install Python. If you haven't installed Python yet, download it from the official website. When installing, make sure Python and pip (Python's package installer) are added to your PATH.
Install the required packages
pip install requests beautifulsoup4
You're installing:
requests: to call ScrapingBee's API and fetch each page's HTML.
beautifulsoup4: to parse the HTML response gathered by your job scraper.
Get your ScrapingBee API key. Sign up or log in to ScrapingBee and get your API key from the dashboard.
Create a new Python file (I'll call mine indeed_scraper.py) and add:
import requests
from bs4 import BeautifulSoup
import json
import csv

API_KEY = 'YOUR_SCRAPINGBEE_API_KEY'  # Replace this with your real key
Great, now you're ready to scrape data!
Make the API Call to Scrape Indeed
Next, add the function that makes the API call and powers your Indeed scraper:
def scrape_indeed_jobs(query, location, pages=1):
    all_jobs = []
    for page in range(pages):
        start = page * 10  # Indeed's pagination parameter
        url = f"https://www.indeed.com/jobs?q={query}&l={location}&start={start}"
        response = requests.get(
            "https://app.scrapingbee.com/api/v1/",
            params={
                'api_key': API_KEY,
                'url': url,
            }
        )
        if response.status_code == 200:
            html = response.text
            jobs = extract_job_data(html)  # Defined in the next step
            all_jobs.extend(jobs)
        else:
            print(f"Error fetching page {page+1}: {response.status_code}")
    return all_jobs
What's happening here? We're using pagination to fetch multiple pages of job postings. This means we're not just scraping the first page. Instead, for each page we:
Calculate the start parameter (which Indeed uses for pagination)
Format the Indeed URL with our search query and location
Make a request to ScrapingBee's Indeed API, passing our API key and the Indeed URL
Extract job data from each successful response
You may wonder if you need JavaScript rendering to access all the Indeed data. The answer is yes, but when you're creating an Indeed scraper with our platform, JavaScript rendering is enabled by default. It's one less thing to worry about when you're web scraping.
However, if you're feeling like disabling this feature or looking for more advanced settings to scrape Indeed job postings, check out our documentation.
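For example, if the job cards you need turn out to be present in the static HTML, you can switch rendering off with the render_js parameter and save credits. Here's a minimal sketch; everything else matches the call above:

response = requests.get(
    "https://app.scrapingbee.com/api/v1/",
    params={
        'api_key': API_KEY,
        'url': url,
        'render_js': 'false',  # Skip the default JavaScript rendering
    }
)

Requests without JavaScript rendering are typically faster and consume fewer API credits, so it's worth testing whether the data you need survives without it.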
Extract and Format the Data
Once we have the HTML, we need to parse it and extract the job information. This is where BeautifulSoup comes into play:
def extract_job_data(html):
    soup = BeautifulSoup(html, 'html.parser')
    jobs = []
    for job_card in soup.select('div.job_seen_beacon'):
        title_elem = job_card.select_one('h2.jobTitle span')
        company_elem = job_card.select_one('span.companyName')
        location_elem = job_card.select_one('div.companyLocation')
        job = {
            'title': title_elem.get_text(strip=True) if title_elem else None,
            'company': company_elem.get_text(strip=True) if company_elem else None,
            'location': location_elem.get_text(strip=True) if location_elem else None
        }
        jobs.append(job)
    return jobs
HTML parsing is a bit like surgery – you need to know exactly where to look and what to extract. I've found that Indeed occasionally updates its HTML structure, so you might need to adjust these selectors if they stop working. So, if you get stuck, check out this HTML Parsing Tutorial.
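One field the parser above doesn't capture yet is the job link mentioned in the intro. Here's a sketch of how you might pull it from each card; the anchor selector is an assumption based on Indeed's current markup and may need the same kind of adjustment:

def extract_job_link(job_card):
    # Assumed selector: the job title is usually wrapped in an anchor tag
    link_elem = job_card.select_one('h2.jobTitle a')
    if link_elem and link_elem.get('href'):
        # hrefs on the search page are relative, so prepend the domain
        return 'https://www.indeed.com' + link_elem['href']
    return None

You can then add 'link': extract_job_link(job_card) to the job dictionary inside extract_job_data.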
Prepare Your Data for Analysis
Now, let's add functions to output your Indeed job data. You can print it as JSON for immediate viewing:
if __name__ == "__main__":
    jobs = scrape_indeed_jobs("python developer", "New York", pages=2)
    print(json.dumps(jobs, indent=2))
Or save it to a CSV file for further data analysis in Google Sheets or Excel:
def save_to_csv(jobs, filename='jobs.csv'):
    if not jobs:
        print("No jobs to save.")
        return
    keys = jobs[0].keys()
    with open(filename, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=keys)
        writer.writeheader()
        writer.writerows(jobs)

# Usage
# save_to_csv(jobs)
By looping through the next page parameter (start=), you can scrape job titles, company names, and locations in one run. From there, it’s easy to save the data in CSV format or JSON for analysis.
Whether you’re tracking Python jobs, building a job board, or doing market analysis, an Indeed scraper built with ScrapingBee is a fast, scalable solution.
Tips for Scraping Indeed at Scale
In my experience, scraping job search websites at scale can feel like trying to fill a bucket with a hole in it. If you don't address certain challenges, your data collection efforts can be an absolute waste.
Here are some tips I've learned the hard way:
Handle rate limits when gathering Indeed data
Indeed will throttle your requests if you hit them too quickly. With our platform, you get automatic proxy rotation, but you should still space out your API calls for large-scale scraping. Consider adding a slight delay between pages:

import time

time.sleep(1)  # Add a 1-second delay between page requests
Geo-targeting for regional job listings
Indeed shows different results based on your location. Our API's country_code parameter lets you specify which country's proxies to use. This is essential if you're targeting job postings in specific regions:

params={
    'api_key': API_KEY,
    'url': url,
    'render_js': 'true',
    'country_code': 'uk'  # Use UK proxies for UK job listings
}
Error handling is your friend
Web scraping is unpredictable. Add robust error handling to your code to catch and respond to failures:

try:
    response = requests.get(...)
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")
    # Implement retry logic here
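To flesh out that retry logic, a simple exponential backoff usually does the job. Here's a minimal sketch; fetch_with_retries is a hypothetical helper name, and the retry count and delays are assumptions to tune for your workload:

def fetch_with_retries(params, max_retries=3):
    """Retry failed API calls, backing off 1s, 2s, 4s between attempts."""
    for attempt in range(max_retries):
        try:
            response = requests.get(
                "https://app.scrapingbee.com/api/v1/",
                params=params,
                timeout=20
            )
            if response.ok:
                return response
            print(f"Attempt {attempt + 1} returned status {response.status_code}")
        except requests.exceptions.RequestException as e:
            print(f"Attempt {attempt + 1} failed: {e}")
        time.sleep(2 ** attempt)  # Back off before retrying
    return None

Exponential backoff gives a temporarily overloaded endpoint room to recover instead of hammering it at a fixed rate.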
In the example below, we'll combine everything into a single workflow that scrapes multiple pages, extracts job metadata from Indeed, and exports the results. This ensures your project setup is ready for any job search automation task.
Example: Paginate and Export 100 Job Postings
Now let’s put everything together to extract job listings at scale – for example, 100 results, which equals about 10 pages on Indeed.
This code combines pagination with CSV export:
def main():
    query = "data scientist"
    location = "remote"
    pages = 10  # 10 pages × 10 results = ~100 jobs

    print(f"Scraping {pages} pages of {query} jobs in {location}...")
    jobs = scrape_indeed_jobs(query, location, pages)

    filename = f"{query.replace(' ', '_')}_{location}_jobs.csv"
    save_to_csv(jobs, filename)
    print(f"Scraped {len(jobs)} jobs and saved to {filename}")

if __name__ == "__main__":
    main()
Now you know how an Indeed scraper can scrape job postings across multiple pages, collecting structured job data. Instead of writing time-consuming code to manage proxies or headers such as Accept-Language and Accept-Encoding, our platform simplifies the web scraping process.
Scraping Indeed Job Listings the Easy Way
Scraping job data from Indeed doesn’t have to be time-consuming or frustrating. Our API lets you build a reliable Indeed scraper that extracts the same data you’d expect from a job board. In a matter of minutes, you can collect dozens of job titles and even job details like salary ranges or descriptions. Then you can export the data and sort it by company name.
You don't need to worry about proxies or headless browsers – these are on by default. That means you can focus on building your job scraper for market analysis, research, or even automating your next job search.
By combining our API with Python, you can:
Scrape Indeed job postings across multiple pages without hitting CAPTCHAs
Collect structured job openings in clean CSV format or JSON
Scale up your job scraping process to include advanced job position details
Use the same code to scrape other websites, not just Indeed
If you're ready to create your own Indeed scraper, sign up for ScrapingBee’s API and see how fast you can go from the first page of results to a complete dataset of job postings.
Frequently Asked Questions (FAQs)
Is scraping Indeed allowed?
Indeed's terms of service don't explicitly allow scraping. It's a bit like visiting a store where there's no clear "no photos" sign, but the security guard might still ask you to stop if they notice. For commercial purposes, I recommend looking into Indeed's public API options first. If you do scrape, make sure you respect robots.txt directives and don't overload their servers.
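If you want to check those robots.txt directives programmatically before scraping, Python's standard library includes a parser. A minimal sketch:

from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("https://www.indeed.com/robots.txt")
rp.read()
print(rp.can_fetch("*", "https://www.indeed.com/jobs?q=python+developer"))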
How to avoid getting blocked by Indeed?
Getting blocked by Indeed is less likely if you follow best practices. First, add delays between requests. Second, randomize the timing of your requests. Third, limit concurrent requests. And finally, use solutions like ScrapingBee to ensure your requests appear to come from a real browser. The sketch below illustrates the first two tips.
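Here's a minimal sketch of randomized delays between page fetches; the delay range is an arbitrary assumption to tune, and fetch_page is a hypothetical stand-in for your own fetching function:

import random
import time

def polite_sleep(min_seconds=1.0, max_seconds=4.0):
    # A randomized pause avoids the fixed, bot-like rhythm of a constant delay
    time.sleep(random.uniform(min_seconds, max_seconds))

# for page in range(pages):
#     fetch_page(page)   # hypothetical stand-in for your fetch logic
#     polite_sleep()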
How can I scrape jobs by keyword or location?
It's as simple as changing the query and location parameters in your URL. For example, to search for "machine learning" jobs in "Chicago", use this URL: url = "https://www.indeed.com/jobs?q=machine+learning&l=Chicago" (note that spaces are encoded as plus signs).
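If you'd rather not hand-encode spaces, Python's standard library can build the query string for you. A minimal sketch:

from urllib.parse import urlencode

params = {'q': 'machine learning', 'l': 'Chicago'}
url = f"https://www.indeed.com/jobs?{urlencode(params)}"
# -> https://www.indeed.com/jobs?q=machine+learning&l=Chicago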
What is the best way to extract job details like salary or description?
To extract the job description, salary, and other details, you'll need to scrape the individual job pages. First, extract the job URLs from the search results page. Then make a separate request to each job URL. Parse the detailed page for salary, description, requirements, etc. This two-stage approach gives you complete job details.
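As a rough illustration of that two-stage approach, here's a sketch that fetches one job page through the API and parses a couple of fields; the detail-page selectors are assumptions based on Indeed's current markup and will likely need adjusting:

def scrape_job_details(job_url):
    # Stage two: fetch an individual job page through ScrapingBee
    response = requests.get(
        "https://app.scrapingbee.com/api/v1/",
        params={'api_key': API_KEY, 'url': job_url},
        timeout=20
    )
    if not response.ok:
        return None
    soup = BeautifulSoup(response.text, 'html.parser')
    # Assumed selectors; adjust if Indeed's markup changes
    description = soup.select_one('div#jobDescriptionText')
    salary = soup.select_one('div#salaryInfoAndJobType')
    return {
        'description': description.get_text(strip=True) if description else None,
        'salary': salary.get_text(strip=True) if salary else None
    }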

Kevin worked in the web scraping industry for 10 years before co-founding ScrapingBee. He is also the author of the Java Web Scraping Handbook.