How to Scrape Glassdoor: Job Titles, Salaries, and Company Ratings

02 September 2025 | 14 min read

Trying to learn how to scrape Glassdoor data? You're in the right place. In this guide, I’ll show you exactly how to extract job titles, descriptions, salaries, and company information using ScrapingBee’s API.

You may already know this – Glassdoor is a goldmine of information, but scraping it can be a challenging task. The site uses dynamic content loading and sophisticated bot protection, which puts it out of reach for the average web scraper. I’ve spent countless hours battling these defenses with custom solutions, with little luck.

Now I use ScrapingBee as my main web scraper, simply because it handles all the complex stuff for you – proxies, browser headers, and JavaScript rendering. It lets you focus on the Glassdoor data rather than on sourcing and configuring proxies.

I'll show exactly how straightforward this service is. By the end of this guide, you'll know how to build a reliable Glassdoor scraper to access data, such as company details, job descriptions, and employee reviews – all with basic knowledge of Python coding. Let's dive in.

Quick Answer (TL;DR)

ScrapingBee provides a scraping API with JavaScript rendering and custom headers that works well on Glassdoor. Using it is straightforward: define your target Glassdoor URL and extract job titles, companies, ratings, and salaries with this code:

from scrapingbee import ScrapingBeeClient
import json

# Step 1: Initialize ScrapingBee client
api_key = "YOUR_API_KEY"
client = ScrapingBeeClient(api_key=api_key)

# Step 2: Define target URL (job listings for a search term)
url = "https://www.glassdoor.com/Job/software-engineer-jobs-SRCH_KO0,17.htm"

# Step 3: Set up extract rules for job data
extract_rules = {
  "jobs": {
    "selector": ".react-job-listing",
    "type": "list",
    "output": {
      "title": {"selector": ".jobLink", "output": "text"},
      "company": {"selector": ".jobEmpolyerName", "output": "text"},
      "location": {"selector": ".jobLocation", "output": "text"},
      "salary": {"selector": ".salarySnippet", "output": "text"},
      "rating": {"selector": ".jobRating span", "output": "text"}
    }
  }
}

# Step 4: Call ScrapingBee with extract_rules
response = client.get(url, params={
    'extract_rules': extract_rules,
    'render_js': True  # ensure JS content is rendered
})

data = response.json()

# Step 5: Iterate and display results
for job in data.get("jobs", []):
    print("Title  :", job.get("title"))
    print("Company:", job.get("company"))
    print("Location:", job.get("location"))
    print("Salary :", job.get("salary"))
    print("Rating :", job.get("rating"))
    print("-" * 40)

This code snippet demonstrates the core functionality, but there’s much more to learn about effectively scraping Glassdoor data. Let’s break down this process step by step so you understand exactly how it works.

How to Scrape Glassdoor with ScrapingBee

If you've tried web scraping before, you've certainly faced JavaScript rendering issues and anti-bot measures. Glassdoor loads content dynamically, which means a simple HTTP request won’t work. You need a browser environment that executes JavaScript and renders the page properly.

That's what I like about our platform. It solves this problem by providing a headless browser infrastructure that renders pages just like a real browser would.

When you make a request through ScrapingBee, the scraping process looks like this:

  1. It sends the request through clean proxies to avoid IP blocks

  2. Then, renders all JavaScript on the page

  3. Waits for the content to fully load

  4. Extracts the data according to your specifications

I’ve found this approach much more reliable than trying to maintain my own proxy infrastructure or browser automation setup. If you want to learn more about our platform's capabilities, check out the documentation.

Understanding Glassdoor’s Structure to Extract Company Details

Before diving into the code, it’s important to understand how Glassdoor organizes its data. The website follows specific URL patterns that make systematic scraping possible:

  • Reviews: https://www.glassdoor.com/Reviews/{company-name}-Reviews-E{company-id}.htm

  • Salaries: https://www.glassdoor.com/Salary/{company-name}-Salaries-E{company-id}.htm

  • Interviews: https://www.glassdoor.com/Interview/{company-name}-Interview-Questions-E{company-id}.htm

  • Job search: https://www.glassdoor.com/Job/{search-term}-jobs-SRCH_KO0,{length}.htm

Each of these sections contains valuable data fields that can be extracted using the right selectors. The company ID is a unique identifier that Glassdoor assigns to each company. Keep this in mind when web scraping – this ID is essential for targeting specific company data.
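To keep these patterns in one place, you can compose the URLs programmatically. This helper simply mirrors the path formats used in this guide; treat it as illustrative, since Glassdoor may change its URL scheme at any time:

```python
def glassdoor_urls(company_name: str, company_id: int) -> dict:
    """Build the main Glassdoor URL patterns for a company.

    The path formats mirror the ones used throughout this guide;
    they are observed patterns, not a documented API.
    """
    base = "https://www.glassdoor.com"
    return {
        "reviews": f"{base}/Reviews/{company_name}-Reviews-E{company_id}.htm",
        "salaries": f"{base}/Salary/{company_name}-Salaries-E{company_id}.htm",
        "interviews": f"{base}/Interview/{company_name}-Interview-Questions-E{company_id}.htm",
    }

urls = glassdoor_urls("Example-Company", 12345)
print(urls["reviews"])
# → https://www.glassdoor.com/Reviews/Example-Company-Reviews-E12345.htm
```

Centralizing the URL construction means that when Glassdoor changes a path format, you only need to update one function.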

Set Up Python and ScrapingBee

Now it's time to kick-start the web scraping process. First, you’ll need to set up your environment. Here’s how you should do it:

  1. We'll be scraping with Python, so go to the official website and download it.

  2. Run the installer and make sure to check the box that says “Add Python to PATH” before clicking Install.

  3. Verify the installation by opening a terminal or command prompt and typing:

    python --version
    

    You should see the Python version you installed.

  4. If you haven't already, sign up at ScrapingBee and grab your API key from the Dashboard.


  5. Install the ScrapingBee Python SDK (it wraps the underlying requests logic and simplifies extract rules):

    pip install scrapingbee

  6. Now, initialize the client in your Python script:

    from scrapingbee import ScrapingBeeClient

    api_key = "YOUR_API_KEY"  # Replace with your actual key
    client = ScrapingBeeClient(api_key=api_key)

The ScrapingBee SDK makes the whole process much simpler than writing raw requests code. It handles authentication, request formatting, and response parsing for you.
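Since the SDK returns standard HTTP responses, you can add a simple safety net around your calls. The retry loop and status-code check below are my own additions, not part of the SDK – a minimal sketch of defensive scraping:

```python
import time

def fetch_with_retry(client, url, params, retries=3, delay=5):
    """Call client.get() and retry on non-200 responses.

    ScrapingBee surfaces standard HTTP status codes, so anything
    other than 200 is worth retrying or logging.
    """
    for attempt in range(1, retries + 1):
        response = client.get(url, params=params)
        if response.status_code == 200:
            return response
        print(f"Attempt {attempt} failed with status {response.status_code}")
        time.sleep(delay)
    raise RuntimeError(f"All {retries} attempts failed for {url}")
```

In practice, a handful of retries with a short delay absorbs most transient failures without complicating the rest of your script.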

One feature I particularly like is the JavaScript scenario feature. This allows you to execute custom JavaScript on the page before extraction. For example, you could use it to:

  • Click on elements to load more content

  • Scroll down to trigger lazy loading

  • Dismiss popup dialogs that might block content

  • Fill in forms or perform searches

Here’s an example of using a JavaScript scenario to scroll down a Glassdoor page to load more reviews:

scroll_scenario = {
    "instructions": [
        {"scroll_y": 10000},  # scroll toward the bottom of the page
        {"wait": 2000},       # give lazy-loaded content time to appear
        {"scroll_y": 10000},  # scroll again to trigger the next batch
        {"wait": 2000}        # final wait before extraction
    ]
}

response = client.get(
    url,
    params={
        'render_js': True,
        'js_scenario': scroll_scenario
    }
)

This capability is particularly useful when scraping interview reviews that load incrementally as you scroll down the page.

Data Extraction with CSS Selectors or Extraction Rules

Now, let's use your Glassdoor scraper to extract data.

When you scrape Glassdoor reviews, you gain valuable insights into company culture and employee satisfaction. The key to successful extraction is identifying the right CSS selectors.

Glassdoor uses React and loads data dynamically, but many fields are embedded in the initial HTML or GraphQL payload. For demonstration, we’ll target a job listing page for a particular search term:

https://www.glassdoor.com/Job/software-engineer-jobs-SRCH_KO0,17.htm


Using extract rules is the most efficient way to get structured data from Glassdoor.

Here’s how to set them up:

url = "https://www.glassdoor.com/Job/software-engineer-jobs-SRCH_KO0,17.htm"

extract_rules = {
  "jobs": {
    "selector": ".react-job-listing",
    "type": "list",
    "output": {
      "title": {"selector": ".jobLink", "output": "text"},
      "company": {"selector": ".jobEmpolyerName", "output": "text"},
      "location": {"selector": ".jobLocation", "output": "text"},
      "salary": {"selector": ".salarySnippet", "output": "text"},
      "rating": {"selector": ".jobRating span", "output": "text"}
    }
  }
}

This configuration tells ScrapingBee to:

  • Find all elements matching .react-job-listing (each job data card)

  • For each job card, extract the text from the specified selectors

  • Return the data as a structured JSON object

I always recommend previewing selectors in browser developer tools using document.querySelector to confirm they’re valid before running your scraper. This saves time and prevents errors in your extraction process.

For example, to test whether a job title selector works, open the Console tab in your browser’s developer tools while viewing a Glassdoor job listings page and type:

document.querySelectorAll('[data-test="job-title"]').forEach(el => 
  console.log(el.textContent.trim())
);


If this prints the job titles correctly, you know your selector is working.

Once you’ve set up your extraction rules, making the request is straightforward:

response = client.get(url, params={
    'extract_rules': extract_rules,
    'render_js': True  # ensure JS content is rendered
})

data = response.json()

The response will contain your structured data, ready for analysis or storage.
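To give a feel for the shape of the result, here's how you might filter the parsed JSON. The sample data below is made up – the keys match the extract rules above, but the values depend entirely on what Glassdoor serves:

```python
# Hypothetical parsed response, shaped like the extract rules above
data = {
    "jobs": [
        {"title": "Software Engineer", "company": "Acme", "location": "Remote",
         "salary": "$120K-$150K", "rating": "4.2"},
        {"title": "Backend Engineer", "company": "Globex", "location": "NYC",
         "salary": None, "rating": "3.9"},
    ]
}

# Keep only listings that actually include a salary field
with_salary = [job for job in data["jobs"] if job.get("salary")]
print(len(with_salary))  # → 1
```

Because the API returns plain JSON, this kind of filtering and reshaping is just ordinary Python list and dict work.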

Scraping Different Types of Glassdoor Data

Glassdoor contains a wide variety of valuable information beyond job titles. For anyone interested in career research, salary benchmarking, building a job board, or market insights, it’s worth understanding the different data categories available.

With a web scraping tool like ScrapingBee, you can simulate a browser request, bypass many anti-bot measures, and reliably fetch every data set. Let's take a look at how it works in action.

Scraping Glassdoor Reviews

To scrape reviews, you’ll need to target the reviews page and extract each review element.

Use this Python script to fetch Glassdoor reviews:

def scrape_company_reviews(company_name, company_id, page=1):
    url = f"https://www.glassdoor.com/Reviews/{company_name}-Reviews-E{company_id}_P{page}.htm"
    
    extract_rules = {
        "reviews": {
            "selector": ".empReview",
            "type": "list",
            "output": {
                "title": {"selector": ".reviewLink", "output": "text"},
                "rating": {"selector": ".ratingNumber", "output": "text"},
                "position": {"selector": ".authorJobTitle", "output": "text"},
                "date": {"selector": ".authorInfo .date", "output": "text"},
                "pros": {"selector": ".pros", "output": "text"},
                "cons": {"selector": ".cons", "output": "text"},
                "advice": {"selector": ".adviceMgmt", "output": "text"}
            }
        }
    }
    
    response = client.get(
        url,
        params={
            'extract_rules': extract_rules,
            'render_js': True
        }
    )
    
    return response.json()

This function allows you to scrape employee reviews for a specific company, with pagination support. The extracted data includes the review title, rating, employee position, date, pros, cons, and advice to management.
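Building on that pagination support, a multi-page crawl might look like this. The stopping condition is an assumption – I treat an empty page as the end of the results, but in practice you'd check however Glassdoor actually signals the last page:

```python
import time

def scrape_all_reviews(fetch_page, company_name, company_id, max_pages=5, delay=2):
    """Collect reviews across several pages, pausing between requests.

    fetch_page is a callable with the same signature as
    scrape_company_reviews(company_name, company_id, page=...).
    """
    all_reviews = []
    for page in range(1, max_pages + 1):
        result = fetch_page(company_name, company_id, page=page)
        reviews = result.get("reviews", [])
        if not reviews:  # assume an empty page means we're past the end
            break
        all_reviews.extend(reviews)
        time.sleep(delay)  # be polite between requests
    return all_reviews
```

Passing the fetch function in as a parameter also makes the loop easy to test with stubbed data before spending API credits.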

Scraping Salary Data with a Glassdoor Scraper

Salary information is another valuable data point available on Glassdoor. Here’s how to extract it:

def scrape_salary_data(company_name, company_id):
    url = f"https://www.glassdoor.com/Salary/{company_name}-Salaries-E{company_id}.htm"
    
    extract_rules = {
        "salaries": {
            "selector": ".salaryRow",
            "type": "list",
            "output": {
                "job_title": {"selector": ".jobTitle", "output": "text"},
                "salary_range": {"selector": ".salaryRange", "output": "text"},
                "base_pay": {"selector": ".basePay", "output": "text"},
                "additional_pay": {"selector": ".additionalPay", "output": "text"},
                "sample_size": {"selector": ".sampleSize", "output": "text"}
            }
        }
    }
    
    response = client.get(
        url,
        params={
            'extract_rules': extract_rules,
            'render_js': True
        }
    )
    
    return response.json()

This function extracts salary information for different job titles within company pages, including salary ranges, base pay, additional pay, and sample size.
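Scraped salary fields arrive as display strings, so you'll usually want to normalize them before analysis. Here's a rough parser – the input format (e.g. "$85K - $120K") is an assumption about how Glassdoor renders ranges, so adjust the regex to what you actually see:

```python
import re

def parse_salary_range(text):
    """Extract numeric low/high values from a string like '$85K - $120K'.

    Returns (low, high) in dollars, or None when no numbers are found.
    """
    if not text:
        return None
    values = []
    for match in re.finditer(r"(\d[\d,]*(?:\.\d+)?)\s*([KkMm]?)", text):
        value = float(match.group(1).replace(",", ""))
        suffix = match.group(2).lower()
        if suffix == "k":
            value *= 1_000      # '85K' -> 85000.0
        elif suffix == "m":
            value *= 1_000_000  # '1.2M' -> 1200000.0
        values.append(value)
    if not values:
        return None
    return (min(values), max(values))

print(parse_salary_range("$85K - $120K"))  # → (85000.0, 120000.0)
```

A single figure like "$1.2M" comes back with equal low and high values, which keeps downstream code uniform.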

Scraping Interview Questions

Interview questions can provide valuable insights for job seekers. Here’s how to extract them:

def scrape_interview_questions(company_name, company_id):
    url = f"https://www.glassdoor.com/Interview/{company_name}-Interview-Questions-E{company_id}.htm"
    
    extract_rules = {
        "interviews": {
            "selector": ".interviewQuestion",
            "type": "list",
            "output": {
                "question": {"selector": ".questionText", "output": "text"},
                "job_title": {"selector": ".jobTitle", "output": "text"},
                "difficulty": {"selector": ".difficultyLabel", "output": "text"},
                "experience": {"selector": ".interviewReview", "output": "text"}
            }
        }
    }
    
    response = client.get(
        url,
        params={
            'extract_rules': extract_rules,
            'render_js': True
        }
    )
    
    return response.json()

This function extracts interview questions, associated job titles, difficulty ratings, and interview reviews.
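Once collected, these fields lend themselves to quick aggregation – for instance, tallying difficulty labels across questions. The sample records here are made up, shaped like the extract rules above:

```python
from collections import Counter

# Made-up sample shaped like the interview extract rules above
interviews = [
    {"question": "Reverse a linked list", "difficulty": "Medium"},
    {"question": "Design a URL shortener", "difficulty": "Hard"},
    {"question": "FizzBuzz", "difficulty": "Easy"},
    {"question": "Two-sum", "difficulty": "Medium"},
]

# Count how often each difficulty label appears
difficulty_counts = Counter(item["difficulty"] for item in interviews)
print(difficulty_counts.most_common())
# → [('Medium', 2), ('Hard', 1), ('Easy', 1)]
```

This kind of summary is a quick sanity check that your selectors are capturing real labels rather than empty strings.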

Exporting Scraped Glassdoor Data

Once you’ve collected company data, the next step is to store the scraped data in a format that’s easy to analyze or share. The two most common options are CSV files and JSON files.

Export to JSON

ScrapingBee’s API already returns data as JSON by default. You can save it directly to a file like this:

import json

with open("glassdoor_data.json", "w") as f:
    json.dump(data, f, indent=2)

This format is great if you plan to process the data programmatically or feed it into another application.

Export to CSV

If you want to work with your results in Excel, Google Sheets, or a data analysis tool, convert them into CSV:

import csv

with open("glassdoor_data.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Title", "Company", "Location", "Salary", "Rating"])  # headers
    
    for job in data.get("jobs", []):
        writer.writerow([
            job.get("title"),
            job.get("company"),
            job.get("location"),
            job.get("salary"),
            job.get("rating")
        ])

This creates a spreadsheet-friendly version of your scraped Glassdoor dataset, making it easy to filter, chart, or compare different roles across company pages and job boards.

Additional Considerations

When web scraping Glassdoor, there are three key technical challenges to keep in mind: respecting rate limits, managing pagination, and waiting for dynamically loaded elements. Handling these properly ensures your scraper runs smoothly, avoids unnecessary errors, and minimizes the risk of being blocked.

Waiting for Elements

If job listings load slowly, you can use the wait_for parameter to ensure content is fully loaded before extraction:

response = client.get(url, params={
    'extract_rules': extract_rules,
    'render_js': True,
    'wait_for': '.react-job-listing'  # Wait until this selector appears
})

Pagination

Glassdoor pages use React and AJAX for pagination. For basic listings, you can modify the URL to access different pages. For example:

# Page 1
url = "https://www.glassdoor.com/Job/software-engineer-jobs-SRCH_KO0,17.htm"

# Page 2
url = "https://www.glassdoor.com/Job/software-engineer-jobs-SRCH_KO0,17_IP2.htm"

For deeper scraping (like all reviews, job descriptions, or salary history), you might need to use GraphQL extraction or implement scrolling strategies using JavaScript scenarios.
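The _IP{n} suffix shown above can be generated programmatically. This helper assumes the pattern holds for every results page, which is worth verifying in your browser before relying on it:

```python
def job_search_page_url(base_url: str, page: int) -> str:
    """Build the URL for a given page of a Glassdoor job search.

    Page 1 uses the plain URL; later pages insert an _IP{n} suffix
    before the .htm extension (an observed, not documented, pattern).
    """
    if page <= 1:
        return base_url
    return base_url.replace(".htm", f"_IP{page}.htm")

base = "https://www.glassdoor.com/Job/software-engineer-jobs-SRCH_KO0,17.htm"
print(job_search_page_url(base, 2))
# → https://www.glassdoor.com/Job/software-engineer-jobs-SRCH_KO0,17_IP2.htm
```

Combining this with a loop and your extract rules lets you walk through as many result pages as you need.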

Respect Rate Limits & Robots.txt

Even with ScrapingBee handling the technical aspects, it’s important to scrape responsibly:

  • Space out your requests to avoid overwhelming the site

  • Don’t extract more data than you need

  • Check Glassdoor’s robots.txt for any specific directives

  • Consider using the official Glassdoor API if available for your use case
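A simple way to put the first point into practice is to throttle your own request loop. The delay values here are arbitrary placeholders – tune them to your volume and tolerance:

```python
import random
import time

def polite_get(client, urls, params, min_delay=3, max_delay=8):
    """Fetch a list of URLs with a randomized pause between requests.

    Randomized delays avoid the perfectly regular request cadence
    that rate limiters tend to flag.
    """
    results = []
    for i, url in enumerate(urls):
        results.append(client.get(url, params=params))
        if i < len(urls) - 1:  # no need to sleep after the last request
            time.sleep(random.uniform(min_delay, max_delay))
    return results
```

Even with ScrapingBee rotating proxies for you, spacing requests out keeps your account usage predictable and the target site happy.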

Scraping publicly available Glassdoor data may be legal, and scraping responsibly is your first line of defense – but it isn't always enough. Glassdoor employs a range of anti-bot measures designed to detect and block automated traffic. In the next section, we’ll explore how to recognize these barriers and handle them effectively.

Handling Glassdoor Anti-Bot Measures

Learning how to avoid getting blocked on Glassdoor can save you hours of development time. One of the biggest challenges is dealing with anti-bot protection.

You’ll need a few specific web scraping tools, which I’ll cover below.

User-Agent Rotation

Glassdoor tracks browser fingerprints to identify scrapers. ScrapingBee rotates user-agents so that each request appears to come from a different browser:

response = client.get(url, params={
    'render_js': True,
    'premium_proxy': 'true'  # Uses premium proxies with rotating user-agents
})

IP Rotation

To avoid getting blocked when scraping Glassdoor, IP address rotation is crucial. ScrapingBee automatically rotates IP addresses to prevent Glassdoor from detecting patterns in your requests:

response = client.get(url, params={
    'render_js': True,
    'country_code': 'us'  # Optionally specify country for IPs
})

JavaScript Rendering

Since Glassdoor relies heavily on JavaScript, our platform renders the full page just like a real browser:

response = client.get(url, params={
    'render_js': True,
    'premium_proxy': 'true'
})

I’ve found that these features, when combined, make our platform significantly more reliable than custom solutions. In my experience, custom scrapers for Glassdoor typically need constant maintenance as the site updates its defenses.

Why Scrape Glassdoor Data?

Now that you're familiar with the technical details of how to scrape Glassdoor, let’s explore why this data is so valuable. Glassdoor provides a wide range of information that benefits various stakeholders.

Market Research and Competitive Intelligence

Glassdoor provides unique insights that aren’t easily available elsewhere. When you extract data from Glassdoor, you gain access to:

  • Salary benchmarks: Understanding compensation trends across industries, roles, and locations

  • Company culture insights: Employee sentiment analysis based on reviews

  • Benefits information: You can compare what perks and benefits companies offer

  • Interview processes: How companies conduct their hiring

For example, a tech startup might scrape Glassdoor job data to understand the competitive salary landscape before setting its own compensation packages. This helps them stay competitive without overspending.

Recruitment and HR Applications

HR professionals and recruiters can leverage a Glassdoor scraper to:

  • Track employer brand reputation through company reviews

  • Monitor employee satisfaction trends

  • Identify common complaints or praise points

  • Benchmark their company against competitors

I once worked with a mid-sized tech company that utilized scraped Glassdoor company reviews to identify and address recurring themes in negative feedback, ultimately improving their retention rates.

Investment Research

Investors and financial analysts often scrape Glassdoor to gather intelligence on companies they’re evaluating:

  • Employee sentiment as a leading indicator of company performance

  • Job openings may signal expansion or contraction

  • Executive approval ratings

  • Salary growth or stagnation

The data points available through Glassdoor data extraction can provide valuable signals that complement traditional financial analysis.

Unlocking Glassdoor Data with Web Scraping

With the right web scraper, Glassdoor becomes a rich source of public data on job postings, company pages, and salary information from current and former employees. Using a tool like ScrapingBee makes it simple to extract desired data fields – from job openings and company names to total pay and reviews – without worrying about JavaScript, CAPTCHAs, or IP blocks.

The scraped results can easily be exported into a CSV file or JSON file, ready for analysis or integration into a job board or HR project. Whether you’re tracking machine learning engineer roles on the first page of a search or collecting data across more pages, our platform handles the heavy lifting so you can focus on insights.

Always scrape responsibly: respect Glassdoor’s servers, check robots.txt, and confirm whether it’s legal to scrape for your use case.

With these practices in place, you can reliably collect all the data you need from Glassdoor and turn it into actionable insights.

Start Scraping Glassdoor with ScrapingBee Today

Ready to extract valuable insights from Glassdoor? ScrapingBee makes it simple to get started:

  1. Sign up for a free account at scrapingbee.com

  2. Get 1,000 free API credits to test your Glassdoor scraping

  3. Use the code examples from this guide to start extracting job listings, company reviews, and salary data

The setup takes less than 5 minutes, and you’ll save countless hours compared to building and maintaining your own scraping infrastructure.

Frequently Asked Questions (FAQs)

Is it legal to scrape Glassdoor?

Web scraping publicly available data is generally legal, but you should use the data responsibly. Avoid republishing content verbatim, respect Glassdoor’s terms of service regarding data usage, and consider seeking legal advice for your specific use case.

Why does Glassdoor block my scraper?

Glassdoor blocks scrapers to protect its data and server resources. They detect unusual patterns like too many requests from one IP, missing cookies/headers, or bot-like behavior. ScrapingBee helps avoid these issues by mimicking real user behavior.

Can I get salary info from Glassdoor listings?

Yes, you can scrape salary information from Glassdoor job listings when available. Not all listings include salary data, but when present, it can be extracted using the .salarySnippet selector as shown in our examples.

Does ScrapingBee work for logged-in Glassdoor pages?

ScrapingBee can handle some login-protected content using cookies. For simple cases, you can extract cookies from your browser and pass them with your request. For more complex scenarios, ScrapingBee’s JavaScript scenario feature can automate the login process.
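For the simple case, passing browser cookies looks roughly like this. The cookie names are placeholders, and the semicolon-separated string is how I understand ScrapingBee's cookies parameter to be formatted – verify against the docs for your SDK version:

```python
def cookie_header(cookies: dict) -> str:
    """Serialize cookies into a 'name=value;name2=value2' string.

    This matches my reading of ScrapingBee's cookies parameter
    format; double-check the current API documentation.
    """
    return ";".join(f"{name}={value}" for name, value in cookies.items())

# Placeholder values copied from a logged-in browser session
params = {
    'render_js': True,
    'cookies': cookie_header({"GSESSIONID": "abc123", "at": "xyz789"}),
}
print(params['cookies'])  # → GSESSIONID=abc123;at=xyz789
```

You would then pass this params dict to client.get() alongside your logged-in target URL.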

Kevin Sahin

Kevin worked in the web scraping industry for 10 years before co-founding ScrapingBee. He is also the author of the Java Web Scraping Handbook.