
Python web scraping JavaScript: How to scrape dynamic pages

09 March 2026 | 23 min read

Python web scraping JavaScript pages can feel confusing the first time you try it. You write a simple scraper with requests and BeautifulSoup, run it against a website, and instead of useful data you get an almost empty page. Meanwhile the browser clearly shows tables, prices, comments, or products.

The reason is simple: many modern websites build their content with JavaScript after the page loads. Your browser runs those scripts automatically, but a basic Python scraper only downloads the initial HTML.

In this guide, we'll break down how Python web scraping JavaScript pages actually works and what tools you can use to handle it. You'll learn how to recognize JS-rendered pages, how to extract data from hidden APIs, and when to use tools like Selenium if a real browser is required.

We'll also walk through a practical example so you can see the full workflow in action. By the end, you should have a clear path for scraping dynamic websites with Python, even if you're just getting started.


Quick answer (TL;DR)

Python web scraping JavaScript pages requires one extra step: you need a way to execute the JS before extracting the data. A scraper using requests + BeautifulSoup doesn't run any scripts, so it usually receives only the page template instead of the final content.

In practice you have three main options:

  • Call the site's hidden API directly. Many JavaScript pages fetch their data through XHR or fetch requests. If you find that endpoint in the Network tab, you can request the JSON directly from Python.
  • Render the page in a real browser. Tools like Selenium or Playwright load the page, run the scripts, and then allow you to scrape the rendered DOM.
  • Use a rendering API. A web scraping API runs the browser and proxies for you and returns the HTML or JSON.

Once the page is rendered (or the API is called), you can parse the data normally using Python tools like BeautifulSoup or CSS selectors. Detailed explanations and examples for each approach are shown below.

Why JavaScript pages break your usual Python web scraping

Sooner or later every scraper runs into the same confusing moment. You open a site in your browser and clearly see products, prices, or comments. Then your scraper runs... and returns almost nothing. Just a bit of HTML structure with no real data.

This happens all the time when scraping JavaScript pages with requests and BeautifulSoup. But why?

The page that JavaScript built

Many modern sites don't send the final content right away. The server returns a minimal HTML skeleton first, and JavaScript fills it with data afterward. This approach is called client-side rendering: the browser fetches data from APIs and injects it into the layout. That's what enables features like instant search results, dynamic filters, and infinite scrolling.

Your browser runs those scripts automatically, so everything looks normal. But a basic scraper using requests only downloads the initial HTML. Since it doesn't execute JavaScript, it often receives just the empty template instead of the final content.

A quick way to confirm this is to disable JavaScript in the browser and reload the page. If the content disappears or the page turns into a bare layout, you're probably looking at a JS-rendered site. And that's exactly what a simple scraper will see.

💡 If you want a deeper explanation of this situation, check our guide: Scraper does not see the data I see.

How JavaScript changes the page after first load

Let's walk through a simple example. Imagine a product page in an online store. When the page loads, a few things happen behind the scenes before you actually see the products.

Step 1 — The browser loads the base HTML

First, the browser requests the page and the server returns a minimal HTML document. It contains the layout and a few empty containers where the real content will appear later.

The HTML might include things like:

  • the page header
  • a product grid container
  • JavaScript files

But no actual product data yet. It can look something like this:

<script src="app.js"></script>
<div id="products"></div>

Step 2 — JavaScript starts running

Once the HTML is loaded, the browser executes the JS code. Such scripts often make a request to the site's API to fetch the data. For example:

GET /api/products

Step 3 — The API returns real data

The API sends back structured data, usually JSON, containing the product information. Typical fields might include:

  • product name
  • price
  • rating
  • image URL

Step 4 — JavaScript builds the page

The script takes the returned data and drops it into the HTML container. The empty div fills up with product cards, and suddenly you see prices, images, and product names in the browser.

Step 5 — More data can appear later

Some sites go one step further and load additional content only when needed. For example, when you scroll down:

  • JS sends another API request
  • new products are returned
  • the page extends the list with more items

From the browser's perspective everything works perfectly. But your Python scraper only saw the original HTML template from step 1. And that, folks, is the root of many Python web scraping JavaScript headaches.
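The five steps above can be sketched in plain Python. This is a toy simulation with made-up product data (no real site involved), but it shows exactly why a scraper that stops after step 1 comes back empty-handed:

```python
import json

# Step 1: the HTML a basic scraper downloads -- a skeleton with an empty container
skeleton = '<script src="app.js"></script><div id="products"></div>'

# Step 3: the kind of JSON the site's API might return (made-up sample data)
api_response = json.loads('[{"name": "Blue Widget", "price": 19.99}]')

# Step 4: what the page's JavaScript effectively does -- build cards and fill the container
cards = "".join(
    f'<div class="card">{p["name"]} - ${p["price"]}</div>' for p in api_response
)
rendered = skeleton.replace(
    '<div id="products"></div>', f'<div id="products">{cards}</div>'
)

print("Blue Widget" in skeleton)   # False: all a requests-based scraper sees
print("Blue Widget" in rendered)   # True: what the browser shows after JS runs
```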

Simple checks to confirm the data is JavaScript-rendered

Before switching scraping strategies, it helps to confirm whether JavaScript is actually responsible for the missing data. A few quick checks can usually tell you right away. Here's a checklist.

1. Compare "View Source" and "Inspect"

Right-click the page in the browser and open View Page Source. Then search for the values you want, for example a product name or price. If they are missing from the source but visible on the page, that's a strong hint that JavaScript added them later.

Next, open the Inspector in DevTools and check the DOM. If the content appears there but not in the page source, it means it was injected after the page finished loading.

2. Watch the Network tab

Open DevTools, go to the Network tab, and reload the page. Look for requests labeled:

  • XHR
  • fetch

These requests usually return the data used to build the page. Quite often you'll see endpoints like:

  • /api/products
  • /api/search
  • /graphql

Click one of these requests and inspect the response. If you see product names, prices, or other data you're after, you've likely found the real data source behind the page.

3. Slow down the page load

Another quick trick is to slow down the connection. In DevTools, open the Network tab, enable something like Slow 3G, then reload the page. If you see elements appearing gradually (for example product cards popping in one after another) that's a good sign JavaScript is building the page after the initial HTML loads.

Once you run through these checks, you can usually tell pretty quickly whether you're dealing with a normal static page or a Python web scraping JavaScript situation.

Choosing a Python approach for JavaScript-heavy websites

So, after you've confirmed a site relies on JavaScript, the next question is obvious: how do you actually scrape it?

There isn't a single "correct" way to handle Python web scraping JavaScript pages. The best approach depends on the site, how complex it is, and how much infrastructure you want to manage. In practice, most scraping setups fall into three main categories:

  • calling the site's hidden JSON APIs directly
  • running a headless browser that executes JavaScript
  • using a web scraping API that renders the page for you

All three approaches solve the same problem, just in different ways. The trick is choosing the one that fits your workflow and the project scale.

If you are exploring the Python ecosystem in general, check out this guide: Best framework for web scraping with Python.

Before jumping into heavier tools like headless browsers, it might be worth trying the simplest option first.

When a simple HTTP client is enough

In many cases, JS-based sites still expose their data through normal API requests behind the scenes. Before jumping to heavier tools, it's worth checking the Network tab in DevTools. Reload the page and look for requests labeled XHR or fetch. These often return the same data used to build the page. If you find the right endpoint, you can call it directly from Python with requests instead of scraping the rendered HTML.

The workflow usually looks like this:

  1. Open DevTools → Network tab
  2. Reload the page
  3. Find the request that returns the data
  4. Copy the request details (URL, method, headers, and payload if needed)
  5. Recreate the request in Python

Tip: In most browsers you can right-click the request and choose Copy → Copy as cURL. Then paste it into a curl-to-Python converter to quickly generate a request you can adapt for your scraper.

Now your scraper talks directly to the source instead of trying to reconstruct it from the page.
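Here's a hedged sketch of that last step. The endpoint and field names below are invented for illustration (your real endpoint will differ), and the network call appears only as a comment so the example runs on a hard-coded sample response:

```python
import json

# In a real scraper the data would come from the endpoint you found, e.g.:
#   import requests
#   data = requests.get("https://example.com/api/products?page=1").json()
# Here we parse a hard-coded sample instead (field names are illustrative).
raw = """
{
  "items": [
    {"name": "Desk Lamp", "price": 29.99, "rating": 4.5},
    {"name": "Office Chair", "price": 149.00, "rating": 4.2}
  ],
  "next_page": 2
}
"""

data = json.loads(raw)

# The response is already structured -- no HTML parsing needed
for item in data["items"]:
    print(f"{item['name']}: ${item['price']} (rating {item['rating']})")

# Many APIs also include a page number or cursor for fetching the next batch
print("next page:", data["next_page"])
```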

This approach has a few nice advantages:

  • it's much faster than rendering the whole page
  • you don't need to run a browser
  • the response is usually structured data (JSON, XML, etc.)
  • it's often lighter for the site since you're not repeatedly loading full pages

When you need a headless browser or rendering API

Sometimes the API shortcut simply isn't there. Some sites build most of the page directly in the browser, so the content only appears after several scripts run.

You'll often see this with:

  • content injected dynamically into the DOM
  • complex login flows
  • pages that require clicks or form submissions
  • infinite scrolling feeds
  • sites that hide their requests behind heavily obfuscated scripts

In these situations you need something that can actually execute JavaScript. Two common Python-friendly tools are:

  • Selenium
  • Playwright

Both launch a real browser behind the scenes (usually in headless mode). Your script opens the page, lets the scripts run, and then extracts the content from the rendered DOM.

Another option is using a web scraping API that handles the rendering for you. These services load the page on their own servers, run the JS, and return the final HTML so your Python code stays simple. For example, tools like ScrapingBee can render pages remotely and send back the result.

The idea is straightforward: if the content only exists after JavaScript runs, your scraper needs an environment where those scripts can run too.

Step-by-step: Scrape JavaScript content with Selenium in Python

When a site builds its content with JavaScript, a requests + BeautifulSoup scraper will often see only the empty HTML template. In that case you need a browser environment that can run the scripts first. That's exactly what Selenium provides. Your Python code launches a browser, opens the page, waits for the content to load, and then extracts the data from the DOM.

The typical workflow looks like this:

  1. Install Selenium
  2. Start a browser from Python
  3. Open the target page
  4. Wait for the dynamic content to appear
  5. Extract the data from the DOM
  6. Optionally parse the HTML with BeautifulSoup

If you want a deeper introduction to the tool itself, this guide is a good companion: How to web scrape with Python Selenium.

In the example below we'll scrape a dynamically displayed table from this demo site: scrapethissite.com/pages/ajax-javascript/#2015. The page shows the 2015 Oscar nominees — loaded by JavaScript, not nominated by it.

Selenium setup and waiting for dynamic elements

First install Selenium:

pip install selenium

In older tutorials you had to manually download ChromeDriver or GeckoDriver. With modern Selenium versions, this is usually no longer necessary. Selenium Manager automatically downloads the correct driver when your script runs, which makes the setup nicer.

So the basic code to launch a browser and open a page can look like this:

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://www.scrapethissite.com/pages/ajax-javascript/#2015")

Once the page opens, the next challenge is timing. JavaScript pages often load content asynchronously, so if your scraper tries to extract data immediately, the elements might not exist yet. Instead of scraping right away, it's better to wait for a specific element that indicates the page finished rendering.

In practice you'll need to inspect the page and choose something stable, like a table row, product card, or other element that appears only after the data loads. For more complex pages you can also rely on broader waiting strategies, such as waiting for network activity to settle or adding a short delay before scraping.

💡 Tip: Avoid fixed delays when possible! Using a simple time.sleep() may seem convenient, but it's rarely reliable. Page load times vary depending on the network and server response, so the delay will either be too short (and the elements aren't there yet) or too long (and your script just sits there doing nothing).

For the Oscar table on our demo page, we can wait for rows with the class .film.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = None  # define driver early so we can safely close it in the finally block

try:
    # Start a Chrome browser instance (Selenium Manager will fetch the driver if needed)
    driver = webdriver.Chrome()

    # Open the target page that loads Oscar nominees via JavaScript
    driver.get("https://www.scrapethissite.com/pages/ajax-javascript/#2015")

    # Create an explicit wait helper (max 10 seconds)
    wait = WebDriverWait(driver, 10)

    # Wait until the dynamic table rows appear in the DOM
    # This ensures JavaScript finished loading the data
    rows = wait.until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, "tr.film"))
    )
    print(f"Found {len(rows)} film rows")

finally:
    # Always close the browser, even if the wait times out
    if driver:
        driver.quit()

This tells Selenium to wait up to 10 seconds until at least one film row appears in the DOM. In most cases the script continues much sooner, as soon as the element appears. Here the 10-second value acts as a safety timeout. If the element never loads (for example due to a layout change or a network issue), Selenium stops waiting instead of hanging forever.

If you want more examples of extracting data this way, check this guide: How to extract data from website using Selenium Python.

Extracting the data with Selenium selectors

Once the rows are present in the DOM, you can loop through them and extract the data using Selenium selectors. Each row in this table contains several cells, for example:

  • .film-title
  • .film-nominations
  • .film-awards
  • .film-best-picture

Here is a simple scraper that pulls the main fields:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException, WebDriverException

driver = None  # define driver early so we can always close it safely

try:
    # Start the browser
    driver = webdriver.Chrome()

    # Note: to run Selenium without opening a visible browser window,
    # enable Chrome headless mode.
    #
    # from selenium.webdriver.chrome.options import Options
    # options = Options()
    # options.add_argument("--headless=new")
    # driver = webdriver.Chrome(options=options)
    #
    # In recent Chrome versions, "--headless" often maps to the "new" mode,
    # but "--headless=new" makes the intent explicit.

    # Open the demo page that loads Oscar nominees dynamically
    driver.get("https://www.scrapethissite.com/pages/ajax-javascript/#2015")

    # Wait helper (max 10 seconds)
    wait = WebDriverWait(driver, 10)

    # Wait until the film rows appear in the DOM
    # This ensures the JavaScript content has finished loading
    rows = wait.until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, "tr.film"))
    )

    films = []  # container for parsed results

    # Loop through each table row and extract the fields we want
    for row in rows:
        try:
            title = row.find_element(By.CSS_SELECTOR, ".film-title").text.strip()
            nominations = row.find_element(By.CSS_SELECTOR, ".film-nominations").text
            awards = row.find_element(By.CSS_SELECTOR, ".film-awards").text

            # Store structured data instead of printing immediately
            films.append({
                "title": title,
                "nominations": int(nominations),
                "awards": int(awards)
            })

        except Exception as e:
            # Skip rows that fail to parse (helps avoid crashes if the page structure changes)
            print(f"Skipping row due to parsing error: {e}")

    # Print results in a readable format
    for film in films:
        print(
            f"- {film['title']}: "
            f"{film['nominations']} nominations, "
            f"{film['awards']} awards"
        )

except TimeoutException:
    # Happens if the film rows never appear within the wait time
    print("Timed out waiting for the film rows to load.")

except WebDriverException as e:
    # Handles browser startup or driver issues
    print(f"WebDriver error: {e}")

finally:
    # Always close the browser to free resources
    if driver:
        driver.quit()

We iterate over each row and extract the text from the required elements. Selenium enables us to target them with CSS selectors, just like you would in browser DevTools.

Let's check the result. And the Oscar goes to...

- Spotlight: 6 nominations, 2 awards
- Mad Max: Fury Road: 10 nominations, 6 awards
- The Revenant: 12 nominations, 3 awards

... other films ...

Nice!

Parsing rendered HTML with BeautifulSoup

Another common pattern is to let Selenium handle the JavaScript, then use BeautifulSoup to parse the rendered HTML. Once the page has fully loaded, you can grab the final DOM like this:

html = driver.page_source

Then pass it to BeautifulSoup:

# Install with:
# pip install beautifulsoup4
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")

rows = soup.select("tr.film")

for row in rows:
    title = row.select_one(".film-title").get_text(strip=True)
    nominations = row.select_one(".film-nominations").get_text(strip=True)
    awards = row.select_one(".film-awards").get_text(strip=True)

    print(title, nominations, awards)

This pattern is very common in Python web scraping JavaScript workflows. In other words:

  • Selenium renders the page
  • BeautifulSoup extracts the data

If you want more examples and tips around BeautifulSoup selectors, you can browse this collection: BeautifulSoup web scraping questions.

Small pitfalls to watch for

JavaScript pages don't always behave like "normal" HTML pages. If your scraper returns empty values or misses elements that you clearly see in the browser, the issue is usually related to dynamic rendering or page timing.

Here are a few common things that trip scrapers up:

  • Elements appear only after interaction. Some content loads only after scrolling, clicking a button, or switching tabs.
  • Hidden template elements. Some pages keep HTML templates in the DOM that stay hidden (display: none). JavaScript clones these templates to build the real content, so your selector might match the template instead of the actual element.
  • Loading placeholders. Many sites show skeleton loaders before the real content appears. If your scraper runs too early, it may capture those placeholders instead of the final data.
  • Elements get replaced dynamically. Some frameworks rebuild parts of the DOM. An element you located earlier might disappear and be replaced by a new one.
  • Hidden honeypot elements. Occasionally pages include hidden links or fields meant to trap bots. Real users never interact with them, but careless scrapers might.

When something looks wrong, inspect the page again in DevTools and make sure your selectors match the final rendered elements, not hidden templates or temporary placeholders.

It's also useful to check what Selenium actually sees. Printing driver.page_source or inspecting elements through Selenium can quickly show whether the page finished rendering before your scraper tried to extract the data.

Scraping JavaScript pages with Playwright in Python

Selenium is a popular choice for browser automation, but it’s not the only option. Another modern tool many developers use is Playwright. Playwright was originally developed by Microsoft and is designed for reliable browser automation. Like Selenium, it launches a real browser, loads the page, runs the JavaScript, and lets you extract the rendered content.

One advantage of Playwright is that it handles many modern browser features smoothly out of the box, including automatic waiting for elements and better support for dynamic pages.

Installing Playwright

First install the Python package:

pip install playwright

Then install the browser binaries that Playwright uses:

playwright install

This command downloads the supported browsers (Chromium, Firefox, and WebKit) so Playwright can run them locally.

Example: Scraping a JavaScript page with Playwright

The workflow is very similar to Selenium: open the page, wait for the dynamic content to appear, then extract the data. Here’s an example using the same demo site that loads Oscar nominees dynamically:

from playwright.sync_api import sync_playwright, TimeoutError as PlaywrightTimeoutError

url = "https://www.scrapethissite.com/pages/ajax-javascript/#2015"

try:
    # Start Playwright
    with sync_playwright() as p:

        # Launch a headless Chromium browser
        # headless=True means no visible browser window is opened
        browser = p.chromium.launch(headless=True)

        # Create a new browser tab (page)
        page = browser.new_page()

        # Navigate to the target page
        page.goto(url)

        # Wait until the film rows appear in the DOM
        # This ensures the JavaScript content finished loading
        page.wait_for_selector("tr.film", timeout=10000)

        # Select all film rows
        rows = page.query_selector_all("tr.film")

        films = []

        # Loop through each row and extract the data
        for row in rows:
            try:
                title = row.query_selector(".film-title").inner_text().strip()
                nominations = int(row.query_selector(".film-nominations").inner_text())
                awards = int(row.query_selector(".film-awards").inner_text())

                films.append({
                    "title": title,
                    "nominations": nominations,
                    "awards": awards
                })

            except Exception as e:
                # Skip rows that fail to parse
                print(f"Skipping row due to parsing error: {e}")

        # Print the results in a readable format
        for film in films:
            print(
                f"- {film['title']}: "
                f"{film['nominations']} nominations, "
                f"{film['awards']} awards"
            )

        # Close the browser to free resources
        browser.close()

except PlaywrightTimeoutError:
    print("Timed out waiting for the film rows to load.")

except Exception as e:
    print(f"Unexpected error: {e}")

This script launches a headless Chromium browser, opens the page, waits until the table rows appear, and then extracts the film data. The overall idea is the same as with Selenium: let the browser run the JavaScript first, then scrape the final DOM once the content is available.

Scale your Python JavaScript scraping safely

Getting a Selenium/Playwright script to work is one thing. Running it at scale without getting blocked is a different challenge.

JS-powered sites usually monitor traffic more closely. They often apply stricter rate limits, bot detection, and behavior analysis. On top of that, headless browsers consume much more CPU and memory than simple HTTP requests, so scaling too aggressively can break both your scraper and the target site.

If you plan to scrape many pages or run jobs regularly, a few basic practices can save you a lot of trouble.

Respect robots.txt and the site's limits

Before scraping a site heavily, it's worth checking its robots.txt file. It tells crawlers which parts of the site should or shouldn't be accessed automatically. It isn't a strict technical barrier, but it gives a good idea of what the site expects from automated traffic.

You can usually find it at:

https://example.com/robots.txt

Also pay attention to how fast your scraper sends requests. Even if your code works perfectly, firing dozens of requests per second will eventually trigger rate limits or blocking.

A good rule of thumb: start slow and increase the speed only if needed.
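Python's standard library can read robots.txt for you via urllib.robotparser. The snippet below parses a made-up robots.txt from a string so it needs no network; in a real scraper you'd point it at the site's actual file with set_url() and read():

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt, parsed from a string so the example runs offline.
# For a real site: rp.set_url("https://example.com/robots.txt"); rp.read()
robots_txt = """
User-agent: *
Disallow: /admin/
Crawl-delay: 5
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("my-scraper", "https://example.com/products"))     # True
print(rp.can_fetch("my-scraper", "https://example.com/admin/users"))  # False
print("crawl delay:", rp.crawl_delay("my-scraper"))                   # 5
```

The Crawl-delay value, when present, is a direct hint about the request rate the site considers acceptable.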

Add delays and avoid aggressive scraping

One of the simplest ways to reduce blocking risk is adding delays between requests. This makes your scraper behave more like a real user and reduces pressure on the server.

For example:

import time

time.sleep(2)

You can also randomize delays so the pattern looks less robotic:

import random
import time

time.sleep(random.uniform(1, 3))

Handle CAPTCHAs and anti-bot systems

Sooner or later you may hit CAPTCHAs or other bot protection systems. This is common on modern sites, especially JavaScript-heavy ones. Typical warning signs include:

  • requests suddenly returning empty or partial pages
  • repeated login prompts
  • Cloudflare or other anti-bot verification screens

When this happens, simply retrying requests usually won't help. It often means the site has flagged your traffic. In practice you may need to slow the scraper down, rotate IP addresses, or adjust how your crawler behaves.
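When you do slow down, retrying on a fixed schedule often isn't enough either. A common pattern is exponential backoff with jitter: wait longer after each failure, plus a random offset so retries don't synchronize. A minimal sketch (delays are only computed here, not slept, so you can inspect the schedule):

```python
import random

def backoff_delays(retries: int, base: float = 1.0, cap: float = 60.0):
    """Yield exponential backoff delays: base * 2^attempt, capped, plus jitter."""
    for attempt in range(retries):
        delay = min(cap, base * (2 ** attempt))
        yield delay + random.uniform(0, delay * 0.1)  # up to 10% jitter

# In a real scraper you would time.sleep(d) before each retry attempt
for i, d in enumerate(backoff_delays(5)):
    print(f"retry {i + 1}: wait ~{d:.1f}s")
```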

Rotate IPs if you scrape large volumes

If many requests come from the same IP address, the site may eventually throttle or block them. A common solution is using proxies or rotating IPs so each request appears to come from a different location. This spreads the traffic across multiple addresses instead of concentrating it on one.

This becomes especially useful when scraping large datasets or crawling many pages in parallel.
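The rotation itself can be as simple as cycling through a pool. The proxy addresses below are placeholders; in a real scraper you'd pass the chosen proxy to your HTTP client (for example via requests' proxies= argument):

```python
from itertools import cycle

# Placeholder proxy addresses -- substitute your real proxy pool here
PROXIES = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
    "http://proxy3.example.com:8080",
]

proxy_pool = cycle(PROXIES)

urls = [f"https://example.com/page/{n}" for n in range(1, 6)]

for url in urls:
    proxy = next(proxy_pool)  # round-robin: each request uses the next proxy
    print(f"{url} via {proxy}")
    # e.g. requests.get(url, proxies={"http": proxy, "https": proxy})
```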

Cache results whenever possible

If your scraper runs regularly, caching can save both bandwidth and time. Instead of downloading the same pages repeatedly, store responses locally and reuse them when possible. This is especially helpful during testing or debugging.

Benefits include:

  • fewer requests sent to the site
  • faster scraping runs
  • lower risk of triggering rate limits

It also lets you replay saved responses instead of hitting the site every time.
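A cache doesn't have to be fancy. The sketch below wraps any fetch function with a dict-backed cache; the fake fetcher is a stand-in so the example runs offline, and in a real scraper you'd swap it for an actual HTTP call:

```python
def make_cached_fetcher(fetch):
    """Wrap a fetch(url) function so repeated URLs are served from memory."""
    cache = {}

    def cached_fetch(url):
        if url not in cache:
            cache[url] = fetch(url)  # only hit the "network" on a cache miss
        return cache[url]

    return cached_fetch

# Stand-in for a real HTTP call (e.g. requests.get(url).text)
calls = []
def fake_fetch(url):
    calls.append(url)
    return f"<html>content of {url}</html>"

get = make_cached_fetcher(fake_fetch)

get("https://example.com/a")
get("https://example.com/a")  # served from cache, no second fetch
get("https://example.com/b")

print(f"{len(calls)} real fetches for 3 requests")
```

For persistence across runs, the same wrapper can write responses to disk instead of a dict; for simple in-process memoization, functools.lru_cache works too.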

Consider using a managed scraping API

Running browsers, managing proxies, and dealing with bot protection can quickly turn into infrastructure work. One alternative is using a managed web scraping API that handles these problems for you. These services load pages on their own servers, run JavaScript, rotate IPs, and return the rendered HTML or extracted data.

For example, tools like ScrapingBee can render JavaScript remotely and send the final HTML back to your Python script. Your scraper simply makes an API request and receives the rendered page. This approach is especially useful when scraping JS-based sites at scale.

Start scraping JavaScript pages with Python today

By now you've seen the main ways developers deal with Python web scraping JavaScript pages.

  • Sometimes the easiest solution is calling the site's hidden API directly. If the page loads data through XHR or fetch requests, you can often skip the HTML entirely and request the endpoint with Python.
  • In other cases the content only appears after JavaScript runs. Then you need a browser environment that can execute the scripts. Tools like Selenium or Playwright launch a real browser, let the page render, and allow your scraper to extract the final content from the DOM.
  • As scraping projects grow, infrastructure often becomes the bigger challenge. Running browsers, managing proxies, handling rate limits, and dealing with anti-bot protection all take time and resources. One way to simplify this is using a managed web scraping API. Instead of maintaining browsers and proxy pools yourself, you send a request to the API and it loads the page on its own servers, executes the JavaScript, and returns the resulting HTML or JSON.

The key point is that you don't need a perfect setup to begin. Start small: pick a dynamic page, inspect the network requests, and build a simple scraper. Even a short script that extracts a table or product list is enough to understand the workflow.

Once your project grows, you can move to a more robust setup. For example, instead of running and scaling headless browsers yourself, tools like ScrapingBee can render the page remotely and return the fully rendered HTML directly to your Python code.

That way your scraper stays simple while still working with modern JavaScript-powered websites.

If you want to try this approach, grab an API key and test it on your first dynamic page. It only takes a few lines of Python to render a page and get the data back, without managing browsers or infrastructure yourself.

Example: Scraping a JavaScript page with ScrapingBee

So, let's see this in action. First install the SDK and BeautifulSoup:

pip install scrapingbee beautifulsoup4

Create a free ScrapingBee account and copy your API key from the dashboard. The free trial includes 1,000 credits and does not require a credit card, so you can get started in a minute. Then store it in an environment variable:

export SCRAPINGBEE_API_KEY="YOUR_API_KEY"

Alternatively, store it in a .env file.

Now let's request the same Oscar demo page used earlier. ScrapingBee will render the JavaScript and return the final HTML:

import os
from scrapingbee import ScrapingBeeClient
from bs4 import BeautifulSoup

client = ScrapingBeeClient(api_key=os.getenv("SCRAPINGBEE_API_KEY"))

response = client.get(
    "https://www.scrapethissite.com/pages/ajax-javascript/#2015",
    params={
        "render_js": True,     # run JavaScript before returning HTML
        "wait_for": "tr.film", # wait until the Oscar rows appear
        "wait": 1500           # small delay (milliseconds)
        # ScrapingBee supports many additional options depending on your needs. For example:
        #
        # premium_proxy=True      → stronger proxies for difficult sites
        # country_code="us"       → geo-target requests
        # js_scenario={}          → simulate clicks or scrolling
        # screenshot=True         → capture a screenshot of the page
    }
)

soup = BeautifulSoup(response.content, "html.parser")
rows = soup.select("tr.film")

films = []

for row in rows:
    title = row.select_one(".film-title").get_text(strip=True)
    nominations = row.select_one(".film-nominations").get_text(strip=True)
    awards = row.select_one(".film-awards").get_text(strip=True)

    films.append({
        "title": title,
        "nominations": int(nominations),
        "awards": int(awards),
    })

for film in films:
    print(
        f"- {film['title']}: "
        f"{film['nominations']} nominations, "
        f"{film['awards']} awards"
    )

That's it! Your script sends one request to the API, ScrapingBee handles the page in a browser environment, runs the JavaScript, and returns the resulting HTML.

Frequently asked questions (FAQs)

Is Python or JavaScript better for scraping JavaScript-heavy websites?

Both languages work well for scraping JS-heavy websites. Python is usually easier to start with thanks to libraries like Requests, BeautifulSoup, Selenium, and Playwright. JavaScript integrates naturally with browser environments, but Python remains the most widely used option for data scraping workflows.

See: Which is better for web scraping: Python or JavaScript

Can I extract dynamic content using Selenium with Python?

Yes. Selenium launches a real browser, loads the page, and runs JavaScript just like a normal user would. Once the page finishes rendering, you can extract data from the DOM using Selenium selectors or parse the HTML with BeautifulSoup.

See: How to extract data from a website using Selenium Python

How can I scrape JavaScript-heavy Google results with Python?

Google search results rely heavily on JavaScript and strong bot protection. A common approach is using a scraping API that handles rendering, IP rotation, and anti-bot measures for you. Your Python code simply requests the results instead of managing browsers and proxies.

See: Google scraping features

What if my Python scraper still does not see the same data I see in the browser?

This usually means the content loads after the initial HTML response. Check the Network tab in DevTools for XHR or fetch requests, and confirm whether JavaScript is rendering the page. If needed, switch to a headless browser or a rendering API so your scraper can access the fully loaded content.

See: Scraper does not see the data I see

Ilya Krukowski

Ilya is an IT tutor and author, web developer, and ex-Microsoft/Cisco specialist. His primary programming languages are Ruby, JavaScript, Python, and Elixir. He enjoys coding, teaching people and learning new things. In his free time he writes educational posts, participates in OpenSource projects, tweets, goes in for sports and plays music.