Documentation - JavaScript Scenario

Interact with the webpage you want to scrape.

You can also discover this feature using our Postman collection covering every ScrapingBee feature.


💡 Important:
This page explains how to use a specific feature of our main web scraping API!
If you are not yet familiar with ScrapingBee web scraping API, you can read the documentation here.

Basic usage

If you want to interact with a page before we return its HTML, you can add a JavaScript scenario to your API call.

For example, if you wish to click on a button, you will need to use this scenario:

{
    "instructions": [
       {"click": "#buttonId"}
    ]
}

Our scraper will then load the webpage, click on the button #buttonId, and return the HTML of the page.

Important: JavaScript scenarios are JSON formatted. To pass one in a GET request, you need to stringify it.
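
As a rough illustration of the stringify step, the sketch below uses only the Python standard library to serialize a scenario and build the GET URL. The endpoint and parameter names are the ones used in the examples on this page; `build_request_url` is a hypothetical helper written for this example, not part of any ScrapingBee library.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

API_ENDPOINT = "https://app.scrapingbee.com/api/v1"  # endpoint used in the examples on this page


def build_request_url(api_key, target_url, scenario):
    """Return a GET URL with the scenario stringified and URL-encoded.

    Hypothetical helper for illustration only.
    """
    params = {
        "api_key": api_key,
        "url": target_url,
        # json.dumps turns the dict into a JSON string
        "js_scenario": json.dumps(scenario),
    }
    # urlencode percent-encodes every value, including the JSON string
    return API_ENDPOINT + "?" + urlencode(params)


scenario = {"instructions": [{"click": "#buttonId"}]}
request_url = build_request_url("YOUR-API-KEY", "https://www.scrapingbee.com/blog", scenario)
print(request_url)

# Round-trip check: decoding the query string gives back the original scenario
decoded = json.loads(parse_qs(urlsplit(request_url).query)["js_scenario"][0])
print(decoded == scenario)
```

The sketch only builds the URL and does not send a request, so it can be run without an API key.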

# Install the Python ScrapingBee library:
# pip install scrapingbee

from scrapingbee import ScrapingBeeClient

client = ScrapingBeeClient(api_key='YOUR-API-KEY')
response = client.get(
    'https://www.scrapingbee.com/blog',
    params={
        'js_scenario': {"instructions": [{ "click": "#buttonId" }]},
    },
)
print('Response HTTP Status Code: ', response.status_code)
print('Response HTTP Response Body: ', response.content)
// request Axios
const axios = require('axios');

axios.get('https://app.scrapingbee.com/api/v1', {
    params: {
        'api_key': 'YOUR-API-KEY',
        'url': 'https://www.scrapingbee.com/blog',
        'js_scenario': '{"instructions": [{ "click": "#buttonId" }]}',
    }
}).then(function (response) {
    // handle success
    console.log(response);
})
String encoded_url = URLEncoder.encode("YOUR URL", "UTF-8");
require 'net/http'
require 'net/https'
require 'uri'

# Classic (GET )
def send_request
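    # URI::encode was removed in Ruby 3.0; encode_www_form_component percent-encodes the JSON string
    js_scenario = URI.encode_www_form_component('{"instructions": [{ "click": "#buttonId" }]}')
    uri = URI('https://app.scrapingbee.com/api/v1/?api_key=YOUR-API-KEY&url=https://www.scrapingbee.com/blog&js_scenario=' + js_scenario)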

    # Create client
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = true
    http.verify_mode = OpenSSL::SSL::VERIFY_PEER

    # Create Request
    req =  Net::HTTP::Get.new(uri)

    # Fetch Request
    res = http.request(req)
    puts "Response HTTP Status Code: #{ res.code }"
    puts "Response HTTP Response Body: #{ res.body }"
rescue StandardError => e
    puts "HTTP Request failed (#{ e.message })"
end

send_request()
<?php

// get cURL resource
$ch = curl_init();

// set url
$js_scenario = urlencode('{"instructions": [{ "click": "#buttonId" }]}');

curl_setopt($ch, CURLOPT_URL, 'https://app.scrapingbee.com/api/v1/?api_key=YOUR-API-KEY&url=https://www.scrapingbee.com/blog&js_scenario=' . $js_scenario);

// set method
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');

// return the transfer as a string
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);



// send the request and save response to $response
$response = curl_exec($ch);

// stop if fails
if (!$response) {
  die('Error: "' . curl_error($ch) . '" - Code: ' . curl_errno($ch));
}

echo 'HTTP Status Code: ' . curl_getinfo($ch, CURLINFO_HTTP_CODE) . PHP_EOL;
echo 'Response Body: ' . $response . PHP_EOL;

// close curl resource to free up system resources
curl_close($ch);

?>
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
    "net/url"
)

func sendClassic() {
	// Create client
	client := &http.Client{}

	// URL-encode the stringified scenario
	jsScenario := url.QueryEscape(`{"instructions": [{ "click": "#buttonId" }]}`)

	// Create request
	req, err := http.NewRequest("GET", "https://app.scrapingbee.com/api/v1/?api_key=YOUR-API-KEY&url=https://www.scrapingbee.com/blog&js_scenario="+jsScenario, nil)
	if err != nil {
		fmt.Println("Failure : ", err)
		return
	}

	// Fetch Request
	resp, err := client.Do(req)
	if err != nil {
		// Stop here: resp is nil when the request fails
		fmt.Println("Failure : ", err)
		return
	}
	defer resp.Body.Close()

	// Read Response Body
	respBody, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("Failure : ", err)
		return
	}

	// Display Results
	fmt.Println("response Status : ", resp.Status)
	fmt.Println("response Headers : ", resp.Header)
	fmt.Println("response Body : ", string(respBody))
}

func main() {
    sendClassic()
}

You can add multiple instructions to the scenario; they are executed one by one on our end.

Below is a quick overview of all the different instructions you can use.

{"evaluate": "console.log('foo')"} # Run custom JavaScript code
{"click": "#button_id"} # Click on an element
{"wait": 1000} # Wait for a fixed duration in ms
{"wait_for": "#slow_div"} # Wait for an element matching a CSS selector to appear
{"wait_for_and_click": "#slow_div"} # Wait for an element matching a CSS selector to appear, then click on it
{"scroll_x": 1000} # Scroll the screen in the horizontal axis, in px
{"scroll_y": 1000} # Scroll the screen in the vertical axis, in px
{"fill": ["#input_1", "value_1"]} # Fill some input

You can use them in any order, and you can use the same instruction multiple times in one scenario.

Here is an example of a scenario that waits for a button to appear, clicks on it, scrolls, waits a bit, and scrolls again.

{
    "instructions": [
        {"wait_for_and_click": "#slow_button"},
        {"scroll_x": 1000},
        {"wait": 1000},
        {"scroll_x": 1000},
        {"wait": 1000}
    ]
}
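
A scenario like the one above can also be assembled in code rather than written by hand, which avoids JSON syntax slips such as a trailing comma. The following is a minimal sketch; `build_scenario` is a hypothetical helper, and the single-key check reflects only the instruction shapes shown on this page.

```python
import json


def build_scenario(*instructions):
    """Wrap instructions in the {"instructions": [...]} envelope, keeping their order."""
    for instruction in instructions:
        # Every instruction shown on this page is an object with exactly one key
        if not isinstance(instruction, dict) or len(instruction) != 1:
            raise ValueError(f"expected a single-key dict, got {instruction!r}")
    return {"instructions": list(instructions)}


scenario = build_scenario(
    {"wait_for_and_click": "#slow_button"},
    {"scroll_x": 1000},
    {"wait": 1000},
    {"scroll_x": 1000},
    {"wait": 1000},
)

# json.dumps always emits valid JSON, ready to be passed as js_scenario
scenario_json = json.dumps(scenario)
print(scenario_json)
```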


Clicking on a button

click CSS selector

To click on a button, use this instruction with the CSS selector of the button you want to click on.

If you want to click on the button whose id is secretButton, you need to use this JavaScript scenario:

{
    "instructions": [
        {"click": "#secretButton"}
    ]
}


Wait for a fixed amount of time

wait duration in ms

To wait for a fixed amount of time, use this instruction with the duration, in ms, you want to wait for.

If you want to wait for 2 seconds, you need to use this JavaScript scenario:

{
    "instructions": [
        {"wait": 2000}
    ]
}


Wait for an element to appear

wait_for CSS selector

To wait for a particular element to appear, use this instruction with the CSS selector of the element you want to wait for.

If you want to wait for the element whose class is slow_div to appear before getting some results, you need to use this JavaScript scenario:

{
    "instructions": [
        {"wait_for": ".slow_div"}
    ]
}


Wait for an element to appear and click

wait_for_and_click CSS selector

To wait for a particular element to appear, and then click on it, use this instruction.

If you want to wait for the element whose class is slow_div to appear before clicking on it, you need to use this JavaScript scenario:

{
    "instructions": [
        {"wait_for_and_click": ".slow_div"}
    ]
}

Note: this is exactly the same as using:

{
    "instructions": [
        {"wait_for": ".slow_div"},
        {"click": ".slow_div"}
    ]
}
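
The equivalence stated above can be sketched as a small transformation that rewrites the combined instruction into its two-step form. `expand_wait_for_and_click` is a hypothetical helper written for this example; nothing here calls the API.

```python
def expand_wait_for_and_click(scenario):
    """Return a new scenario with each wait_for_and_click split into wait_for + click."""
    expanded = []
    for instruction in scenario["instructions"]:
        if "wait_for_and_click" in instruction:
            selector = instruction["wait_for_and_click"]
            expanded.append({"wait_for": selector})
            expanded.append({"click": selector})
        else:
            # Any other instruction is kept unchanged
            expanded.append(instruction)
    return {"instructions": expanded}


combined = {"instructions": [{"wait_for_and_click": ".slow_div"}]}
separate = expand_wait_for_and_click(combined)
print(separate)
```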


Scroll Horizontally

scroll_x number of pixels

To scroll horizontally on a page, use this instruction with the number of pixels you want to scroll.

If you want to scroll 1000px horizontally, you need to use this JavaScript scenario:

{
    "instructions": [
        {"scroll_x": 1000}
    ]
}


Scroll Vertically

scroll_y number of pixels

To scroll vertically on a page, use this instruction with the number of pixels you want to scroll.

If you want to scroll down 1000px, you need to use this JavaScript scenario:

{
    "instructions": [
        {"scroll_y": 1000}
    ]
}
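
Because the same instruction can be repeated within one scenario, scrolling in several steps with a pause after each one can be generated in a loop. This is a minimal sketch; `scroll_down_scenario` and its default values are assumptions chosen for illustration.

```python
def scroll_down_scenario(steps, pixels=1000, pause_ms=500):
    """Build a scenario that scrolls down `steps` times, waiting after each scroll."""
    instructions = []
    for _ in range(steps):
        instructions.append({"scroll_y": pixels})
        instructions.append({"wait": pause_ms})
    return {"instructions": instructions}


scenario = scroll_down_scenario(steps=3)
print(scenario)
```

The fixed waits add up (3 x 500 ms here), so the number of steps should be kept small enough for the whole scenario to finish in time.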


Filling form input

fill [ CSS selector, value ]

To fill an input, use this instruction with the CSS selector of the input you want to fill and the value you want to fill it with.

If you want to fill an input whose CSS selector is #input_1 with the value value_1 you need to use this JavaScript scenario:

{
    "instructions": [
        {"fill": ["#input_1", "value_1"]}
    ]
}
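
Filling several inputs and then submitting a form is a natural combination of fill and click. The sketch below builds such a scenario from a mapping of selectors to values; `form_scenario` and the selectors `#email`, `#password`, and `#submit` are hypothetical and would need to match the target page.

```python
import json


def form_scenario(fields, submit_selector):
    """Build a scenario that fills each input in order, then clicks the submit element."""
    # Each fill instruction takes a [CSS selector, value] pair
    instructions = [{"fill": [selector, value]} for selector, value in fields.items()]
    instructions.append({"click": submit_selector})
    return {"instructions": instructions}


scenario = form_scenario(
    {"#email": "user@example.com", "#password": "secret"},  # hypothetical selectors and values
    "#submit",
)
print(json.dumps(scenario, indent=4))
```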


Executing custom JavaScript

evaluate JavaScript code

If you need more flexibility, use this instruction to run custom JavaScript.

If you want to run the code console.log('foo') on the webpage, you need to use this JavaScript scenario:

{
    "instructions": [
        {"evaluate": "console.log('foo')"}
    ]
}

💡 Good to know: The results of any evaluate instruction will be added to the evaluate_results key in the JSON response if json_response=True is used. You can read more about this here
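
Reading those results on the client side could look like the sketch below. Only the evaluate_results key comes from this page; the sample body is made up for illustration, and treating the value as a list with one entry per evaluate instruction is an assumption.

```python
import json

# Hypothetical JSON response body, reduced to the one key described on this page.
# The list shape of the value is an assumption.
sample_body = '{"evaluate_results": ["Example page title"]}'


def get_evaluate_results(response_text):
    """Parse a JSON response body and return its evaluate_results, or [] if the key is absent."""
    payload = json.loads(response_text)
    return payload.get("evaluate_results", [])


results = get_evaluate_results(sample_body)
print(results)
```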

Timeout

Your whole scenario should not take more than 40 seconds to complete; otherwise the API will time out.
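
A quick client-side check can catch scenarios whose fixed waits alone already exceed the limit. This is a rough sketch: `remaining_budget_ms` is a hypothetical helper, and it only counts wait instructions, since the time taken by clicks, scrolls, wait_for, and evaluate depends on the page and cannot be known in advance.

```python
MAX_SCENARIO_MS = 40_000  # the 40-second limit stated above


def remaining_budget_ms(scenario):
    """Return the time left after summing all fixed waits, or raise if none is left."""
    # Only the exact "wait" key is counted; "wait_for" has no fixed duration
    waited = sum(i["wait"] for i in scenario["instructions"] if "wait" in i)
    if waited >= MAX_SCENARIO_MS:
        raise ValueError(f"fixed waits total {waited} ms, limit is {MAX_SCENARIO_MS} ms")
    return MAX_SCENARIO_MS - waited


scenario = {
    "instructions": [
        {"wait_for": "#slow_div"},
        {"wait": 2000},
        {"scroll_y": 1000},
        {"wait": 1000},
    ]
}
budget = remaining_budget_ms(scenario)
print(budget)
```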