Documentation - JavaScript Scenario

Interact with the webpage you want to scrape.

Basic usage

If you want to interact with a page before we return its HTML, you can add a JavaScript scenario to your API call.

For example, if you wish to click on a button, you will need to use this scenario.

{
    "instructions": [
       {"click": "#buttonId"}
    ]
}

Our scraper will load the webpage, click on the button #buttonId, and then return the HTML of the page.

Important: JavaScript scenarios are JSON formatted, and in order to pass them in a GET request, you need to stringify them.
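
To make the stringify step concrete, here is a minimal sketch that uses only Python's standard library and sends no request. It serializes the scenario with json.dumps and then URL-encodes it together with the other query parameters. The endpoint and parameter names are the ones used in the examples on this page; the variable names are illustrative only.

```python
import json
from urllib.parse import urlencode

# The scenario as a regular Python dictionary.
scenario = {"instructions": [{"click": "#buttonId"}]}

# Step 1: stringify the scenario into JSON text.
scenario_json = json.dumps(scenario)

# Step 2: URL-encode every query parameter, including the JSON string,
# so that characters such as '{', '"' and '#' survive the GET request.
query = urlencode({
    "api_key": "YOUR-API-KEY",
    "url": "https://www.scrapingbee.com/blog",
    "js_scenario": scenario_json,
})
request_url = "https://app.scrapingbee.com/api/v1/?" + query

print(request_url)
```

Decoding the js_scenario parameter of the resulting URL and parsing it as JSON gives back the original dictionary, which is a quick way to check that nothing was lost along the way.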

Here is how to do it in your favorite language.

# Install the Python ScrapingBee library:
# pip install scrapingbee

from scrapingbee import ScrapingBeeClient

client = ScrapingBeeClient(api_key='YOUR-API-KEY')
response = client.get(
    'https://www.scrapingbee.com/blog',
    params={
        'js_scenario': {"instructions": [{ "click": "#buttonId" }]},
    },
)
print('Response HTTP Status Code: ', response.status_code)
print('Response HTTP Response Body: ', response.content)

// request Axios
const axios = require('axios');

axios.get('https://app.scrapingbee.com/api/v1', {
    params: {
        'api_key': 'YOUR-API-KEY',
        'url': 'https://www.scrapingbee.com/blog',
        'js_scenario': '{"instructions": [{ "click": "#buttonId" }]}',
    }
}).then(function (response) {
    // handle success
    console.log(response);
})

require 'net/http'
require 'net/https'
require 'uri'

# Classic (GET)
def send_request
    # URI::encode was removed in Ruby 3.0; encode_www_form_component escapes the JSON for use in a query string
    js_scenario = URI.encode_www_form_component('{"instructions": [{ "click": "#buttonId" }]}')
    uri = URI('https://app.scrapingbee.com/api/v1/?api_key=YOUR-API-KEY&url=https://www.scrapingbee.com/blog&js_scenario=' + js_scenario)

    # Create client
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = true
    http.verify_mode = OpenSSL::SSL::VERIFY_PEER

    # Create Request
    req =  Net::HTTP::Get.new(uri)

    # Fetch Request
    res = http.request(req)
    puts "Response HTTP Status Code: #{ res.code }"
    puts "Response HTTP Response Body: #{ res.body }"
rescue StandardError => e
    puts "HTTP Request failed (#{ e.message })"
end

send_request()

<?php

// get cURL resource
$ch = curl_init();

// stringify and URL-encode the scenario, then set the url
$js_scenario = urlencode('{"instructions": [{ "click": "#buttonId" }]}');

curl_setopt($ch, CURLOPT_URL, 'https://app.scrapingbee.com/api/v1/?api_key=YOUR-API-KEY&url=https://www.scrapingbee.com/blog&js_scenario=' . $js_scenario);

// set method
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');

// return the transfer as a string
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);



// send the request and save response to $response
$response = curl_exec($ch);

// stop if fails
if ($response === false) {
  die('Error: "' . curl_error($ch) . '" - Code: ' . curl_errno($ch));
}

echo 'HTTP Status Code: ' . curl_getinfo($ch, CURLINFO_HTTP_CODE) . PHP_EOL;
echo 'Response Body: ' . $response . PHP_EOL;

// close curl resource to free up system resources
curl_close($ch);

?>

package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
)

func sendClassic() {
	// Create client
	client := &http.Client{}

	// Stringify and URL-encode the scenario
	jsScenario := url.QueryEscape(`{"instructions": [{ "click": "#buttonId" }]}`)

	// Create request
	req, err := http.NewRequest("GET", "https://app.scrapingbee.com/api/v1/?api_key=YOUR-API-KEY&url=https://www.scrapingbee.com/blog&js_scenario="+jsScenario, nil)
	if err != nil {
		fmt.Println("Failure : ", err)
		return
	}

	// Fetch Request
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("Failure : ", err)
		return
	}
	// Close the body once we are done reading it
	defer resp.Body.Close()

	// Read Response Body
	respBody, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("Failure : ", err)
		return
	}

	// Display Results
	fmt.Println("response Status : ", resp.Status)
	fmt.Println("response Headers : ", resp.Header)
	fmt.Println("response Body : ", string(respBody))
}

func main() {
    sendClassic()
}

You can add multiple instructions to a scenario; they will be executed one by one on our end.

Below is a quick overview of all the different instructions you can use.

{"evaluate": "console.log('foo')"} # Run custom JavaScript
{"click": "#button_id"} # Click on an element
{"wait": 1000} # Wait for a fixed duration, in ms
{"wait_for": "#slow_div"} # Wait for an element matching a CSS selector to appear
{"scroll_x": 1000} # Scroll horizontally, in px
{"scroll_y": 1000} # Scroll vertically, in px
{"fill": ["#input_1", "value_1"]} # Fill an input with a value
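
As a rough illustration of the value each instruction expects, the sketch below checks an instruction on the client side before it is sent. The type table is inferred from the examples on this page, and the rule that an instruction holds exactly one key is an assumption; the API itself may accept more than this helper does. The name check_instruction is hypothetical and not part of any library.

```python
# Expected value type for each instruction, inferred from the examples on
# this page (an assumption, not an official schema).
EXPECTED_TYPES = {
    "evaluate": str,
    "click": str,
    "wait": int,
    "wait_for": str,
    "scroll_x": int,
    "scroll_y": int,
    "fill": list,
}


def check_instruction(instruction):
    """Return a description of the problem, or None if the instruction looks fine."""
    if not isinstance(instruction, dict) or len(instruction) != 1:
        return "an instruction must be an object with exactly one key"
    name, value = next(iter(instruction.items()))
    if name not in EXPECTED_TYPES:
        return f"unknown instruction: {name!r}"
    expected = EXPECTED_TYPES[name]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, expected):
        return f"{name!r} expects a value of type {expected.__name__}"
    if name == "fill" and not (
        len(value) == 2 and all(isinstance(item, str) for item in value)
    ):
        return "'fill' expects a [selector, value] pair of strings"
    return None


print(check_instruction({"click": "#button_id"}))
print(check_instruction({"wait": "1000"}))
```

Running a check like this locally can catch a misspelled instruction name or a quoted number before a request is made.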

You can use them in any order you want, and you can use the same one multiple times in one scenario.

Here is an example of a scenario that waits for a button to appear, clicks on it, scrolls, waits a bit, and scrolls again.

{
    "instructions": [
        {"wait_for": "#slow_button"},
        {"click": "#slow_button"},
        {"scroll_x": 1000},
        {"wait": 1000},
        {"scroll_x": 1000},
        {"wait": 1000}
    ]
}
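
One way to avoid hand-written JSON mistakes in longer scenarios is to build the scenario as a Python structure and let json.dumps produce the text. The sketch below assembles the same sequence of steps; build_scenario is a hypothetical helper, not part of the ScrapingBee library.

```python
import json


def build_scenario(*instructions):
    """Wrap the given instructions, in call order, in a scenario object."""
    return {"instructions": list(instructions)}


scenario = build_scenario(
    {"wait_for": "#slow_button"},
    {"click": "#slow_button"},
    {"scroll_x": 1000},
    {"wait": 1000},
    {"scroll_x": 1000},
    {"wait": 1000},
)

# json.dumps always emits valid JSON and keeps the list order intact.
payload = json.dumps(scenario)
print(payload)
```

Because the instructions live in a list, their order in the serialized payload matches the order in which they were passed to the helper.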

Clicking on a button: {"click": "<CSS selector>"}

To click on a button, use this instruction with the CSS selector of the button you want to click on.

If you want to click on the button whose id is secretButton, you need to use this JavaScript scenario:

{
    "instructions": [
        {"click": "#secretButton"}
    ]
}

Wait for a fixed amount of time: {"wait": "<duration in ms>"}

To wait for a fixed amount of time, use this instruction with the duration, in ms, you want to wait for.

If you want to wait for 2 seconds, you need to use this JavaScript scenario:

{
    "instructions": [
        {"wait": 2000}
    ]
}

Wait for an element to appear: {"wait_for": "<CSS selector>"}

To wait for a particular element to appear, use this instruction with the CSS selector of the element you want to wait for.

If you want to wait for the element whose class is slow_div to appear before getting some results, you need to use this JavaScript scenario:

{
    "instructions": [
        {"wait_for": ".slow_div"}
    ]
}

Scroll horizontally: {"scroll_x": "<number of pixels>"}

To scroll horizontally on a page, use this instruction with the number of pixels you want to scroll.

If you want to scroll 1000px horizontally, you need to use this JavaScript scenario:

{
    "instructions": [
        {"scroll_x": 1000}
    ]
}

Scroll vertically: {"scroll_y": "<number of pixels>"}

To scroll vertically on a page, use this instruction with the number of pixels you want to scroll.

If you want to scroll down 1000px, you need to use this JavaScript scenario:

{
    "instructions": [
        {"scroll_y": 1000}
    ]
}
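
For pages that load more content as you scroll down, it can help to scroll in several smaller steps with a pause after each one. The sketch below generates such a scenario from scroll_y and wait instructions. It assumes each scroll_y moves the page relative to its current position, which this page does not state explicitly; scroll_down_in_steps and its default values are hypothetical.

```python
def scroll_down_in_steps(total_px, step_px=1000, pause_ms=500):
    """Build a scenario that scrolls down total_px in chunks of step_px,
    pausing pause_ms after each chunk."""
    if total_px < 0 or step_px <= 0 or pause_ms < 0:
        raise ValueError("total_px and pause_ms must be >= 0 and step_px > 0")
    instructions = []
    remaining = total_px
    while remaining > 0:
        distance = min(step_px, remaining)
        instructions.append({"scroll_y": distance})
        instructions.append({"wait": pause_ms})
        remaining -= distance
    return {"instructions": instructions}


print(scroll_down_in_steps(2500))
```

Keep the number of steps small: every pause counts toward the total time the scenario takes.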

Filling form input: {"fill": ["<CSS selector>", "<value>"]}

To fill an input, use this instruction with the CSS selector of the input you want to fill and the value you want to fill it with.

If you want to fill an input whose CSS selector is #input_1 with the value value_1, you need to use this JavaScript scenario:

{
    "instructions": [
        {"fill": ["#input_1", "value_1"]}
    ]
}
    ]
}
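
A form usually has several inputs followed by a submit button, so fill is often combined with click. The sketch below builds such a scenario from a mapping of selectors to values. The selectors (#email, #password, #submit) and the helper name fill_form are made up for illustration.

```python
import json


def fill_form(fields, submit_selector):
    """Build a scenario that fills each input, then clicks the submit button.

    fields maps a CSS selector to the value to type into that input.
    """
    instructions = [
        {"fill": [selector, value]} for selector, value in fields.items()
    ]
    instructions.append({"click": submit_selector})
    return {"instructions": instructions}


scenario = fill_form(
    {"#email": "user@example.com", "#password": "secret"},
    submit_selector="#submit",
)
print(json.dumps(scenario, indent=4))
```

Python dictionaries preserve insertion order, so the inputs are filled in the order they appear in the mapping.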

Executing custom JavaScript: {"evaluate": "<JavaScript code>"}

If you need more flexibility, use this instruction to run custom JavaScript on the page.

If you want to run the code console.log('foo') on the webpage, you need to use this JavaScript scenario:

{
    "instructions": [
        {"evaluate": "console.log('foo')"}
    ]
}
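
Custom JavaScript often contains double quotes, which would break hand-written JSON. A minimal sketch, assuming the snippet is held in a Python string: json.dumps escapes the quotes so the scenario stays valid JSON. The #load-more selector is made up for illustration.

```python
import json

# A JavaScript snippet that contains both double and single quotes.
js_code = 'document.querySelector("#load-more").click(); console.log(\'done\')'

scenario = {"instructions": [{"evaluate": js_code}]}

# json.dumps escapes the inner double quotes as \" in the output.
payload = json.dumps(scenario)
print(payload)
```

Parsing the payload back with json.loads returns the original snippet unchanged.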

Timeout

Your whole scenario should not take more than 40 seconds to complete; otherwise, the API will time out.
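
A quick sanity check before sending a scenario is to add up its fixed wait durations and compare the total with the 40-second limit. This is only a lower bound, since clicks, scrolling, wait_for, and custom JavaScript also take time that cannot be known in advance. The helper names below are hypothetical.

```python
SCENARIO_TIMEOUT_MS = 40_000  # the 40-second limit stated above


def fixed_wait_ms(scenario):
    """Sum the durations of all "wait" instructions, in milliseconds."""
    return sum(
        step["wait"] for step in scenario["instructions"] if "wait" in step
    )


def exceeds_timeout(scenario):
    """True if the fixed waits alone already take longer than the limit."""
    return fixed_wait_ms(scenario) > SCENARIO_TIMEOUT_MS


scenario = {
    "instructions": [
        {"wait_for": "#slow_button"},
        {"click": "#slow_button"},
        {"wait": 1000},
        {"scroll_y": 1000},
        {"wait": 1000},
    ]
}
print(fixed_wait_ms(scenario), exceeds_timeout(scenario))
```

A scenario that passes this check can still time out, so it is worth leaving a comfortable margin below the limit.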