Data extraction in C#

One of the most important features of ScrapingBee is the ability to extract exactly the data you need, without post-processing the response content with external libraries.

We can use this feature by specifying an additional parameter named extract_rules. We give a label to each element we want to extract along with its CSS selector, and ScrapingBee will do the rest!

Let’s say that we want to extract the title and the subtitle of the data extraction documentation page. Their CSS selectors are h1 and span.text-20 respectively. To check that they are the correct ones, you can run the JavaScript function document.querySelector("CSS_SELECTOR") in the console of that page’s developer tools.
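Writing the extract_rules JSON by hand inside a C# string means escaping every quote. As a minimal sketch, assuming System.Text.Json is available (.NET Core 3.0 or later), the same rules can be built from a dictionary of label and selector pairs. The class and method names here (ExtractRulesExample, BuildExtractRules) are hypothetical helpers, not part of any ScrapingBee library:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class ExtractRulesExample
{
    // Hypothetical helper: serializes label -> CSS selector pairs into a JSON object string.
    public static string BuildExtractRules(Dictionary<string, string> rules)
    {
        return JsonSerializer.Serialize(rules);
    }

    public static void Main()
    {
        var rules = new Dictionary<string, string>
        {
            ["title"] = "h1",               // label "title" -> selector h1
            ["subtitle"] = "span.text-20"   // label "subtitle" -> selector span.text-20
        };

        // Prints a JSON object with one key per label.
        Console.WriteLine(BuildExtractRules(rules));
    }
}
```

The resulting string can be assigned to query["extract_rules"] in place of a hand-written literal.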

The full code will look like this:

using System;
using System.IO;
using System.Net;
using System.Web;
using System.Collections.Generic;

namespace test {
   class test{

      private static string BASE_URL = @"https://app.scrapingbee.com/api/v1/?";
      private static string API_KEY = "YOUR-API-KEY";

      public static string Get(string uri)
      {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
            request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;

            using(HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using(Stream stream = response.GetResponseStream())
            using(StreamReader reader = new StreamReader(stream))
            {
                return reader.ReadToEnd();
            }
      }

      public static void Main(string[] args) {

        var query = HttpUtility.ParseQueryString(string.Empty);
        query["api_key"] = API_KEY;
        query["url"] = @"https://scrapingbee.com/documentation/data-extraction";
        query["extract_rules"] = "{\"title\": \"h1\", \"subtitle\": \"span.text-20\"}"; // extract_rules as valid JSON (double-quoted keys and values)
        string queryString = query.ToString(); // Transforming the URL queries to string

        string output = Get(BASE_URL+queryString); // Make the request
        Console.WriteLine(output);

      }
   }
}

As you can see, the result is: {"title": "Documentation - Data Extraction", "subtitle": "Extract data with CSS selector"}
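Since the response is a JSON object keyed by the labels from extract_rules, individual values can be read without any HTML parsing. The following is a sketch only, assuming System.Text.Json (.NET Core 3.0 or later) and a response shaped like the one above. ExtractResultExample and GetField are hypothetical names introduced for illustration:

```csharp
using System;
using System.Text.Json;

public static class ExtractResultExample
{
    // Hypothetical helper: returns the string value stored under `name`,
    // or null when the field is missing or is not a string.
    public static string GetField(string json, string name)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement root = doc.RootElement;
            JsonElement value;
            if (root.ValueKind == JsonValueKind.Object
                && root.TryGetProperty(name, out value)
                && value.ValueKind == JsonValueKind.String)
            {
                return value.GetString();
            }
            return null;
        }
    }

    public static void Main()
    {
        // Sample response copied from the result shown above.
        string output = "{\"title\": \"Documentation - Data Extraction\", \"subtitle\": \"Extract data with CSS selector\"}";

        Console.WriteLine(GetField(output, "title"));    // Documentation - Data Extraction
        Console.WriteLine(GetField(output, "subtitle")); // Extract data with CSS selector
    }
}
```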

You can find more about this feature in our documentation: Data Extraction. To learn more about CSS selectors, see the W3Schools - CSS Selectors page.
