How to Scrape Google Flights with Python and ScrapingBee

24 August 2025 | 11 min read

As the key source of information on the internet, Google holds a wealth of valuable public data. For many travelers, it is also the main tool for tracking flight prices and comparing departure and arrival options for trips.

Automation plays a vital role here, as everyone wants an efficient setup to compare multiple airlines and their pricing strategies to save money. Even better, collecting data with your own Google Flights scraper saves a lot of time and provides consistent access to new deals.

If you're wondering how to scrape Google Flights, this guide will set you up for success with detailed steps, from setting up Python to exporting flight data into a CSV file. You’ll learn how to collect airline names, durations, and real-time prices using ScrapingBee’s API without worrying about proxies, captchas, or JavaScript rendering.

Quick Answer (TL;DR)

For those looking to scrape flight search data, there is no more beginner-friendly solution than our robust Google Flights API. It handles proxies, JavaScript rendering, and anti-bot protection, while also taking care of parsing the scraped data. Just use our Python SDK to send GET API requests and provide CSS selectors so it can structure the information to your liking – no need for Selenium or BeautifulSoup.

Below is the full code of the Google Flights scraper we created for this tutorial. Feel free to copy it and tweak the settings. Once you get the hang of it, it should be pretty easy to scale it up with more URLs and other valuable flight data elements.

from scrapingbee import ScrapingBeeClient
import pandas as pd

client = ScrapingBeeClient(api_key='YOUR_API_KEY')
flight_urls = ["https://www.google.com/travel/flights/search?tfs=CBwQAhooEgoyMDI1LTA5LTI0agwIAxIIL20vMDE1NnFyDAgCEggvbS8wNGpwbBooEgoyMDI1LTA5LTI4agwIAhIIL20vMDRqcGxyDAgDEggvbS8wMTU2cUABSAFwAYIBCwj___________8BmAEB"]
def scrape_google_flights(flight_urls):

    extract_rules = {
        "flights": {
            "selector": "li.pIav2d",
            "type": "list",
            "output": {
                "price": "div.BVAVmf > div.YMlIz",
                "time": "div.gvkrdb.AdWm1c",
                "airline": "div.Ir0Voe > div.sSHqwe > span:not([class])"
            }
        }
    }

    js_scenario = {
        "instructions": [
            {"wait": 2000},
            {"evaluate": "window.scrollTo(0, document.body.scrollHeight);"},
            {"wait": 2000},
        ]
    }

    all_flights = []
    
    for url in flight_urls:
        response = client.get(
            url,
            params={
                'custom_google': 'True',
                "extract_rules": extract_rules,
                "js_scenario": js_scenario,
            },
            retries=2
        )

        flights = response.json().get('flights', [])
        all_flights.extend(flights)

        print(f"{url}: {response.status_code}, {len(flights)} flights extracted")

    df = pd.DataFrame(all_flights)
    df.to_csv("flights_data.csv", index=False)

# Example usage
scrape_google_flights(flight_urls)

Full Tutorial: Scraping Google Flights Step by Step

To build a Google Flights scraper from scratch, the first step is setting up your coding environment. Unlike traditional scraping setups that need multiple extra libraries or browser automation, ScrapingBee simplifies everything with one clean API.

Start by making sure you have Python version 3.6 or newer on your machine. If you've never used it before, head to the official Python site and follow the installation guide.

Note: Python gives you access to all the tools needed to send API requests and process the data you extract. By learning the ropes, you build transferable skills that will help you in future data collection processes.
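To confirm that Python is installed and recent enough, you can run a quick version check in your Terminal (on macOS and Linux, the command may be python3 instead):

python --version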

Next, log in or sign up for a ScrapingBee account. If you're new, you can follow this guide using our free 1,000-credit trial!

[Image: ScrapingBee registration page]

Once you're in, you'll see a dashboard with your account information. To use our API, you will need an API key, which you'll provide when initializing the ScrapingBee Python SDK in your script.

[Image: ScrapingBee dashboard showing your API key]

Install ScrapingBee Python Client

Python scripts are defined in files with a .py extension. However, before we learn how to scrape Google Flights, make sure to install the necessary packages using pip – Python’s built-in package manager:

  • scrapingbee – Connects your script to our HTML API, letting you send requests and extract web data with ease.

  • pandas – Helps structure and manage the extracted data, making it easy to analyze or export as a CSV.

Note: Web scrapers usually require additional libraries like "requests" to send HTTP requests and handle responses in Python, but our SDK replaces them by managing requests, rendering, and anti-bot handling through a single client.

Go to your Terminal (or Command Prompt on Windows systems) and enter the following line:

pip install scrapingbee pandas

Prepare your Scraping Environment

Now, with all packages accessible on your system, choose a folder for your project and create a Python file, for example "Google_flights.py". Here we will start defining the functions that will extract Google Flights data.

Note: Python scripts can be written in any text editor. However, we highly recommend using an IDE (Integrated Development Environment) like Visual Studio Code for syntax highlighting and live debugging tools that flag mistakes in your code before you run the interpreter.

In the first lines of code, we import the libraries installed with pip and define a "client" variable, which authenticates with our API using the provided API key.

Note: Lines starting with a hash sign (#) are treated as comments in Python, so the interpreter ignores them. Keep an eye out for them, as they explain what specific lines of code do.

# Importing libraries downloaded by pip
from scrapingbee import ScrapingBeeClient
import pandas as pd
# Initializing the ScrapingBee API client. Add your API key here
client = ScrapingBeeClient(api_key='YOUR_API_KEY')

As we continue defining the script, you will encounter configuration parameters for our Python SDK. If you need more help, check out our page containing the entire ScrapingBee Documentation.

Write and Run Your Scraper

Now we can start defining the function that will contain the entire logic of how to scrape Google Flights. Before that, include your URL or a list of URLs for your scraper to target. Give your function a name; the indented code beneath it outlines the steps of its execution:

# flight_urls variable list contains one or more URLs to scrape from
flight_urls = ["https://www.google.com/travel/flights/search?tfs=CBwQAhooEgoyMDI1LTA5LTI0agwIAxIIL20vMDE1NnFyDAgCEggvbS8wNGpwbBooEgoyMDI1LTA5LTI4agwIAhIIL20vMDRqcGxyDAgDEggvbS8wMTU2cUABSAFwAYIBCwj___________8BmAEB"]

# Start of the function definition
def scrape_google_flights(flight_urls):

Our API client accepts multiple parameters that control web scraping and parsing within the connection request. These configurations are usually presented as dictionary variables, as they allow us to assign values to specific parameter keys.

Defining extract rules for Google Flights data

To pick only the relevant information from the extracted HTML code, we must first define an "extract_rules" dictionary. It tells the parser which parts of the Google Flights page to target through the appropriate CSS selectors.

Let's take a closer look at "extract_rules" and which parts of the page it will parse:

    # The entire dictionary for parsing via CSS selectors
    extract_rules = {
        "flights": {
            # "selector" defines the blocks of HTML code from which data will be extracted – in this case, individual flight detail cards
            "selector": "li.pIav2d",
            # Treat the matched cards as a list; within each one, the selectors below pick out raw data
            "type": "list",
            "output": {
                "price": "div.BVAVmf > div.YMlIz",
                "time": "div.gvkrdb.AdWm1c",
                "airline": "div.Ir0Voe > div.sSHqwe > span:not([class])"
            }
        }
    }

Okay, this is a lot to unpack. Let's break down what these values do for our scraper:

  • "flights" – defines a key named "flights". Its value is another dictionary that describes how to extract flight details.

  • "selector": "li.pIav2d" – specifies the CSS selector (li.pIav2d) that identifies each flight card in the HTML.

  • "type": "list" – tells the parser to treat the selector results as a list, since there can be multiple flight cards.

  • "output": – starts a dictionary that defines what specific pieces of data to extract from each flight card.

  • "price": "div.BVAVmf > div.YMlIz" – extracts the flight’s price using the given CSS selector.

  • "time": "div.gvkrdb.AdWm1c" – selector accessing the estimated time of the journey.

  • "airline": "div.Ir0Voe > div.sSHqwe > span:not([class]) – extracts the airline name by targeting the span element without a class inside the given structure.

But how do you know whether these selectors will extract the relevant Google Flights data? First, let's inspect the page manually to identify CSS selectors. Open a URL in your browser, right-click a flight data card, and select "Inspect".

[Image: Inspecting a flight card to find its CSS selector]

Note: Google, like most large websites, often changes its page structure to stop existing scrapers from targeting consistent selectors. If your script stops working, check the page manually and update your script to target the new CSS elements.

Parameters for JavaScript Rendering

Our HTML API operates with a built-in headless browser, controlled by a "render_js" parameter that is enabled by default. Meanwhile, the "js_scenario" configuration gives it specific instructions on how to behave and interact with the page before extracting data.

The following dictionary variable does exactly that. Here is the section that covers browser automation:

    js_scenario = {
        "instructions": [
            {"wait": 2000},
            {"evaluate": "window.scrollTo(0, document.body.scrollHeight);"},
            {"wait": 2000},
        ]
    }

Let's go over these commands and why they are necessary if you want to know how to scrape Google Flights:

  • {"wait": 2000} – Tells the script to pause for 2000 ms (2 seconds). This is usually done to let the webpage load content.

  • {"evaluate": "window.scrollTo(0, document.body.scrollHeight);"} – Runs a JavaScript command that scrolls the browser window all the way down to the bottom of the page. This is often used to load more results (like infinite scrolling).

Configuring the GET API call

After defining the rules for JavaScript automation, we can finally write the GET API call. The first line of code defines a list where all flight data from Google Flights will be stored, while the "response" variable stores the extracted results of each connection.

    # Creating an empty list to append with Google Flights data
    all_flights = []
    # A for loop to go through multiple URLs, invoking the flight_urls list from the start
    for url in flight_urls:
        response = client.get(
            # Takes each url element from the flight_urls list
            url,
            params={
                'custom_google': 'True',
                "extract_rules": extract_rules,
                "js_scenario": js_scenario,
            },
            retries=2
        )

By wrapping the GET API call in a for loop, we can control how many URLs to target in a single run. Inside the call, we pass a dictionary of parameters:

  • 'custom_google': 'True' – A flag that enables our custom setup for scraping Google domains.

  • "extract_rules": extract_rules – Passes in our previously defined dictionary of parsing rules that tell the scraper what data to extract.

  • "js_scenario": js_scenario – assigns our js_scenario variable so the scraper can load all results before extraction.

Extracting Flight Data

Now, all that is left is to extract details from the "response" variable and extend the "all_flights" list with the extracted flight data.

        # Stores data after extracting based on provided CSS selectors
        flights = response.json().get('flights', [])
        all_flights.extend(flights)
        print(f"{url}: {response.status_code}, {len(flights)} flights extracted")

The added print command will inform you about the results of your extraction for each specific URL. With "response.status_code", you can see the HTTP status code of your connection (200 if extraction is successful, 400 for poorly configured requests), along with the number of retrieved flight data cards.
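If you want the scraper to skip failed requests instead of parsing empty responses, you could add a minimal guard inside the loop – a sketch, not part of the original script:

        # Sketch: only parse the response when the request succeeded
        if response.status_code != 200:
            print(f"Skipping {url}: HTTP {response.status_code}")
            continue
        flights = response.json().get('flights', [])
        all_flights.extend(flights)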

[Image: Terminal output showing extracted flight counts]

Parsing the HTML to Extract Data

When parsing HTML in Python, developers typically rely on additional libraries like lxml or BeautifulSoup (bs4). These tools allow you to load and extract elements by tags, attributes, or CSS selectors.

However, with our Google Flights API, these features are already included! You can parse and scrape Google Flights data and navigate dynamic web pages through CSS targeting rules in the extract_rules dictionary. This way, you reduce code complexity for web scraping and still retrieve reliable results – well-structured flight data.

That being said, if our parsing configuration options don't cover your use case, check out our blog post on Python HTML Parsers.
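For comparison, here is a rough sketch of the manual approach: fetching the raw HTML through the API and parsing it locally with BeautifulSoup, reusing the same selectors (this assumes the SDK response exposes the page HTML via response.text):

from bs4 import BeautifulSoup

# Fetch the raw HTML without extract_rules, then parse it locally
response = client.get(flight_urls[0], params={'custom_google': 'True'})
soup = BeautifulSoup(response.text, "html.parser")

flights = []
for card in soup.select("li.pIav2d"):
    # Note: select_one returns None for missing elements – guard accordingly in real code
    flights.append({
        "price": card.select_one("div.BVAVmf > div.YMlIz").get_text(strip=True),
        "time": card.select_one("div.gvkrdb.AdWm1c").get_text(strip=True),
        "airline": card.select_one("div.Ir0Voe > div.sSHqwe > span:not([class])").get_text(strip=True),
    })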

Okay, back to the code. Instead of staring at the results in your Terminal, let's use the pandas library to transform our flight deals list into a DataFrame. This way, you get a categorized table of flight prices, trip durations, and airline names – plus any other fields you add to the extract rules.

Then, using pandas' .to_csv method, we export the collected flight listings into a separate CSV file:

    df = pd.DataFrame(all_flights)
    df.to_csv("flights_data.csv", index=False)
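Optionally, you could also clean the scraped strings before the export – for example, turning prices like "$420" into numbers so the table can be sorted. A sketch, assuming prices come back as currency-prefixed strings:

    # Sketch: normalize the price column to numeric values and sort by it
    df["price_usd"] = (
        df["price"]
        .str.replace(r"[^\d.]", "", regex=True)  # strip currency symbols and commas
        .astype(float)
    )
    df = df.sort_values("price_usd")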

And we're done! Close the function definition by not indenting the next line. The last step is to invoke the function: provide a single URL, or a list of URLs based on your preferred arrival locations, return dates, departure times, and other parameters.

Here is a simple example of the data extracted from the Google Flights website. Make sure to revisit the URL manually to find more flight-related data you want to extract, and add its CSS selector to the "extract_rules" dictionary.

[Image: Resulting CSV with extracted flight data]
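For example, if you spot an element such as the number of stops, you could extend the rules with one extra line – note that the selector below is a hypothetical placeholder, so inspect the page for the real class name:

# Sketch: extending extract_rules with an extra field
# "div.EXAMPLE-CLASS" is a hypothetical placeholder – inspect the page for the real selector
extract_rules["flights"]["output"]["stops"] = "div.EXAMPLE-CLASS"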

Why Use ScrapingBee for Google Flights

Scraping Google Flights manually or with tools like Selenium takes more time and requires additional downloads. Our API makes it simple: rotating proxies, CAPTCHA bypassing, and JavaScript rendering with headless Chrome all happen behind the scenes.

With just one Python package, you can extract structured flight data like airline names, prices, and travel times, all from live Google search result pages. It’s fast, reliable, and beginner-friendly.

Want to test it for yourself? Register for a ScrapingBee account to get a one-week free trial with 1,000 API credits, test our tools, and build your custom Google Flights scraper!

Frequently Asked Questions (FAQs)

Can I get real-time flight prices with ScrapingBee?

Yes. Our API fetches live data directly from Google Flights pages, including real-time prices, availability, and other dynamic content rendered with JavaScript.

How do I avoid being blocked when scraping Google Flights?

When you send a GET API call, parameters like automatic residential proxy rotation are enabled by default, preventing your IP from getting blocked. You don’t need to manage cookies, user agents, or solve captchas manually.
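If you still run into blocks on heavily protected pages, the API exposes further options. As a sketch based on ScrapingBee's standard request parameters, enabling premium proxies could look like this:

# Sketch: requesting premium proxies for a hard-to-reach page
response = client.get(
    url,
    params={
        'custom_google': 'True',
        'premium_proxy': 'True',   # route the request through premium proxies
        'country_code': 'us',      # optional: choose the proxy country
    },
)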

Can I extract structured data like departure times and airlines?

Yes. With our API's extract_rules dictionary, you can define CSS selectors to pull out exactly the elements you want, including times, airlines, and other relevant data.

How much does it cost to scrape Google Flights with ScrapingBee?

On our platform, your account starts with a 1,000-credit free trial. After that, pricing is based on the number of API requests, ensuring affordable scaling. For more details, check out our pricing page.

Kevin Sahin

Kevin worked in the web scraping industry for 10 years before co-founding ScrapingBee. He is also the author of the Java Web Scraping Handbook.