How to extract data from a website using Selenium and Python?

 

You can use Selenium to scrape data from specific elements of a web page. Let's take the same example from our previous post: How to web scrape with python selenium?

In that post, we used this Python code (with Selenium) to give the content time to load by adding a fixed delay:

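from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
import time

options = Options()
options.add_argument("--headless=new") # Run Chrome without a visible window

# Selenium 4 takes the driver path through a Service object (executable_path was removed)
driver = webdriver.Chrome(service=Service("PATH_TO_CHROMEDRIVER"), options=options) # Setting up the Chrome driver
driver.get("https://demo.scrapingbee.com/content_loads_after_5s.html")
time.sleep(6) # Sleep for 6 seconds so the delayed content has time to appear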
print(driver.page_source)
driver.quit()

This gave us the following result:

<!DOCTYPE html>
<html>
...

<div id="content">This is content</div>

...
</html>

Now we can improve the code so that it extracts only the content we want instead of printing the whole HTML source. The page still has to load in the browser, but we read just one element from it. To do that, we can run this code:

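from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time

options = Options()
options.add_argument("--headless=new") # Run Chrome without a visible window

# Selenium 4 takes the driver path through a Service object (executable_path was removed)
driver = webdriver.Chrome(service=Service("PATH_TO_CHROMEDRIVER"), options=options) # Setting up the Chrome driver
driver.get("https://demo.scrapingbee.com/content_loads_after_5s.html")
time.sleep(6) # Sleep for 6 seconds so the delayed content has time to appear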
element = driver.find_element(By.ID, 'content')
print(element.text)
driver.quit()

This prints only the element's text, This is content, instead of the page's full HTML code.
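
A fixed time.sleep(6) always waits the full six seconds, and it still comes up empty-handed if the page is slower than expected. As a rough sketch of an alternative, the helper below polls the page until the element has text or a timeout passes. Note that wait_for_text is a hypothetical name and not part of Selenium, and the sketch assumes the element is missing or empty until the content loads. Selenium also ships its own WebDriverWait class for this purpose, which is likely the better choice in real code; the helper only illustrates the idea.

```python
import time


def wait_for_text(driver, by, value, timeout=10.0, poll=0.5):
    """Return the text of the first matching element once it is non-empty.

    Hypothetical helper (not a Selenium API). It relies on
    driver.find_elements(), which returns an empty list instead of
    raising while nothing matches the locator.
    """
    deadline = time.monotonic() + timeout
    while True:
        matches = driver.find_elements(by, value)
        if matches and matches[0].text:
            return matches[0].text
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no text in element {by}={value!r} after {timeout}s")
        time.sleep(poll)  # Short pause before checking the page again


# Usage with the driver from the example above, in place of time.sleep(6):
#   print(wait_for_text(driver, By.ID, "content"))
```

With this approach the script continues as soon as the content appears, and a page that never loads produces a clear TimeoutError instead of an empty result.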

For more information about Python and Selenium, see this in-depth blog article: Web Scraping using Selenium and Python
