Can I use XPath selectors in BeautifulSoup?
What is XPath?
XPath is an expression language designed to support the query or transformation of XML documents. It was defined by the W3C and can be used to navigate through elements and attributes in an XML document.
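To give a feel for what such expressions look like, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`, which supports only a limited subset of XPath. The sample document and its element names are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for illustration
xml_doc = """
<blog>
  <post id="1" lang="en"><title>Intro to XPath</title></post>
  <post id="2" lang="fr"><title>Web scraping 101</title></post>
</blog>
"""

root = ET.fromstring(xml_doc)

# Navigate elements: select every <title> anywhere below the root
titles = [t.text for t in root.findall(".//title")]

# Navigate attributes: select the <post> whose id is "2", then read its lang
post = root.find(".//post[@id='2']")
post_lang = post.get("lang")

print(titles)
print(post_lang)
```

The first expression selects by element name, the second filters by an attribute value; full XPath engines such as the one in lxml accept much richer expressions than this subset.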
Can we use XPath with BeautifulSoup?
Technically, no: BeautifulSoup does not support XPath on its own. But we can combine BeautifulSoup4 with the lxml Python library to achieve that.
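To install lxml, all you have to do is run this command:
pip install lxml
And that's it! The code in this article also uses the requests and beautifulsoup4 packages, which can be installed the same way.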
And we can now run this code to extract ScrapingBee's blog title:
import requests
from bs4 import BeautifulSoup
from lxml import etree

response = requests.get("https://www.scrapingbee.com/blog/")
soup = BeautifulSoup(response.content, 'html.parser')
body = soup.find("body")
dom = etree.HTML(str(body))  # Parse the HTML content of the page
xpath_str = '//*[@id="content"]/section/div/div/h1'  # The XPath expression for the blog's title
matches = dom.xpath(xpath_str)  # xpath() returns a list of matching elements
print(matches[0].text if matches else None)  # None if the page layout has changed
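Because `xpath()` returns a list and a live page's layout can change, it can help to try the same approach on a small inline HTML string first. The following is only a sketch: the HTML snippet and the `first_text` helper are hypothetical and not part of either library.

```python
from bs4 import BeautifulSoup
from lxml import etree

# Hypothetical HTML, standing in for a downloaded page
html = """
<html><body>
  <div id="content"><section><div><div><h1>Sample Blog</h1></div></div></section></div>
</body></html>
"""

def first_text(dom, xpath_str):
    """Return the text of the first element matching xpath_str, or None."""
    matches = dom.xpath(xpath_str)  # a list of matching elements, possibly empty
    return matches[0].text if matches else None

soup = BeautifulSoup(html, "html.parser")
dom = etree.HTML(str(soup))  # hand BeautifulSoup's markup over to lxml

title = first_text(dom, '//*[@id="content"]/section/div/div/h1')
missing = first_text(dom, '//*[@id="does-not-exist"]/h1')
print(title, missing)
```

Returning `None` for an empty result keeps the script from raising an `IndexError` when the expression no longer matches anything.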