What does BeautifulSoup do in Python?
BeautifulSoup parses HTML, allowing you to extract information from it.
When doing web scraping, you will usually not be interested in the HTML on the page, but in the underlying data. This is where BeautifulSoup comes into play.
BeautifulSoup parses that HTML into a tree of Python objects that you can search and navigate to pull out the data you're interested in. Here is a quick example of how to extract the title of a webpage:
    import requests
    from bs4 import BeautifulSoup

    response = requests.get("https://news.ycombinator.com/")
    soup = BeautifulSoup(response.content, 'html.parser')

    # The title tag of the page
    print(soup.title)
    # > <title>Hacker News</title>

    # The title of the page as string
    print(soup.title.string)
    # > Hacker News
If you want to learn more about BeautifulSoup and how to extract links, custom attributes, siblings and more, feel free to check our BeautifulSoup tutorial.
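As a small taste of those features, here is a minimal sketch that extracts links, a custom attribute, and a sibling element. It parses an inline HTML string so that it runs without a network request; the markup, the `data-id` attribute, and the variable names are made up for illustration and do not come from any real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for this illustration
html = """
<html><head><title>Example page</title></head>
<body>
<ul>
<li><a href="/first" data-id="1">First</a></li>
<li><a href="/second" data-id="2">Second</a></li>
</ul>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Links: collect the href attribute of every <a> tag
links = [a["href"] for a in soup.find_all("a")]

# Custom attributes: .get() returns None if the attribute is missing
data_ids = [a.get("data-id") for a in soup.find_all("a")]

# Siblings: move from the first <li> to the next <li> at the same level
first_li = soup.find("li")
next_li = first_li.find_next_sibling("li")
sibling_text = next_li.get_text()

print(links)
print(data_ids)
print(sibling_text)
```

Indexing a tag with `a["href"]` raises a `KeyError` when the attribute is absent, so `.get()` is usually the safer choice on real pages where the markup is less predictable.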