How to turn HTML to text in Python?

You can extract text from an HTML page with any of the popular Python HTML parsing libraries. Here is an example that uses BeautifulSoup's get_text() method:

from bs4 import BeautifulSoup

soup = BeautifulSoup("""
<body>
    <h1 class="product">Product Details</h1>
    <div class="details">
        <div>Remaining Stock</div>
        <div>5</div>
    </div>
</body>
""", "html.parser")  # name the parser explicitly to avoid bs4's "no parser specified" warning

# Locate the <body> element and join all of the text nodes inside it
body = soup.find('body')
body_text = body.get_text()
print(body_text)

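It will produce output along the following lines. Because get_text() keeps the whitespace between tags, the exact result also includes the indentation and blank lines of the source markup: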


Product Details

Remaining Stock
5
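
If the extra whitespace is unwanted, get_text() accepts separator and strip arguments. The following is a minimal sketch that reuses the same sample markup; the variable names html and clean_text are only illustrative:

```python
from bs4 import BeautifulSoup

html = """
<body>
    <h1 class="product">Product Details</h1>
    <div class="details">
        <div>Remaining Stock</div>
        <div>5</div>
    </div>
</body>
"""

soup = BeautifulSoup(html, "html.parser")

# strip=True trims each text fragment and drops whitespace-only ones;
# separator="\n" puts every remaining fragment on its own line.
clean_text = soup.find("body").get_text(separator="\n", strip=True)
print(clean_text)
# Product Details
# Remaining Stock
# 5
```

This form is usually easier to post-process, since each line corresponds to one piece of text from the page.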

Selenium offers something similar: the .text property of a WebElement returns the text of that element as it is rendered on the page.
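
As a rough sketch of the Selenium approach, assuming Selenium 4 with Chrome and a matching driver available locally: the helper name page_text and its parameters are hypothetical and not part of Selenium itself.

```python
def page_text(url: str, css_selector: str = "body") -> str:
    """Return the rendered text of the first element matching css_selector."""
    # Imported here so the sketch only needs Selenium when it is actually called.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumption: Chrome and its driver are installed
    try:
        driver.get(url)
        element = driver.find_element(By.CSS_SELECTOR, css_selector)
        # .text holds only the text that is visible on the rendered page
        return element.text
    finally:
        driver.quit()  # always close the browser, even if lookup fails
```

Calling page_text with a page URL would open a browser, load the page, and return the text of its body element. Unlike BeautifulSoup, this reflects the page after JavaScript has run.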
