There are several ways to view the results of web scraping with BeautifulSoup. Here are three options:

Print the parsed HTML using the prettify() method: prettify() returns the entire HTML document as an indented string, which is easier to read when printed.

from bs4 import BeautifulSoup
import requests

# make a request to the website
url = 'https://www.example.com'
response = requests.get(url)

# parse the HTML using BeautifulSoup
soup = BeautifulSoup(response.text, 'html.parser')

# print the parsed HTML
print(soup.prettify())
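
To see what prettify() does without making a network request, here is a minimal sketch that parses a small inline HTML string. The `html` string is made up for illustration and is not from any real page.

```python
from bs4 import BeautifulSoup

# hypothetical one-line HTML snippet, used instead of a live page
html = '<html><body><h1>Title</h1><p>Some text.</p></body></html>'

soup = BeautifulSoup(html, 'html.parser')

# prettify() returns a string with each tag and text node on its own line,
# indented according to nesting depth
pretty = soup.prettify()
print(pretty)

# the raw input was a single line; the prettified version spans several
print(len(html.splitlines()), 'line before,', len(pretty.splitlines()), 'lines after')
```

Because prettify() adds whitespace, it is best treated as a viewing aid rather than an exact copy of the original markup.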

Select and print specific elements using tag names, class names, or IDs: This will print only the specified elements and their content.

from bs4 import BeautifulSoup
import requests

# make a request to the website
url = 'https://www.example.com'
response = requests.get(url)

# parse the HTML using BeautifulSoup
soup = BeautifulSoup(response.text, 'html.parser')

# find a specific tag and print its content
# (find() returns None when nothing matches, so check first)
header = soup.find('h1')
if header is not None:
    print(header.text)

# find all tags with a specific class and print their content
paragraphs = soup.find_all('p', {'class': 'intro'})
for p in paragraphs:
    print(p.text)

# find an element by ID and print one of its attributes
logo = soup.find('img', {'id': 'logo'})
if logo is not None:
    print(logo.get('src'))
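
The same tag, class, and ID lookups can also be written with CSS selectors through select() and select_one(). The following is a rough sketch on an inline HTML string; the markup, the `intro` class, and the `logo` ID are assumptions chosen to mirror the example above, not taken from a real site.

```python
from bs4 import BeautifulSoup

# hypothetical HTML used for illustration only
html = '''
<html><body>
  <h1>Welcome</h1>
  <p class="intro">First intro paragraph.</p>
  <p class="intro">Second intro paragraph.</p>
  <p>Not an intro.</p>
  <img id="logo" src="/static/logo.png">
</body></html>
'''

soup = BeautifulSoup(html, 'html.parser')

# select_one() returns the first match or None, like find()
header = soup.select_one('h1')
header_text = header.get_text(strip=True) if header else None

# select() returns a list of all matches, like find_all()
intro_texts = [p.get_text(strip=True) for p in soup.select('p.intro')]

# 'img#logo' matches the <img> element whose id is "logo"
logo = soup.select_one('img#logo')
logo_src = logo.get('src') if logo else None

print(header_text)
print(intro_texts)
print(logo_src)
```

Testing against a fixed string like this is a convenient way to check your selectors before pointing the script at a live page.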

Save the parsed data to a file: This lets you open and inspect the data (for example, in a browser or text editor) after your Python script has finished running.

from bs4 import BeautifulSoup
import requests

# make a request to the website
url = 'https://www.example.com'
response = requests.get(url)

# parse the HTML using BeautifulSoup
soup = BeautifulSoup(response.text, 'html.parser')

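# save the parsed data to a file
# (an explicit encoding avoids errors when the page contains non-ASCII characters)
with open('output.html', 'w', encoding='utf-8') as file:
    file.write(soup.prettify())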
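
If you want to confirm that the saved file contains what you expect, one option is to read it back and parse it again. This is only a sketch: the inline HTML and the temporary output path are assumptions for illustration, so nothing is written to your project directory.

```python
import os
import tempfile
from bs4 import BeautifulSoup

# hypothetical HTML containing a non-ASCII character
html = '<html><body><h1>Café menu</h1></body></html>'
soup = BeautifulSoup(html, 'html.parser')

# write to a temporary directory so the sketch does not touch your own files
out_path = os.path.join(tempfile.mkdtemp(), 'output.html')

# an explicit encoding avoids platform-dependent defaults
with open(out_path, 'w', encoding='utf-8') as file:
    file.write(soup.prettify())

# read the file back and parse it again to confirm the content survived
with open(out_path, encoding='utf-8') as file:
    reloaded = BeautifulSoup(file.read(), 'html.parser')

heading = reloaded.find('h1').get_text(strip=True)
print(heading)
```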

Once you have the parsed data, you can also extract specific information from it, clean it, and analyze it, depending on your needs.
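
As one possible illustration of the extract, clean, and analyze steps, the sketch below pulls links out of an inline HTML list, tidies their text, and counts them by the first segment of their path. The HTML and the grouping rule are made up for the example; your own data will need its own logic.

```python
from collections import Counter
from bs4 import BeautifulSoup

# hypothetical HTML used for illustration only
html = '''
<ul>
  <li><a href="/docs/intro">  Intro </a></li>
  <li><a href="/docs/setup">Setup</a></li>
  <li><a href="/blog/news">News</a></li>
  <li><a>No link here</a></li>
</ul>
'''
soup = BeautifulSoup(html, 'html.parser')

# extract: href=True keeps only <a> tags that have an href attribute
links = soup.find_all('a', href=True)

# clean: strip surrounding whitespace from the link text
records = [{'text': a.get_text(strip=True), 'href': a['href']} for a in links]

# analyze: count links by the first segment of their path
sections = Counter(r['href'].strip('/').split('/')[0] for r in records)

print(records)
print(sections)
```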