There are several ways to retrieve the links on a web page using Python. Here are two common methods:
Method 1: requests + BeautifulSoup (static pages)

import requests
from bs4 import BeautifulSoup

url = 'https://www.example.com'
html = requests.get(url).text
soup = BeautifulSoup(html, 'html.parser')

# Find all links on the page; href=True skips <a> tags with no href attribute
links = []
for link in soup.find_all('a', href=True):
    links.append(link.get('href'))
print(links)
Method 2: Selenium (JavaScript-rendered pages)

from selenium import webdriver
from selenium.webdriver.common.by import By

url = 'https://www.example.com'
# Selenium 4 locates the browser driver automatically via Selenium Manager,
# so no chromedriver path is needed
driver = webdriver.Chrome()
driver.get(url)

# Find all links on the page (find_elements_by_tag_name was removed in Selenium 4)
links = []
for link in driver.find_elements(By.TAG_NAME, 'a'):
    links.append(link.get_attribute('href'))
print(links)
driver.quit()
Note that the requests and BeautifulSoup method requires less setup and avoids the overhead of launching a browser, but it only sees the static HTML returned by the server. If you need to scrape pages that require JavaScript rendering or user interaction to populate their content, Selenium is the better choice.
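With either method, the collected hrefs appear exactly as they were written in the HTML, so many of them will be relative paths or fragment anchors. A small helper built on the standard library's urljoin can resolve them against the page URL; normalize_links is a hypothetical helper name for illustration:

```python
from urllib.parse import urljoin

def normalize_links(base_url, hrefs):
    """Resolve relative hrefs against the page URL, dropping empty values and fragment-only anchors."""
    links = []
    for href in hrefs:
        if not href or href.startswith('#'):
            continue  # skip empty hrefs and same-page anchors like '#top'
        links.append(urljoin(base_url, href))
    return links

# Example: mixed relative, absolute, and anchor hrefs
print(normalize_links('https://www.example.com/docs/',
                      ['/about', 'page.html', '#top', 'https://other.com']))
```

urljoin handles root-relative paths (`/about` resolves against the site root) and leaves already-absolute URLs untouched, which makes the resulting list safe to feed back into requests or Selenium for crawling.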