Python code for scraping web page data

Author: 青灯寂焚 | Published: 2026-01-31

import requests
from bs4 import BeautifulSoup

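# Fetch a web page and print the text of its <h1> headings
def fetch_webpage_data(url):
    # Send an HTTP GET request; the timeout keeps the call from hanging indefinitely
    response = requests.get(url, timeout=10)

    # Check whether the request succeeded
    if response.status_code == 200:
        # Parse the HTML content with BeautifulSoup
        soup = BeautifulSoup(response.content, 'html.parser')

        # Example: extract all <h1> heading tags
        titles = soup.find_all('h1')

        # Print the text of each heading
        for title in titles:
            print(title.get_text())
    else:
        print("Request failed, status code:", response.status_code)

# Example URL
url = "https://example.com"

# Call the function with the URL
fetch_webpage_data(url)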

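Explanation:

Importing the libraries:
- requests: a third-party library for sending HTTP requests.
- BeautifulSoup: a third-party library (imported from the bs4 package) for parsing HTML content.
Defining the function fetch_webpage_data:
- Takes a URL as its only parameter.
- Sends an HTTP GET request with requests.get() to fetch the page content.
- Checks whether the response status code is 200 (the request succeeded).
- Parses the HTML content with BeautifulSoup.
- Extracts all <h1> tags and prints their text.
Example URL:
- url = "https://example.com" is an example URL; replace it with the address of the page you want to scrape.
Calling the function:
- Finally, fetch_webpage_data(url) is called to run the scrape.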
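
The steps above print their results directly, which makes them hard to reuse. As a minimal sketch, the variant below returns the heading texts as a list and reports network or HTTP errors instead of only checking for status 200. The function names extract_h1_texts and fetch_h1_texts and the default timeout of 10 seconds are assumptions made for illustration; they are not part of the original code.

```python
import requests
from bs4 import BeautifulSoup


def extract_h1_texts(html):
    """Return the stripped text of every <h1> element in an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    return [h1.get_text(strip=True) for h1 in soup.find_all("h1")]


def fetch_h1_texts(url, timeout=10):
    """Fetch a page and return its <h1> texts, or an empty list on failure."""
    try:
        response = requests.get(url, timeout=timeout)
        # raise_for_status() raises requests.HTTPError for 4xx/5xx responses
        response.raise_for_status()
    except requests.RequestException as exc:
        # Covers connection errors, timeouts, and HTTP error statuses
        print("Request failed:", exc)
        return []
    return extract_h1_texts(response.text)


if __name__ == "__main__":
    # Parsing can be tried offline on a literal HTML string (hypothetical sample)
    sample = "<html><body><h1> First </h1><p>text</p><h1>Second</h1></body></html>"
    print(extract_h1_texts(sample))
```

Keeping the parsing step in its own function means it can be checked against a fixed HTML string without touching the network, while fetch_h1_texts only adds the request and error handling around it.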

This example shows how to fetch a web page with Python and parse its content.
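
The same parsing approach works for elements other than <h1>. As a hedged illustration, the sketch below collects the links on a page and resolves relative addresses against the page URL. The function name extract_links and the sample HTML are hypothetical and chosen only for this example.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return absolute URLs for every <a> tag that has an href attribute."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips <a> tags that have no href attribute
    for a in soup.find_all("a", href=True):
        # urljoin turns relative paths such as "/about" into absolute URLs
        links.append(urljoin(base_url, a["href"]))
    return links


if __name__ == "__main__":
    sample = '<a href="/about">About</a><a>no href</a>'
    print(extract_links(sample, "https://example.com/index.html"))
```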
