Python Web Scraping: A Beginner's Tutorial

Author: 飞吻狂魔 | Published: 2026-01-25

# A simple introductory Python web scraper

import requests
from bs4 import BeautifulSoup

# Send the HTTP request (the timeout keeps the script from hanging on a slow server)
url = 'https://example.com'
response = requests.get(url, timeout=10)

# Check whether the request succeeded
if response.status_code == 200:
    # Parse the HTML content with BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find every <h1> heading tag
    titles = soup.find_all('h1')

    # Print the text of each heading
    for title in titles:
        print(title.get_text())
else:
    print("Request failed, status code:", response.status_code)

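Explanation:

Importing the libraries: the requests library sends the HTTP request, and BeautifulSoup (from the bs4 package) parses the HTML.
Sending the request: requests.get(url) sends a GET request to the target site and returns the response.
Checking the status code: response.status_code shows whether the request succeeded; a status code of 200 means success.
Parsing the HTML: BeautifulSoup parses the HTML in the response, and find_all('h1') collects every <h1> tag.
Printing the results: get_text() returns the text of each heading, which is then printed.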
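
The parsing step can be tried without any network access. The following is a minimal sketch, assuming a made-up HTML string (SAMPLE_HTML) and a hypothetical helper named extract_headings; neither appears in the original script, they only illustrate how find_all and get_text behave.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used so the sketch runs without network access
SAMPLE_HTML = """
<html>
  <body>
    <h1>First title</h1>
    <p>Some text.</p>
    <h1> Second title </h1>
  </body>
</html>
"""


def extract_headings(html, tag="h1"):
    """Return the text of every <tag> element in the given HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    # strip=True trims leading and trailing whitespace from each heading
    return [element.get_text(strip=True) for element in soup.find_all(tag)]


titles = extract_headings(SAMPLE_HTML)
for title in titles:
    print(title)
```

Keeping the parsing in its own function makes it easy to check against saved HTML before pointing the script at a live site.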

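The script at the top of this tutorial is a minimal but complete crawler: it fetches a page, parses the HTML, and extracts the elements of interest.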
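
The script only handles the case where a response arrives; a connection error or an invalid URL would raise an exception instead. One possible way to cover those cases is sketched below. The helper name fetch_html and the 10-second timeout are assumptions made for illustration, and raise_for_status() is used in place of the explicit comparison with 200, so any 4xx or 5xx response is treated as a failure.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page body as text, or None if the request fails."""
    try:
        response = requests.get(url, timeout=timeout)
        # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses
        response.raise_for_status()
    except requests.RequestException as error:
        # RequestException is the base class for the errors requests raises
        print("Request failed:", error)
        return None
    return response.text
```

A caller would check the return value for None before handing the text to BeautifulSoup, for example after html = fetch_html('https://example.com').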
