Proxy error 407 in a Scrapy crawler under Python

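During data collection our programs regularly get status codes back. Every response to an HTTP request carries a status code, and that code tells us what the response means. Today's topic is 407 (Proxy Authentication Required). A crawler does not normally see 407; it usually shows up only after the program has been put behind a proxy, for example after adding the crawler tunnel proxy provided by 16yun (亿牛云) to our crawler. With the middleware below in place, the program reported a 407 error when it ran.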
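To make the meaning of the code concrete, the short sketch below looks up 407 in Python's standard `http.HTTPStatus` enum. The `describe_status` helper is a hypothetical name used only for illustration; it is not part of Scrapy.

```python
from http import HTTPStatus


def describe_status(code):
    """Return the standard reason phrase for an HTTP status code."""
    # HTTPStatus(code) raises ValueError for codes it does not know.
    return HTTPStatus(code).phrase


if __name__ == "__main__":
    print(407, describe_status(407))
```

The phrase for 407 says that the proxy, not the target website, is asking for credentials, which is why the error only appears once a proxy is configured.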

    #! -*- encoding:utf-8 -*-
    import base64
    import sys
    import random

    PY3 = sys.version_info[0] >= 3


    def base64ify(bytes_or_str):
        # Encode a str (Python 3) or bytes value as Base64 text
        if PY3 and isinstance(bytes_or_str, str):
            input_bytes = bytes_or_str.encode('utf8')
        else:
            input_bytes = bytes_or_str

        output_bytes = base64.urlsafe_b64encode(input_bytes)
        if PY3:
            return output_bytes.decode('ascii')
        else:
            return output_bytes


    class ProxyMiddleware(object):
        def process_request(self, request, spider):
            # Proxy server (product website: www.16yun.cn)
            proxyHost = "t.16yun.cn"
            proxyPort = "31111"

            # Proxy credentials
            proxyUser = "username"
            proxyPass = "password"

            # Scrapy >= 2.6.2 (https://docs.scrapy.org/en/latest/news.html?highlight=2.6.2#scrapy-2-6-2-2022-07-25):
            # no authentication header needs to be added; Proxy-Authorization is set automatically
            request.meta['proxy'] = "http://{0}:{1}@{2}:{3}".format(proxyUser, proxyPass, proxyHost, proxyPort)

            # Version
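The middleware above never calls `base64ify`, and its last comment stops short. One plausible use, offered here only as an assumption, is building the `Proxy-Authorization` header by hand on Scrapy versions older than 2.6.2. The sketch below illustrates that idea. `LegacyProxyMiddleware`, `basic_proxy_auth` and `FakeRequest` are hypothetical names, the credentials are placeholders, and the standard Base64 alphabet is used instead of the URL-safe one because HTTP Basic authentication (RFC 7617) specifies it.

```python
import base64


def basic_proxy_auth(user, password):
    """Build the value of a Proxy-Authorization header (Basic scheme)."""
    token = base64.b64encode("{0}:{1}".format(user, password).encode("utf8"))
    return "Basic " + token.decode("ascii")


class LegacyProxyMiddleware(object):
    """Hypothetical downloader middleware for Scrapy < 2.6.2 (assumption)."""

    proxy_host = "t.16yun.cn"
    proxy_port = "31111"
    proxy_user = "username"  # placeholder
    proxy_pass = "password"  # placeholder

    def process_request(self, request, spider):
        # Point the request at the proxy without embedding credentials in the URL.
        request.meta["proxy"] = "http://{0}:{1}".format(
            self.proxy_host, self.proxy_port)
        # Send the credentials explicitly; a proxy that requires
        # authentication answers 407 when they are missing or wrong.
        request.headers["Proxy-Authorization"] = basic_proxy_auth(
            self.proxy_user, self.proxy_pass)


class FakeRequest(object):
    """Stand-in for scrapy.Request so the sketch runs without Scrapy."""

    def __init__(self):
        self.meta = {}
        self.headers = {}


if __name__ == "__main__":
    req = FakeRequest()
    LegacyProxyMiddleware().process_request(req, spider=None)
    print(req.meta["proxy"])
    print(req.headers["Proxy-Authorization"])
```

In a real project a middleware like this only takes effect once it is enabled through the `DOWNLOADER_MIDDLEWARES` setting in `settings.py`. If 407 persists, the username, password and installed Scrapy version are the first things worth checking.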
