Selenium: Setting a Proxy and an Authenticated Proxy
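Mimvp Proxy (米扑代理) describes itself as a world-leading proxy brand with nearly ten years in the proxy industry. It offers open, private, and dedicated proxies, and a free trial is available.
Mimvp Proxy website: https://proxy.mimvp.com
The examples in this article were developed specifically for Mimvp Proxy's private, dedicated, and open proxies.
They cover three kinds of http and https proxy access: no password, whitelisted IP, and username/password authorization.
The xpi plugin used in the examples can be downloaded from the Mimvp Proxy website or from Mimvp's official GitHub.
This article gives the complete code directly. All of it has been verified to work; see the comments for details.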
Environment used for the examples in this article:
MacBook Pro, macOS High Sierra Version 10.13.4
Google Chrome Version 63.0.3239.84 (Official Build) (64-bit)
Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 12:39:47)
$ pip list | grep selenium
selenium (3.4.2)
chromedriver download: http://chromedriver.storage.googleapis.com/index.html
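Python + Selenium + Chrome
Error message: WebDriverException: 'chromedriver' executable needs to be in PATH
Solution:
a. Download ChromeDriver; for other browsers, see the official documentation.
b. Copy the chromedriver file into the Google Chrome program directory, or into a directory on your PATH:
cp chromedriver /usr/local/bin/
The install locations for each operating system are listed in the official Wiki.
Alternatively, pass the chromedriver path when creating the webdriver object in your Python code, as in the examples that follow.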
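Example 1: macOS + Chrome
from selenium import webdriver

url = 'https://proxy.mimvp.com/ip.php'    # page to fetch; it returns the visitor's IP
chromedriver = "/Applications/Google Chrome.app/Contents/MacOS/chromedriver"
browser = webdriver.Chrome(executable_path=chromedriver)    # open the Chrome browser
browser.get(url)
content = browser.page_source
print("content: " + str(content))
browser.quit()    # close the browser when done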
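Example 2: macOS + environment variable (chromedriver on the PATH)
from selenium import webdriver
from pyvirtualdisplay import Display

def spider_url_chrome(url):
    browser = None
    display = None
    try:
        # run the browser inside a virtual display
        display = Display(visible=0, size=(800, 600))
        display.start()
        chromedriver = '/usr/local/bin/chromedriver'
        browser = webdriver.Chrome(executable_path=chromedriver)    # open the Chrome browser
        browser.get(url)
        content = browser.page_source
        print("content: " + str(content))
    finally:
        # always release the browser and the virtual display
        if browser: browser.quit()
        if display: display.stop()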
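Both examples hard-code the chromedriver location. As a minimal sketch, assuming Python 3 (for `shutil.which`) rather than the Python 2.7 used above, the lookup could be done once before creating the driver. `find_chromedriver` and its default candidate list are hypothetical helpers, not part of Selenium.

```python
import os
import shutil


def find_chromedriver(candidates=("/usr/local/bin/chromedriver",), name="chromedriver"):
    """Return a usable chromedriver path, or None if none is found.

    Looks on the PATH first, then falls back to the explicit candidate paths.
    """
    on_path = shutil.which(name)
    if on_path:
        return on_path
    for path in candidates:
        # accept only regular files that are marked executable
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None
```

The returned path can be passed as `executable_path` in the examples above. A `None` result corresponds to the situation that raises the WebDriverException shown earlier.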
Selenium + chromedriver with a proxy: no password, or whitelisted IP
## webdriver + chrome + proxy + whiteip (no password, or whitelisted-IP authorization)
## Mimvp Proxy: https://proxy.mimvp.com
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from pyvirtualdisplay import Display

def spider_url_chrome_by_whiteip(url):
    browser = None
    display = None
    ## For the IP whitelist, see the Mimvp Proxy member center: https://proxy.mimvp.com/usercenter/userinfo.php?p=whiteip
    mimvp_proxy = {
        'ip'            : '140.143.62.84',      # ip
        'port_https'    : 62288,                # http, https
        'port_socks'    : 62287,                # socks5
        'username'      : 'mimvp-user',
        'password'      : 'mimvp-pass'
    }
    try:
        display = Display(visible=0, size=(800, 600))
        display.start()
        chrome_options = Options()                  # ok
        chrome_options = webdriver.ChromeOptions()  # ok (equivalent; either line alone is enough)
        # http, https (no password, or whitelisted IP: works)
        proxy_https_argument = '--proxy-server=http://{ip}:{port}'.format(ip=mimvp_proxy['ip'], port=mimvp_proxy['port_https'])
        chrome_options.add_argument(proxy_https_argument)
        # socks5 (no password, or whitelisted IP: fails)
        # proxy_socks_argument = '--proxy-server=socks5://{ip}:{port}'.format(ip=mimvp_proxy['ip'], port=mimvp_proxy['port_socks'])
        # chrome_options.add_argument(proxy_socks_argument)
        chromedriver = '/usr/local/bin/chromedriver'
        browser = webdriver.Chrome(executable_path=chromedriver, chrome_options=chrome_options)    # open the Chrome browser
        browser.get(url)
        content = browser.page_source
        print("content: " + str(content))
    finally:
        if browser: browser.quit()
        if display: display.stop()
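The only proxy-specific step in the function above is building the `--proxy-server` switch. A small sketch of that step in isolation follows; `build_proxy_argument` is a hypothetical helper, and the scheme whitelist and port check are assumptions added for illustration rather than anything Chrome or Selenium requires of the caller.

```python
def build_proxy_argument(ip, port, scheme="http"):
    """Format the Chrome --proxy-server switch for an ip:port pair."""
    if scheme not in ("http", "https", "socks4", "socks5"):
        raise ValueError("unsupported proxy scheme: %r" % (scheme,))
    port = int(port)
    if not 0 < port < 65536:
        raise ValueError("port out of range: %d" % port)
    return "--proxy-server={scheme}://{ip}:{port}".format(scheme=scheme, ip=ip, port=port)
```

For the whitelisted-IP case above, `chrome_options.add_argument(build_proxy_argument('140.143.62.84', 62288))` would add the same switch as the hand-formatted string.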
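Selenium + chromedriver with a proxy: http and https with username and password
This example uses Mimvp Proxy's username/password authorization.
To obtain the username and password, go to Mimvp Proxy > Member Center > Whitelist IP.
1. Create a zip package named proxy.zip containing the following two files, background.js and manifest.json.
1) background.js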
var config = {
    mode: "fixed_servers",
    rules: {
        singleProxy: {
            scheme: "http",
            host: "140.143.62.84",
            port: 19480
        },
        bypassList: ["mimvp.com"]
    }
};
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
function callbackFn(details) {
    return {
        authCredentials: {
            username: "mimvp-user",
            password: "mimvp-pass"
        }
    };
}
chrome.webRequest.onAuthRequired.addListener(
    callbackFn,
    {urls: ["<all_urls>"]},
    ['blocking']
);
Note: in the configuration above, replace the proxy ip, port, username, and password with your Mimvp Proxy ip:port and your authorization username and password.
2) manifest.json
{
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": [
        "proxy",
        "tabs",
        "unlimitedStorage",
        "storage",
        "<all_urls>",
        "webRequest",
        "webRequestBlocking"
    ],
    "background": {
        "scripts": ["background.js"]
    },
    "minimum_chrome_version": "22.0.0"
}
Note: the manifest above needs no changes; copy it as is.
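Step 1 can also be scripted, which helps when the proxy address or credentials change often. The following is a minimal sketch, assuming Python 3 and only the standard library; `build_background_js`, `proxy_zip_bytes`, and `write_proxy_zip` are hypothetical helper names, and the file contents mirror the two files shown above.

```python
import io
import json
import zipfile

# Same manifest as shown above (Manifest V2).
MANIFEST = {
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": [
        "proxy", "tabs", "unlimitedStorage", "storage",
        "<all_urls>", "webRequest", "webRequestBlocking",
    ],
    "background": {"scripts": ["background.js"]},
    "minimum_chrome_version": "22.0.0",
}

# background.js template; the two %s slots take the proxy config and credentials
BACKGROUND_JS = """var config = %s;
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
function callbackFn(details) {
    return {authCredentials: %s};
}
chrome.webRequest.onAuthRequired.addListener(
    callbackFn,
    {urls: ["<all_urls>"]},
    ['blocking']
);
"""


def build_background_js(host, port, username, password, scheme="http"):
    """Render background.js with the proxy address and credentials filled in."""
    config = {
        "mode": "fixed_servers",
        "rules": {
            "singleProxy": {"scheme": scheme, "host": host, "port": int(port)},
            "bypassList": ["mimvp.com"],
        },
    }
    credentials = {"username": username, "password": password}
    # json.dumps produces literals that are also valid JavaScript and escapes quotes
    return BACKGROUND_JS % (json.dumps(config, indent=4), json.dumps(credentials))


def proxy_zip_bytes(host, port, username, password):
    """Build the extension archive in memory and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("manifest.json", json.dumps(MANIFEST, indent=4))
        zf.writestr("background.js", build_background_js(host, port, username, password))
    return buf.getvalue()


def write_proxy_zip(path, host, port, username, password):
    """Write the extension archive to `path` (for example "proxy.zip")."""
    with open(path, "wb") as fh:
        fh.write(proxy_zip_bytes(host, port, username, password))
    return path
```

Calling `write_proxy_zip("proxy.zip", "140.143.62.84", 19480, "mimvp-user", "mimvp-pass")` would produce an archive with the same two files as the manual steps. The credentials end up in plain text inside the zip, so the file should be treated like a password.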
2. Add proxy.zip to Chrome as an extension
#!/usr/bin/env python
# -*- coding:utf-8 -*-
from selenium import webdriver
from selenium.webdriver.common.proxy import *
from selenium.webdriver.chrome.options import Options
from pyvirtualdisplay import Display
# from xvfbwrapper import Xvfb

def spider_url_chrome_by_https(url):
    browser = None
    display = None
    try:
        display = Display(visible=0, size=(800, 600))
        display.start()
        chrome_options = Options()
        chrome_options.add_extension("proxy.zip")    # load the proxy extension built in step 1
        chromedriver = '/usr/local/bin/chromedriver'
        browser = webdriver.Chrome(executable_path=chromedriver, chrome_options=chrome_options)    # open the Chrome browser
        browser.get(url)
        content = browser.page_source
        print("content: " + str(content))
    finally:
        if browser: browser.quit()
        if display: display.stop()

if __name__ == '__main__':
    # only the last assignment takes effect; the earlier URLs are alternatives
    url = 'https://ip.cn'
    url = 'https://mimvp.com/'
    url = 'https://proxy.mimvp.com/ip.php'
    # http, https with password authorization: works
    spider_url_chrome_by_https(url)
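The script above uses the Selenium 3 keyword arguments `executable_path` and `chrome_options`. Selenium 4 deprecated them in favor of a `Service` object and the `options` keyword, and later 4.x releases removed them. A minimal sketch of the same setup under that assumption (Selenium 4 installed, not the 3.4.2 used in this article) follows. `make_chrome` is a hypothetical helper name, and the imports sit inside the function only so that the snippet can be loaded without Selenium present.

```python
def make_chrome(chromedriver_path, extension_path=None, proxy_argument=None):
    """Create a Chrome driver using the Selenium 4 style API."""
    # imported lazily: this sketch assumes Selenium 4 is installed
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    if extension_path:
        # e.g. the proxy.zip built in step 1 (username/password proxy)
        options.add_extension(extension_path)
    if proxy_argument:
        # e.g. '--proxy-server=http://140.143.62.84:62288' (whitelisted IP)
        options.add_argument(proxy_argument)
    service = Service(executable_path=chromedriver_path)
    return webdriver.Chrome(service=service, options=options)
```

With this helper, `browser = make_chrome('/usr/local/bin/chromedriver', extension_path='proxy.zip')` would stand in for the two lines that build `chrome_options` and call `webdriver.Chrome` in the script above.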
3. Result: verification succeeded
content: <html xmlns="http://www.w3.org/1999/xhtml"><head></head><body>140.143.62.84</body></html>
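The page body in this output is the IP address that the server saw, so a match with the proxy IP indicates the request went through the proxy. A minimal sketch of that check follows, using the hypothetical helper names `extract_ip` and `proxy_in_use` and assuming the page contains the address as plain IPv4 text, as in the output above.

```python
import re

# four dot-separated groups of 1 to 3 digits; a loose match, not a strict validator
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_ip(page_source):
    """Return the first IPv4-looking string in the page, or None."""
    match = IPV4_RE.search(page_source)
    return match.group(0) if match else None


def proxy_in_use(page_source, proxy_ip):
    """True when the page reports the proxy's IP as the visitor address."""
    return extract_ip(page_source) == proxy_ip
```

In the script above, `proxy_in_use(content, '140.143.62.84')` could replace a visual check of the printed page source.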
Reposted from: https://blog.csdn.net/ithomer/article/details/81091337