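Mimvp Proxy is a leading global proxy brand that has focused on the proxy industry for nearly ten years. It offers open, private, and dedicated proxies, with a free trial available.

Mimvp Proxy official site: https://proxy.mimvp.com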

 

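The examples in this article were developed specifically for Mimvp Proxy's private, dedicated, and open proxies. They support three authorization types for http and https: no password, whitelisted IP, and username/password.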

 

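The xpi plugin used in the examples can be downloaded from the Mimvp Proxy official site or from Mimvp's official GitHub.

This article gives the complete code directly. All of it has been verified to work; see the comments for details.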

 

Runtime environment for the examples in this article:

MacBook Pro  MacOS High Sierra Version 10.13.4

Google Chrome  Version 63.0.3239.84 (Official Build) (64-bit)

Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 12:39:47) 

$ pip list | grep selenium
selenium (3.4.2)

 

chromedriver download: http://chromedriver.storage.googleapis.com/index.html

 

Python + Selenium + Chrome

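Error message: WebDriverException: 'chromedriver' executable needs to be in PATH

Solution:

a. Download ChromeDriver. For other browsers, see the official documentation.

b. Copy the chromedriver file into the Google Chrome program directory, or into a directory listed in the PATH environment variable:

cp chromedriver /usr/local/bin/

For the install locations on each operating system, refer to the official Wiki.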
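Before launching the browser, it can help to check whether chromedriver is actually reachable through PATH. The following is a minimal sketch, not part of the original article: the helper name `find_on_path` is hypothetical, and it walks PATH by hand so that it runs on both the Python 2.7 used here and on Python 3.

```python
import os


def find_on_path(name, path=None):
    """Return the full path of executable `name` found on PATH, or None.

    `path` defaults to the PATH environment variable; it can be overridden
    with a custom os.pathsep-separated string of directories.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        # Must be a regular file and executable by the current user
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    location = find_on_path("chromedriver")
    if location:
        print("chromedriver found at: " + location)
    else:
        print("chromedriver is not on PATH; copy it to a PATH directory such as /usr/local/bin/")
```

If the helper returns None, Selenium is likely to raise the WebDriverException shown above unless the driver path is passed explicitly.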

Alternatively, pass the chromedriver path when creating the webdriver object in Python code.

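Example 1: MacOS + chrome environment

from selenium import webdriver

url = 'https://proxy.mimvp.com/ip.php'      # test URL, taken from the full script later in this article
chromedriver = "/Applications/Google Chrome.app/Contents/MacOS/chromedriver"
browser = webdriver.Chrome(executable_path=chromedriver)        # open the Chrome browser
browser.get(url)
content = browser.page_source
print("content: " + str(content))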

 

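Example 2: MacOS + environment variable (PATH)

from selenium import webdriver
from pyvirtualdisplay import Display

def spider_url_chrome(url):
    browser = None
    display = None
    try:
        # run the browser inside a virtual (invisible) display
        display = Display(visible=0, size=(800, 600))
        display.start()
        chromedriver = '/usr/local/bin/chromedriver'
        browser = webdriver.Chrome(executable_path=chromedriver)        # open the Chrome browser
        browser.get(url)
        content = browser.page_source
        print("content: " + str(content))
    finally:
        # always release the browser and the display, even on error
        if browser: browser.quit()
        if display: display.stop()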

 

Selenium + chromedriver with a proxy: no password, or a whitelisted IP already configured

## webdriver + chrome + proxy + whiteip (no password, or whitelisted-IP authorization)
## Mimvp Proxy: https://proxy.mimvp.com
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from pyvirtualdisplay import Display

def spider_url_chrome_by_whiteip(url):
    browser = None
    display = None

    ## For the IP whitelist, see the Mimvp Proxy member center: https://proxy.mimvp.com/usercenter/userinfo.php?p=whiteip
    mimvp_proxy = {
                    'ip'            : '140.143.62.84',      # ip
                    'port_https'    : 62288,                # http, https
                    'port_socks'    : 62287,                # socks5
                    'username'      : 'mimvp-user',         # not used with whitelisted-IP authorization
                    'password'      : 'mimvp-pass'          # not used with whitelisted-IP authorization
                  }

    try:
        display = Display(visible=0, size=(800, 600))
        display.start()

        chrome_options = Options()                          # works
        # chrome_options = webdriver.ChromeOptions()        # equivalent alternative, also works
        proxy_https_argument = '--proxy-server=http://{ip}:{port}'.format(ip=mimvp_proxy['ip'], port=mimvp_proxy['port_https'])     # http, https (no password, or whitelisted IP: succeeded)
        chrome_options.add_argument(proxy_https_argument)
#         proxy_socks_argument = '--proxy-server=socks5://{ip}:{port}'.format(ip=mimvp_proxy['ip'], port=mimvp_proxy['port_socks'])   # socks5 (no password, or whitelisted IP: failed)
#         chrome_options.add_argument(proxy_socks_argument)

        chromedriver = '/usr/local/bin/chromedriver'
        browser = webdriver.Chrome(executable_path=chromedriver, chrome_options=chrome_options)        # open the Chrome browser
        browser.get(url)
        content = browser.page_source
        print("content: " + str(content))
    finally:
        if browser: browser.quit()
        if display: display.stop()
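The `--proxy-server` argument used above is just a formatted string, so it can be factored into a small helper and checked without starting a browser. This is an illustrative sketch only: the function name `build_proxy_argument` and its port validation are assumptions, not part of the original article or of Selenium.

```python
def build_proxy_argument(ip, port, scheme="http"):
    """Build the Chrome command-line argument that routes traffic through a proxy.

    The resulting string is meant to be passed to chrome_options.add_argument().
    """
    port = int(port)
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %d" % port)
    return "--proxy-server={scheme}://{ip}:{port}".format(scheme=scheme, ip=ip, port=port)


if __name__ == "__main__":
    # Same ip and http/https port as the mimvp_proxy dictionary in the example
    print(build_proxy_argument("140.143.62.84", 62288))
```

Passing `scheme="socks5"` would produce the socks5 form of the argument, which the article reports as failing in this environment.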

 

 

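Selenium + chromedriver with a proxy: http and https with username and password

This example uses Mimvp Proxy's username/password authorization.

To obtain the authorization username and password, go to Mimvp Proxy - Member Center - Whitelist IP.

1. Create a zip package named proxy.zip that contains the following two files: background.js and manifest.json

1) background.js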

var config = {
    mode: "fixed_servers",
    rules: {
      singleProxy: {
        scheme: "http",
        host: "140.143.62.84",
        port: 19480
      },
      bypassList: ["mimvp.com"]
    }
  };

chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});

function callbackFn(details) {
    return {
        authCredentials: {
            username: "mimvp-user",
            password: "mimvp-pass"
        }
    };
}

chrome.webRequest.onAuthRequired.addListener(
        callbackFn,
        {urls: ["<all_urls>"]},
        ['blocking']
);

Note: in the configuration above, replace the proxy ip, port, username, and password with your Mimvp Proxy ip:port and your authorization username and password.

 

2) manifest.json

{
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": [
        "proxy",
        "tabs",
        "unlimitedStorage",
        "storage",
        "<all_urls>",
        "webRequest",
        "webRequestBlocking"
    ],
    "background": {
        "scripts": ["background.js"]
    },
    "minimum_chrome_version":"22.0.0"
}

Note: the configuration above needs no changes; copy it and use it as-is.
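The packaging step can also be scripted instead of done by hand. The sketch below writes both files into proxy.zip using Python's standard `zipfile` module. The function name `build_proxy_zip` and the compacted JavaScript template are assumptions made for illustration; the settings mirror the background.js and manifest.json shown in this article.

```python
import json
import zipfile

# Same content as the manifest.json listed in the article
MANIFEST = {
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": ["proxy", "tabs", "unlimitedStorage", "storage",
                    "<all_urls>", "webRequest", "webRequestBlocking"],
    "background": {"scripts": ["background.js"]},
    "minimum_chrome_version": "22.0.0",
}

# Compact version of background.js; the %(...)s slots are filled per proxy
BACKGROUND_TEMPLATE = """
var config = {
    mode: "fixed_servers",
    rules: {
      singleProxy: {scheme: "http", host: %(host)s, port: %(port)d},
      bypassList: ["mimvp.com"]
    }
};
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
chrome.webRequest.onAuthRequired.addListener(
    function(details) {
        return {authCredentials: {username: %(username)s, password: %(password)s}};
    },
    {urls: ["<all_urls>"]},
    ['blocking']
);
"""


def build_proxy_zip(zip_path, host, port, username, password):
    """Write manifest.json and background.js into a zip file at zip_path."""
    # json.dumps adds the quotes and escapes special characters for JavaScript
    background_js = BACKGROUND_TEMPLATE % {
        "host": json.dumps(host),
        "port": int(port),
        "username": json.dumps(username),
        "password": json.dumps(password),
    }
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("manifest.json", json.dumps(MANIFEST, indent=4))
        zf.writestr("background.js", background_js)
    return zip_path


if __name__ == "__main__":
    # Placeholder values from the article; replace with real credentials
    build_proxy_zip("proxy.zip", "140.143.62.84", 19480, "mimvp-user", "mimvp-pass")
```

Generating the archive per run makes it easier to switch proxies without editing background.js manually, at the cost of keeping the credentials in the Python script.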

 

2. Add proxy.zip to chrome as an extension

#!/usr/bin/env python
# -*- coding:utf-8 -*-

from selenium import webdriver
from selenium.webdriver.common.proxy import *
from selenium.webdriver.chrome.options import Options
from pyvirtualdisplay import Display
# from xvfbwrapper import Xvfb


def spider_url_chrome_by_https(url):
    browser = None
    display = None
    try:
        display = Display(visible=0, size=(800, 600))
        display.start()

        chrome_options = Options()
        chrome_options.add_extension("proxy.zip")       # load the proxy extension built in step 1

        chromedriver = '/usr/local/bin/chromedriver'
        browser = webdriver.Chrome(executable_path=chromedriver, chrome_options=chrome_options)        # open the Chrome browser
        browser.get(url)
        content = browser.page_source
        print("content: " + str(content))
    finally:
        if browser: browser.quit()
        if display: display.stop()


if __name__ == '__main__':
    # alternative test URLs; the last assignment is the one actually used
    url = 'https://ip.cn'
    url = 'https://mimvp.com/'
    url = 'https://proxy.mimvp.com/ip.php'

    # http, https with username/password authorization: succeeded
    spider_url_chrome_by_https(url)



3. Result of the run: verified successfully

content: <html xmlns="http://www.w3.org/1999/xhtml"><head></head><body>140.143.62.84</body></html>
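The output above shows the proxy IP in the page body, which is what confirms that the request went through the proxy. That check can be automated with a short sketch like the one below; the helper names `extract_ip` and `is_using_proxy` are hypothetical, and the regular expression only looks for an IPv4-shaped string without validating the octet ranges.

```python
import re


def extract_ip(page_source):
    """Return the first IPv4-looking string in page_source, or None."""
    match = re.search(r'\b(?:\d{1,3}\.){3}\d{1,3}\b', page_source)
    return match.group(0) if match else None


def is_using_proxy(page_source, proxy_ip):
    """True if the IP reported by the page equals the expected proxy IP."""
    return extract_ip(page_source) == proxy_ip


if __name__ == "__main__":
    content = ('<html xmlns="http://www.w3.org/1999/xhtml"><head></head>'
               '<body>140.143.62.84</body></html>')
    print(is_using_proxy(content, "140.143.62.84"))
```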

 

Reposted from: https://blog.csdn.net/ithomer/article/details/81091337
