Converting a bs4.element.ResultSet to a bs4.BeautifulSoup object
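First, let's check the type of the value we get back from the requests library, so that we know what kind of argument the BeautifulSoup constructor expects:
requests return type: <class 'str'>
The value we pass in (the response's .text attribute) is a str. So we can first convert the bs4.element.ResultSet to a str, and then pass that str to the BeautifulSoup constructor to get a BeautifulSoup object again. After that, methods such as find_all() and tag navigation work as usual. Note that str() on a ResultSet produces a list-style string (square brackets, with commas between multiple matches), which the parser treats as stray text around the tags. The code is as follows:
from bs4 import BeautifulSoup

data = getHtmlText(url=url)  # the return value is actually the response's .text
print('requests return type:', type(data))
soup = BeautifulSoup(data, "html.parser")
print('BeautifulSoup type:', type(soup))
page = soup.find_all('div', class_='more-page')  # bs4.element.ResultSet
data2 = str(page)  # ResultSet -> str
soup2 = BeautifulSoup(data2, "html.parser")  # str -> BeautifulSoup
page_count = soup2.script.string  # text of the first <script> tag
# print(page_count)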
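As a point of comparison, the round trip through str() is not the only option: each item in a ResultSet is a Tag, and a Tag supports find() and find_all() on its own. The following is a minimal sketch on a made-up HTML snippet (the markup and the `pageCount` variable are assumptions for illustration, not the real page), showing that both routes reach the same script text:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page
html = """
<div class="more-page"><script>var pageCount = 12;</script></div>
<div class="other">ignored</div>
"""

soup = BeautifulSoup(html, "html.parser")
pages = soup.find_all("div", class_="more-page")  # bs4.element.ResultSet

# Approach from the article: stringify the ResultSet, then re-parse it
soup2 = BeautifulSoup(str(pages), "html.parser")
via_reparse = soup2.script.string

# Alternative: index into the ResultSet and search the Tag directly
first_div = pages[0]
via_tag = first_div.find("script").string

print(type(pages).__name__, via_reparse, via_tag)
```

Searching the Tag directly avoids a second parse and the stray list punctuation that str() adds, but it does require handling the case where the ResultSet is empty before indexing into it.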
The getHtmlText function is defined as follows:
import requests

def getHtmlText(url):
    headers = {
        'Accept': '*/*',
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'zh-CN,zh;q=0.9',
        'Connection': 'keep-alive',
        'Cookie': 'widget_dz_id=54511; widget_dz_cityValues=,; timeerror=1; defaultCityID=54511; defaultCityName=%u5317%u4EAC; Hm_lvt_a3f2879f6b3620a363bec646b7a8bcdd=1516245199; Hm_lpvt_a3f2879f6b3620a363bec646b7a8bcdd=1516245199; addFavorite=clicked',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3236.0 Safari/537.36'
    }
    try:
        r = requests.get(url, timeout=30, headers=headers)
        r.raise_for_status()  # raises HTTPError for 4xx/5xx responses
        r.encoding = r.apparent_encoding  # guess the encoding from the body
        return r.text  # the page content as a str
    except requests.RequestException:
        # Covers connection errors, timeouts, and HTTPError
        return "An exception occurred"
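One thing to keep in mind: because getHtmlText returns an ordinary string on failure, BeautifulSoup will parse that string without complaint, and the failure only surfaces later when a tag lookup returns None. The sketch below shows one possible guard; `extract_page_script` is a hypothetical helper name, and the sample inputs are made up for illustration.

```python
from bs4 import BeautifulSoup

def extract_page_script(html_text):
    """Return the text of the <script> inside div.more-page, or None if absent."""
    soup = BeautifulSoup(html_text, "html.parser")
    div = soup.find("div", class_="more-page")
    # find() returns None when nothing matches, so check before navigating
    if div is None or div.script is None:
        return None
    return div.script.string

# A stand-in for the error string returned when the request fails
error_result = extract_page_script("An exception occurred")

# A stand-in for a successful response body
ok_result = extract_page_script(
    '<div class="more-page"><script>var n = 3;</script></div>'
)

print(error_result, ok_result)
```

Without the None checks, the error case would raise AttributeError at the `.script` or `.string` access rather than reporting that the request itself failed.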