耿直的小爬虫 / Python爬虫 (Python crawler)

模拟登陆(保存cookies) (Simulated login, saving cookies), 975 Bytes
import urllib.request as r
import urllib.parse as p
import http.cookiejar as c
url='http://bbs.chinaunix.net/member.php?mod=logging&action=login&loginsubmit=yes&loginhash=LxogS'
postdata=p.urlencode({
    # Fill in your own username and password; if you have no account on this site, register one first
    'username':' ',
    'password':' '
}).encode('utf-8')
req=r.Request(url,postdata)
req.add_header('User-Agent','Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0')
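# Keep the cookies from the login response in an in-memory CookieJar
cjar=c.CookieJar()
opener=r.build_opener(r.HTTPCookieProcessor(cjar))
# Install the opener globally so later r.urlopen() calls send the same cookies
r.install_opener(opener)
file=opener.open(req)
data=file.read()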
# Use a file handle name that does not shadow the urllib.parse alias p
with open('1.html','wb') as f:
    print('Type of data:',type(data))
    f.write(data)
print('Finished fetching 1.html')
# Fetch the forum home page; the installed opener sends the login cookies along
url2='http://bbs.chinaunix.net/'
data2=r.urlopen(url2).read()
with open('2.html','wb') as b:
    b.write(data2)
print('Finished fetching 2.html')
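
The file name mentions saving cookies, but the script keeps them only in an in-memory `CookieJar`, so they are gone once the process exits. Below is a minimal sketch of one way to persist them, assuming the standard library's `MozillaCookieJar` is acceptable. The helper names `make_cookie_opener` and `demo_cookie` and the `cookies.txt` file are hypothetical and not part of the original script; the cookie is built by hand only so that the save/load round trip can run without network access.

```python
import http.cookiejar
import os
import tempfile
import urllib.request


def make_cookie_opener(cookie_file):
    """Return (opener, jar); the jar is pre-loaded from cookie_file if it exists.

    Hypothetical helper, not part of the original script.
    """
    jar = http.cookiejar.MozillaCookieJar(cookie_file)
    if os.path.exists(cookie_file):
        # ignore_discard=True also restores session cookies (no expiry date).
        jar.load(ignore_discard=True, ignore_expires=True)
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def demo_cookie(name, value, domain):
    """Build a host-only session cookie by hand (illustration only)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "cookies.txt")
    opener, jar = make_cookie_opener(path)
    # In the real script the jar would be filled by opener.open(req) after login.
    jar.set_cookie(demo_cookie("sid", "abc123", "bbs.chinaunix.net"))
    jar.save(ignore_discard=True, ignore_expires=True)

    # A later run can rebuild the opener and reuse the stored cookies.
    _, restored = make_cookie_opener(path)
    print([cookie.name for cookie in restored])
```

In the login script, the opener returned by `make_cookie_opener` would stand in for the one built from `c.CookieJar()`, with `jar.save(...)` called after the login request. Session cookies are dropped on save unless `ignore_discard=True` is passed.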