leon/AutoHomeSpider

AutohomeList.py 1.45 KB
leon committed on 2021-08-13 07:10 . Initial commit
import requests
from lxml import etree


# First collect the links of the top-level car categories, then save the model IDs.
class GetList(object):
    def __init__(self):
        self.headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36"
        }
        self.base_url = "https://car.autohome.com.cn"
        self.url = "https://car.autohome.com.cn/diandongche/index.html"

    # Get the car category links from the sidebar of the main page.
    def get_urls(self):
        response = requests.get(self.url, headers=self.headers).text
        res = etree.HTML(response)
        # XPath that selects the hrefs in the car list tree
        carlist_href = res.xpath("//div[@id='cartree']/ul/li/h3/a/@href")
        # The hrefs are relative, so prepend the site root.
        return [self.base_url + ch_url for ch_url in carlist_href]

    # Get the model links within each category and save their IDs to a txt file.
    def get_attribute(self):
        carlist = self.get_urls()
        # The ./data directory must already exist; "with" closes the file even on error.
        with open(r"./data/CarHref.txt", "w") as file:
            for url in carlist:
                response = requests.get(url, headers=self.headers).text
                res = etree.HTML(response)
                chlist_href = res.xpath("//*[contains(@id,'series_')]/@href")
                for ele in chlist_href:
                    print(" Ev Car ID url: " + ele)
                    # Keep the text between the first '-' and the last 5 characters (".html").
                    file.write(ele[ele.find('-') + 1:len(ele) - 5])
                    file.write("\n")
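
The slice used in get_attribute can be hard to read in isolation, so here is a minimal sketch of it as a standalone helper. The function name extract_series_id and the sample href are hypothetical and do not come from the repository. The sketch assumes, as the original slice does, that each href contains a '-' before the ID and ends in a five-character suffix such as ".html".

```python
def extract_series_id(href):
    """Return the text between the first '-' and the last 5 characters.

    Mirrors the slice ele[ele.find('-')+1:len(ele)-5] from get_attribute.
    """
    start = href.find("-") + 1
    return href[start:len(href) - 5]


# Hypothetical href, used only to illustrate the slicing.
sample_href = "/price/series-1234.html"
print(extract_series_id(sample_href))  # prints 1234
```

If an href has no '-', find returns -1 and the slice starts at index 0, so the helper returns everything except the last five characters. A caller that needs strict validation would have to check for that case separately.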