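Python Novel Crawler Code (Working)
Working Python crawler code for downloading a novel, using 笔趣阁 (Biquge) as the example site. Requires Python 3.6 or later. Author's QQ: 342290433, 汉唐自远 (engineer).
import requests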
import re
from lxml import etree

# First chapter to download. The site address is truncated in the source;
# requests needs a complete URL that includes the scheme (http:// or https://).
url = "www.biquga/33_33132/16700250.html"


def get_content(url):
    """Append one chapter to the output file, then follow the next-chapter link."""
    nodes = ''
    # Download the page and decode it with the site's encoding (GBK)
    html_doc = requests.get(url).content.decode('gbk')
    tree = etree.HTML(html_doc)
    # Address of the next chapter
    url = tree.xpath('//*[@id="wrapper"]/div[4]/div/div[4]/a[4]//@href')[0]
    url = 'www.biquga/' + url
    # Chapter title
    node_title = tree.xpath('//*[@id="wrapper"]/div[4]/div/div[2]/h1//text()')[0]
    # Novel text
    node_content = tree.xpath('//*[@id="content"]//text()')
    nodes += node_title
    nodes += '\n\n'
    for node in node_content:
        node = node.strip('\r')
        nodes += node
        nodes += '\n\n'
    print(node_title)
    filename = './全职妙手.txt'
    with open(filename, 'a+', encoding='utf-8') as f:
        f.write(nodes)
    # Continue only while the next link still points to a chapter page
    if re.search(r'\.html', url) is not None:
        get_content(url)


get_content(url)
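
Because get_content calls itself once per chapter, a novel with more chapters than Python's recursion limit (1000 frames by default) would end in a RecursionError. Below is a minimal sketch of the same traversal written as a loop. The names `crawl`, `fetch_chapter`, `max_chapters`, and `FAKE_SITE` are hypothetical and not part of the original script; `fetch_chapter` stands for any function that returns a chapter's title, text, and next address, and the fake site replaces the network so the sketch runs on its own.

```python
from typing import Callable, Iterator, Optional, Tuple

# (title, text, next_url); next_url is None when there is no further chapter.
Chapter = Tuple[str, str, Optional[str]]


def crawl(start_url: str,
          fetch_chapter: Callable[[str], Chapter],
          max_chapters: int = 5000) -> Iterator[Tuple[str, str]]:
    """Yield (title, text) for each chapter, following next links in a loop."""
    url: Optional[str] = start_url
    seen = set()
    # Stop on a missing link, a repeated address, or the chapter cap
    while url is not None and url not in seen and len(seen) < max_chapters:
        seen.add(url)
        title, text, next_url = fetch_chapter(url)
        yield title, text
        # Only follow links that still point to a chapter page
        url = next_url if next_url and next_url.endswith('.html') else None


# Stand-in for the network: three fake chapters keyed by address (assumption).
FAKE_SITE = {
    '1.html': ('Chapter 1', 'text one', '2.html'),
    '2.html': ('Chapter 2', 'text two', '3.html'),
    '3.html': ('Chapter 3', 'text three', './'),
}

titles = [title for title, _ in crawl('1.html', FAKE_SITE.__getitem__)]
print(titles)
```

In the real script, `fetch_chapter` would wrap the requests.get and XPath steps from get_content, and the caller would append each yielded chapter to the output file.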

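
The script above builds the next chapter's address by prepending a fixed prefix to the extracted href, which works only when the href has exactly the form the author expected. Below is a minimal sketch that uses the standard library's `urllib.parse.urljoin` to resolve the href against the current page instead. The `example.com` addresses and the helper name `next_chapter_url` are placeholders, not the real site or part of the original code.

```python
from urllib.parse import urljoin


def next_chapter_url(page_url: str, href: str) -> str:
    """Resolve the href of the next-chapter link against the current page."""
    return urljoin(page_url, href)


# Placeholder address (assumption); the real site's full URL is not given.
page = 'https://example.com/33_33132/16700250.html'

# A site-absolute href replaces the whole path of the current page
absolute = next_chapter_url(page, '/33_33132/16700251.html')
# A relative href is resolved against the current page's directory
relative = next_chapter_url(page, '16700251.html')

print(absolute)
print(relative)
```

Both calls produce the same address, so the crawler does not need to know in advance which form the site uses for its links.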