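Python web scraping: getting the content inside a page's script tag
The page keeps its data inside a <script> tag, and the value I want to get is the number 00914. A regular expression is enough for this.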
Result:
Source code:
import re
from urllib.request import urlopen

from bs4 import BeautifulSoup

url = "URL of the page you want to parse"
html = urlopen(url).read()
soup = BeautifulSoup(html, "html.parser")
scripts = soup.select("body script")  # CSS selector: every <script> inside <body>

# The script I want is the third <script> inside <body>.
script_text = scripts[2].get_text()  # tag body text (not the tag attributes)
print(script_text)

# Re-parse the text as a standalone <script> and locate it with the regex.
soup = BeautifulSoup("<script>" + script_text + "</script>", "html.parser")
pattern = re.compile(r"var _url = '(.*?)';$", re.MULTILINE | re.DOTALL)
script = soup.find("script", string=pattern)  # string= replaces the deprecated text= argument
s = pattern.search(script.text).string  # .string is the full text that was searched
print(s.split("'")[11])  # the 12th piece when splitting on single quotes
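
The same idea can be tried offline with a small self-contained sketch. Everything in it is an assumption made for illustration: the sample markup, the example.com URL, and the helper names ScriptCollector, extract_url and extract_number are hypothetical, since the real page is not shown here. To keep it runnable without installing anything, it swaps BeautifulSoup for the standard library's html.parser, and it reads the value from the regex capture group instead of splitting on quotes.

```python
import re
from html.parser import HTMLParser

# Hypothetical markup for illustration only; the real page is not shown in the post.
SAMPLE_HTML = """
<html><body>
<script>var a = 1;</script>
<script>var b = 2;</script>
<script>
var _name = 'demo';
var _url = 'https://example.com/item/00914.html';
</script>
</body></html>
"""

# Same pattern as in the post, minus re.DOTALL, which is not needed here.
URL_PATTERN = re.compile(r"var _url = '(.*?)';$", re.MULTILINE)


class ScriptCollector(HTMLParser):
    """Collect the text body of every <script> element."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Only keep text that sits inside a <script> element.
        if self._in_script:
            self.scripts[-1] += data


def extract_url(html):
    """Return the value assigned to _url in the first script that has one."""
    collector = ScriptCollector()
    collector.feed(html)
    for body in collector.scripts:
        match = URL_PATTERN.search(body)
        if match:
            return match.group(1)
    return None


def extract_number(html):
    """Return the digits before '.html' in the _url value, or None."""
    url = extract_url(html)
    if url is None:
        return None
    digits = re.search(r"(\d+)\.html$", url)
    return digits.group(1) if digits else None


print(extract_number(SAMPLE_HTML))  # prints 00914
```

Using the capture group avoids depending on how many quoted strings come before the target, which is what the split index 11 relies on. Returning None when nothing matches also avoids an AttributeError on pages where the script is missing or has a different layout.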
