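Java crawlers vs. Python crawlers: a comparison
A comparison of Java crawlers and Python crawlers:
Writing a crawler in Python takes simpler syntax and less code. Java's syntax is stricter than Python's, and the resulting code is more verbose.
Examples follow:
URL request: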
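The Java version:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public String call(String url) {
    String content = "";
    BufferedReader in = null;
    try {
        URL realUrl = new URL(url);
        URLConnection connection = realUrl.openConnection();
        // Read the response body, decoding it as GBK
        in = new BufferedReader(new InputStreamReader(connection.getInputStream(), "gbk"));
        String line;
        while ((line = in.readLine()) != null) {
            content += line + "\n";
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // Always close the reader, even if the request failed
        try {
            if (in != null) {
                in.close();
            }
        } catch (Exception e2) {
            e2.printStackTrace();
        }
    }
    return content;
}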
The Python version:
# coding=utf-8
import chardet
import urllib.request

url = "http://www.baidu.com"
data = urllib.request.urlopen(url).read()
# Guess the encoding from the raw bytes
charset = chardet.detect(data)
code = charset['encoding'] or 'utf-8'
# Decode to text, dropping bytes that do not fit the detected encoding
content = data.decode(code, 'ignore')
print(content)
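The Java version hard-codes "gbk", while the Python version guesses the encoding with chardet. Where chardet is not installed, one possible alternative is to try a short list of encodings in order. The following is a minimal sketch for Python 3 using only the standard library; the names `decode_with_fallback` and `fetch`, the candidate list, and the timeout are assumptions made for illustration, not part of the original code.

```python
import urllib.request


def decode_with_fallback(data, candidates=("utf-8", "gbk")):
    """Decode raw bytes with the first candidate encoding that fits."""
    for name in candidates:
        try:
            return data.decode(name)
        except UnicodeDecodeError:
            continue
    # Last resort: decode with the first candidate and replace bad bytes.
    return data.decode(candidates[0], errors="replace")


def fetch(url, timeout=10):
    """Download a page and return its text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return decode_with_fallback(response.read())
```

Trying UTF-8 first is a deliberate choice in this sketch: GBK accepts many byte sequences that are not really GBK text, so the stricter encoding goes first.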
Regular expressions:
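The Java version:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public String call(String content) throws Exception {
    // Match each content":"..." field, stopping at the next quote
    Pattern p = Pattern.compile("content\":\".*?\"");
    Matcher match = p.matcher(content);
    StringBuilder sb = new StringBuilder();
    String tmp;
    while (match.find()) {
        tmp = match.group();
        // Drop the quotes, then the leading field name
        tmp = tmp.replaceAll("\"", "");
        tmp = tmp.replace("content:", "");
        // Remove single-character tags such as <b>
        tmp = tmp.replaceAll("<.>", "");
        sb.append(tmp + "\n");
    }
    String comment = sb.toString();
    return comment;
}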
The Python version:
import re
pattern = re.compile(regex)  # regex: the pattern string
group = pattern.findall(text)  # text: the string to search
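To make the comparison concrete, here is a rough Python 3 sketch of what the Java method above does, under a few assumptions: the function name `extract_comments` and the sample input are hypothetical, and the tag-stripping pattern `<[^>]+>` reflects a guess that the intent is to remove HTML tags in general, which is broader than the `<.>` pattern in the Java code.

```python
import re

# Capture the text between the quotes of each content":"..." field.
FIELD = re.compile(r'content":"(.*?)"')
# Assumed intent: strip any HTML tag, not only single-character ones.
TAG = re.compile(r'<[^>]+>')


def extract_comments(text):
    """Return the content values found in text, one per line, without tags."""
    values = FIELD.findall(text)
    return "\n".join(TAG.sub("", value) for value in values)


sample = '{"content":"good <b>phone</b>"},{"content":"fast"}'
print(extract_comments(sample))
```

Using a capture group means the quotes and the `content:` prefix never need to be removed afterwards, which is where most of the extra Java lines go.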