...XPath Element Location Summary, Python Operations on JSON and ... -- 688IT编程网

Regular Expression re.findall Usage, XPath Element Location Summary, Python Operations on JSON and ...

Contents

1. Basic usage of re.findall

2. XPath location summary

3. Working with JSON and CSV in Python

1. Basic usage of re.findall

re.findall returns every non-overlapping substring of string that matches pattern, as a list.

Syntax:

findall(pattern, string, flags=0)
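The third parameter in the signature, flags, is not exercised anywhere in the examples that follow. As a minimal sketch of what it changes, assuming a made-up sample string, the same pattern can be run with and without re.IGNORECASE:

```python
import re

# Made-up sample text, not taken from the article.
text = "Docs, docs and DOCS"

# Default flags=0: matching is case-sensitive.
case_sensitive = re.findall(r"docs", text)

# flags=re.IGNORECASE: the same pattern now matches any casing.
case_insensitive = re.findall(r"docs", text, flags=re.IGNORECASE)

print(case_sensitive)    # ['docs']
print(case_insensitive)  # ['Docs', 'docs', 'DOCS']
```

Flags can be combined with the | operator when more than one is needed.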

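import re

The re.findall method returns the matching substrings as a list.

# print(help(re.findall))
# print(dir(re.findall))

findall searches the whole string for every match. The r prefix marks a raw string literal, so backslashes in the pattern reach the regex engine unchanged.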

regular_v1 = re.findall(r"docs", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v1)
# ['docs']

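The ^ anchor matches only at the start of the string, so this returns https only because the string begins with it.

regular_v2 = re.findall(r"^https", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v2)
# ['https']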

The $ anchor matches only at the end of the string, so this checks whether the string ends with html.

regular_v3 = re.findall(r"html$", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v3)
# ['html']

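[...] matches any one of the characters inside the brackets. The characters need no separator: a comma inside the brackets would itself be matched literally, so [tw] is enough here.

regular_v4 = re.findall(r"[tw]h", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v4)
# ['th', 'wh']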

\d is the regex shorthand for a single digit from 0 to 9; the matches come back as a list.

regular_v5 = re.findall(r"\d", "https://docs.python.org/3/whatsnew/3.6.html")
regular_v6 = re.findall(r"\d\d\d", "https://docs.python.org/3/whatsnew/3.6.html/1234")
print(regular_v5)
# ['3', '3', '6']
print(regular_v6)
# ['123']

Lowercase \d matches the digits 0-9; uppercase \D is its opposite and matches every character that is not a digit.

regular_v7 = re.findall(r"\D", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v7)
# ['h', 't', 't', 'p', 's', ':', '/', '/', 'd', 'o', 'c', 's', '.', 'p', 'y', 't', 'h', 'o', 'n', '.', 'o', 'r', 'g', '/', '/', 'w', 'h', 'a', 't', 's', 'n', 'e', 'w', '/', '.', '.', 'h', 't', 'm', 'l']

\w matches a word character: lowercase a to z, uppercase A to Z, the digits 0 to 9, and the underscore.

regular_v8 = re.findall(r"\w", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v8)
# ['h', 't', 't', 'p', 's', 'd', 'o', 'c', 's', 'p', 'y', 't', 'h', 'o', 'n', 'o', 'r', 'g', '3', 'w', 'h', 'a', 't', 's', 'n', 'e', 'w', '3', '6', 'h', 't', 'm', 'l']

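\W is the opposite of \w: it matches the special characters that are neither letters, digits, nor underscores.

regular_v9 = re.findall(r"\W", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v9)
# [':', '/', '/', '.', '.', '/', '/', '/', '.', '.']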
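Every example above returns one character per match, because \d and \w each consume a single character. The + quantifier is not covered in the original text; as a small sketch of the usual next step, it groups consecutive characters into one result:

```python
import re

url = "https://docs.python.org/3/whatsnew/3.6.html/1234"

# \d+ matches one or more consecutive digits, so each run is one result.
digit_runs = re.findall(r"\d+", url)

# \w+ does the same for word characters, splitting the URL into tokens.
word_runs = re.findall(r"\w+", url)

print(digit_runs)  # ['3', '3', '6', '1234']
print(word_runs)
# ['https', 'docs', 'python', 'org', '3', 'whatsnew', '3', '6', 'html', '1234']
```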

2. XPath location summary

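I. Absolute path (avoid it unless every other approach has failed to locate the element)

Method: write the path out level by level, following the actual document hierarchy.

Example: find_element_by_xpath("/html/body/div[2]/form/span/input")  # div[2] is the second div child of body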

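II. Relative path (recommended)

Method: first check whether the target element has a "precise" attribute, that is, one that identifies it uniquely. If it does, locate the element by that attribute.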

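1. Locating by a unique attribute of the element itself

Method: find the attribute that uniquely identifies the target element and locate the element by it.

1.1 Locating by the id attribute

Example: find_element_by_xpath("//input[@id='input']")  # @ is followed by an attribute name; any attribute can be used

1.2 Locating by the name attribute

Example: find_element_by_xpath("//div[@name='q']")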

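2. Locating by a unique attribute of a parent element

Method: when the target element has no unique attribute, find a nearby ancestor that does have one, use it as the starting point, and write the path level by level from there down to the target.

Examples:

find_element_by_xpath("//span[@id='input-container']/input")

find_element_by_xpath("//div[@id='hd']/form/span/input")

find_element_by_xpath("//div[@name='q']/form/span/input")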
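The relative paths above can be tried without a browser. The sketch below is an illustration only: it uses the standard library's xml.etree.ElementTree, which supports a limited XPath subset, on a hypothetical page fragment whose id values mirror the examples. It is not Selenium, and the fragment is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment; it must be well-formed XML for ElementTree.
page = ET.fromstring("""
<html><body>
  <div id="hd">
    <form>
      <span id="input-container"><input id="input" name="kw"/></span>
    </form>
  </div>
</body></html>
""")

# ElementTree paths are relative to the root element, hence the leading dot.
by_id = page.find(".//input[@id='input']")
via_span = page.find(".//span[@id='input-container']/input")
via_div = page.find(".//div[@id='hd']/form/span/input")

# All three expressions reach the same element.
print(by_id is via_span is via_div)  # True
```

The shorter the path from the unique anchor to the target, the less likely a layout change is to break it.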

3. Boolean logic in XPath

Example: find_element_by_xpath("//div[@id='hd' or @name='q']")

4. Filtering on two conditions at once

Example: find_element_by_xpath("//div[@id='hd'][@name='q']")
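To see how the two forms differ, here is a rough sketch on a hypothetical fragment, again using xml.etree.ElementTree rather than Selenium. ElementTree understands chained predicates but not the or operator, so the or case is emulated with a plain Python filter; a full XPath 1.0 engine such as the browser behind Selenium evaluates or directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with three div elements that differ in id/name.
page = ET.fromstring("""
<body>
  <div id="hd" name="q">both</div>
  <div id="hd" name="other">id only</div>
  <div id="ft" name="q">name only</div>
</body>
""")

# Two predicates in a row: an element must satisfy both to match.
both = page.findall(".//div[@id='hd'][@name='q']")

# ElementTree has no `or`; a Python-side filter emulates it here.
either = [d for d in page.iter("div")
          if d.get("id") == "hd" or d.get("name") == "q"]

print([d.text for d in both])    # ['both']
print([d.text for d in either])  # ['both', 'id only', 'name only']
```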

5. Target elements in a hierarchy

Example 1: find_element_by_xpath("//ul[@class='app-list']/li[contains(@class,'safe')]/div")

Example 2: locate the parent levels first, then the target element (dl, then dt, then the input)

find_element_by_xpath("//form[@id='J_login_form']/dl/dt/input[@id='J_password']")

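6. Fuzzy matching

6.1 contains() (substring match)

find_element_by_xpath("//a[contains(@name,'trnews')]")

6.2 starts-with() (prefix match)

find_element_by_xpath("//a[starts-with(@href,'http')]")

6.3 text()

find_element_by_xpath("//a[contains(text(),'新闻')]")  # link elements whose text contains 新闻 ("news")

find_element_by_xpath("//*[text()='新闻']")  # any element whose text is exactly 新闻

Note: when an attribute value contains spaces, avoid matching the full value; match a space-free part of it with contains() or a similar function instead.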
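The fuzzy expressions are easy to mistype, so one option is to build them with small helpers. The functions below are hypothetical names introduced for illustration, not part of Selenium, and they assume the value contains no single quote.

```python
def attr_contains(tag, attr, fragment):
    """XPath for <tag> elements whose attribute contains `fragment`."""
    return f"//{tag}[contains(@{attr},'{fragment}')]"


def attr_starts_with(tag, attr, prefix):
    """XPath for <tag> elements whose attribute starts with `prefix`."""
    return f"//{tag}[starts-with(@{attr},'{prefix}')]"


def text_contains(tag, fragment):
    """XPath for <tag> elements whose text contains `fragment`."""
    return f"//{tag}[contains(text(),'{fragment}')]"


# An element such as <li class="app item safe"> has spaces in its class
# value; matching one space-free token avoids spelling out the whole value.
safe_item = attr_contains("li", "class", "safe")
http_link = attr_starts_with("a", "href", "http")
news_link = text_contains("a", "新闻")

print(safe_item)  # //li[contains(@class,'safe')]
print(http_link)  # //a[starts-with(@href,'http')]
print(news_link)  # //a[contains(text(),'新闻')]
```

With Selenium 4 the resulting string would be passed as driver.find_element(By.XPATH, safe_item), where By comes from selenium.webdriver.common.by. The find_element_by_xpath shortcut used throughout this section was removed in Selenium 4.3.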

3. Working with JSON and CSV in Python

JSON

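JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for people to read and write and easy for machines to parse and generate, which also makes it efficient to send over a network.

It is based on a subset of ECMAScript (the JavaScript language specification published by Ecma International) and uses a text format that is completely independent of any programming language to store and represent data. Its concise, clear hierarchical structure makes JSON an ideal data-interchange language.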

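JSON supports the following data formats:

Objects (dicts), written with curly braces.

Arrays (lists), written with square brackets.

Integers, floating-point numbers, booleans, and null.

Strings (which must use double quotes, not single quotes).

Multiple values are separated by commas. Note: a JSON document is essentially a string.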

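JSON functions

To use the JSON functions, import the json library: import json.

Function      Description
json.dumps    Encodes a Python object as a JSON string
json.loads    Decodes a JSON string into a Python object

In addition, json.dump() and json.load() are the counterparts for writing JSON to and reading JSON from files.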
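As a quick illustration of the two string functions in the table, the sketch below round-trips a made-up record (the field names are assumptions) and shows how Python values map onto the JSON types listed earlier:

```python
import json

# Made-up record covering string, number, boolean and null values.
record = {"title": "Python basics", "price": 79.0, "in_stock": True, "tags": None}

encoded = json.dumps(record)   # Python dict -> JSON string
decoded = json.loads(encoded)  # JSON string -> Python dict

print(encoded)
# {"title": "Python basics", "price": 79.0, "in_stock": true, "tags": null}
print(decoded == record)  # True
```

Note that True and None come out as the JSON literals true and null, and are restored on the way back.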

Converting dicts and lists to JSON

import json

books = [
    {
        'title': 'Python基础',
        'price': '79.00'
    },
    {
        'title': 'Scrapy网络爬虫',
        'price': '56.00'
    }
]

json_str = json.dumps(books)
print('type:', type(json_str))
print('json_str:', json_str)

# Output:
type: <class 'str'>
json_str: [{"title": "Python\u57fa\u7840", "price": "79.00"}, {"title": "Scrapy\u7f51\u7edc\u722c\u866b", "price": "56.00"}]

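Note: by default json.dumps runs with ensure_ascii=True, which escapes every non-ASCII character, so the Chinese text comes out as \uXXXX sequences. Passing ensure_ascii=False turns this behaviour off. After the change:

json_str = json.dumps(books, ensure_ascii=False)

# Output:
[{"title": "Python基础", "price": "79.00"}, {"title": "Scrapy网络爬虫", "price": "56.00"}]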

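Note: in Python, only the basic data types can be converted to a JSON string, namely int, float, str, bool, None, list, dict, and tuple (a tuple is written as a JSON array).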
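What happens with a type outside that list is worth seeing once. The sketch below uses a made-up order record containing a date; json.dumps raises TypeError for it, and the default parameter offers one way around that. Converting with str is only one possible choice, picked here for brevity.

```python
import json
from datetime import date

# Made-up record; the date value is not one of the basic types.
order = {"title": "Python basics", "ordered_on": date(2020, 1, 31)}

error = None
try:
    json.dumps(order)
except TypeError as exc:
    error = str(exc)  # dumps refuses objects it cannot serialize

# default= is called for every object json cannot serialize on its own.
encoded = json.dumps(order, default=str)

print(error)
print(encoded)  # {"title": "Python basics", "ordered_on": "2020-01-31"}
```

The value comes back from json.loads as a plain string, so the reader of the file has to know it represents a date.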

Writing JSON data directly to a file

The usual approach:

import json

books = [
    {
        'title': 'Python基础',
        'price': '79.00'
    },
    {
        'title': 'Scrapy网络爬虫',
        'price': '56.00'
    }
]

json_str = json.dumps(books, ensure_ascii=False)

with open('books.json', 'w') as fp:
    fp.write(json_str)

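Opening books.json may show garbled text, because open() falls back to the platform's default encoding when none is given, and that is not always UTF-8:

[{"title": "Python��", "price": "79.00"}, {"title": "Scrapy��", "price": "56.00"}]

Specifying the file encoding explicitly fixes this:

with open('books.json', 'w', encoding='utf8') as fp:
    fp.write(json_str)

Opening books.json again shows that everything is now correct:

[{"title": "Python基础", "price": "79.00"}, {"title": "Scrapy网络爬虫", "price": "56.00"}]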

Besides dumps, the json module also has a dump function. It takes a file object and writes the JSON text for the object straight to that file.

import json

books = [
    {
        'title': 'Python基础',
        'price': '79.00'
    },
    {
        'title': 'Scrapy网络爬虫',
        'price': '56.00'
    }
]

with open('books.json', 'w', encoding='utf8') as fp:
    json.dump(books, fp)

# Output:
[{"title": "Python\u57fa\u7840", "price": "79.00"}, {"title": "Scrapy\u7f51\u7edc\u722c\u866b", "price": "56.00"}]

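Disabling the escaping of Chinese characters works the same way as with dumps: pass ensure_ascii=False to json.dump.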
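The source text breaks off at this point, so the following is only a sketch of what that call could look like, not the author's own listing. It writes to a temporary directory (an assumption made so the example cleans up after itself) and reads the file back with json.load to confirm the round trip.

```python
import json
import os
import tempfile

books = [
    {"title": "Python基础", "price": "79.00"},
    {"title": "Scrapy网络爬虫", "price": "56.00"},
]

# A temporary directory stands in for the real location of books.json.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "books.json")

    # ensure_ascii=False keeps the Chinese text readable in the file.
    with open(path, "w", encoding="utf8") as fp:
        json.dump(books, fp, ensure_ascii=False)

    with open(path, encoding="utf8") as fp:
        raw_text = fp.read()

    # json.load parses the file back into Python objects.
    with open(path, encoding="utf8") as fp:
        loaded = json.load(fp)

print(raw_text)
print(loaded == books)  # True
```

The same encoding has to be given when reading the file as when writing it; otherwise the garbling described earlier can reappear on load.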
