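Python Regular Expressions Explained (Very Detailed: Read It and You'll Get It!)
Regular Expressions Explained
A regular expression (RegEx for short) is a tool for matching patterns of characters. It is commonly used in web crawling, document cleanup, and data filtering; the most common use is web crawling and data scraping.
I. What the regular expression symbols mean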
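The full list of symbols is long, so the commonly used ones are collected here.
II. Going through them one by one
1. First, import the module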
import re
2. Matching several alternatives with []
# matches 'run' or 'ran'
res = re.search(r'r[au]n','dog runs to cat')
print(res)
>>><re.Match object; span=(4,7), match='run'>
res = re.search(r'r[au]n','dog rans to cat')
print(res)
>>><re.Match object; span=(4,7), match='ran'>
# continued: matching more alternatives with character ranges
res = re.search(r'r[A-Z]n','dog rans to cat')
print(res)
>>>None
res = re.search(r'r[a-z]n','dog rans to cat')
print(res)
>>><re.Match object; span=(4,7), match='ran'>
res = re.search(r'r[0-9]n','dog rans to cat')
print(res)
>>>None
res = re.search(r'r[0-9a-z]n','dog rans to cat')
print(res)
>>><re.Match object; span=(4,7), match='ran'>
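Because `re.search` returns `None` when nothing matches, calling `.group()` directly on the result can fail. The bracket examples above can be wrapped in a small helper for experimenting. This is only a sketch: `first_match` is a hypothetical helper name, not part of the `re` module, and the last test string is made up for illustration.

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group() if m else None

print(first_match(r'r[au]n', 'dog runs to cat'))      # run
print(first_match(r'r[au]n', 'dog rans to cat'))      # ran
print(first_match(r'r[A-Z]n', 'dog rans to cat'))     # None
print(first_match(r'r[0-9a-z]n', 'dog r9ns to cat'))  # r9n
```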
3. Matching digits: \d and \D
# \d : any decimal digit
res = re.search(r'r\dn','run r9n')
print(res)
>>><re.Match object; span=(4,7), match='r9n'>
# \D : any character that is not a decimal digit
res = re.search(r'r\Dn','run r9n')
print(res)
>>><re.Match object; span=(0,3), match='run'>
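For plain ASCII input like the strings used here, `\d` and `\D` behave like the bracket classes from the previous section. The sketch below compares them side by side; the negated class `[^0-9]` is standard `re` syntax that this article does not otherwise cover, so treat it as an aside.

```python
import re

text = 'run r9n'

# For ASCII input, \d and the explicit class [0-9] find the same match.
m_short = re.search(r'r\dn', text)
m_class = re.search(r'r[0-9]n', text)
print(m_short.group(), m_short.span())  # r9n (4, 7)
print(m_class.group(), m_class.span())  # r9n (4, 7)

# \D is the complement of \d; here it behaves like the negated class [^0-9].
print(re.search(r'r\Dn', text).group())      # run
print(re.search(r'r[^0-9]n', text).group())  # run
```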
4. Matching whitespace: \s and \S
# \s : any white space [\t \n \r \f \v]
res = re.search(r'r\sn','r\nn r9n')
print(res)
>>><re.Match object; span=(0,3), match='r\nn'>
# \S : the opposite of \s, any non-whitespace character
res = re.search(r'r\Sn','r\nn r9n')
print(res)
>>><re.Match object; span=(4,7), match='r9n'>
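To check which characters `\s` accepts, one option is to loop over the whitespace characters listed in the comment above, plus the ordinary space. A minimal sketch; the variable names are arbitrary.

```python
import re

# Each of these characters counts as whitespace for \s.
whitespace_chars = [' ', '\t', '\n', '\r', '\f', '\v']

for ch in whitespace_chars:
    text = 'r' + ch + 'n'
    matched = re.search(r'r\sn', text) is not None
    print(repr(ch), matched)  # True for every character in the list
```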
5. Matching letters, digits, and "_": \w and \W
# \w : word characters; equivalent to [a-zA-Z0-9_] for ASCII text
res = re.search(r'r\wn','r\nn r9n')
print(res)
>>><re.Match object; span=(4,7), match='r9n'>
# \W : the opposite of \w
res = re.search(r'r\Wn','r\nn r9n')
print(res)
>>><re.Match object; span=(0,3), match='r\nn'>
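One detail worth knowing: with `str` patterns in Python 3, `\w` also matches non-ASCII letters unless the `re.ASCII` flag is passed. The sketch below illustrates the difference; the test string is made up for this example.

```python
import re

text = 'r\u00e9n r9n'   # 'r\u00e9n' is "rén"; é is a non-ASCII letter

# Default (Unicode) matching for str patterns: é counts as a word character.
print(re.search(r'r\wn', text).group())            # rén

# With re.ASCII, \w is limited to [a-zA-Z0-9_], so é no longer matches.
print(re.search(r'r\wn', text, re.ASCII).group())  # r9n
```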
6. Matching word boundaries: \b and \B
# \b : empty string (only at the start or end of a word)
res = re.search(r'\bruns\b','dog runs to cat')
print(res)
>>><re.Match object; span=(4,8), match='runs'>
res = re.search(r'\bruns\b','dogrunsto cat')
print(res)
>>>None
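The `r` prefix matters a lot for `\b`. In an ordinary Python string literal, `'\b'` is the backspace character, so the regex engine never sees a word-boundary token. A short sketch of the difference:

```python
import re

text = 'dog runs to cat'

# Raw string: the regex engine receives \b and treats it as a word boundary.
print(re.search(r'\bruns\b', text))   # a Match object for 'runs'

# Plain string: Python turns '\b' into the backspace character (\x08) first,
# so the pattern looks for backspace + 'runs' + backspace and finds nothing.
print(re.search('\bruns\b', text))    # None

# Doubling the backslash in a plain string is equivalent to the raw form.
print(re.search('\\bruns\\b', text).group())  # runs
```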
# \B : empty string (but not at the start or end of a word)
res = re.search(r'\Bruns\B','dog runs to cat')
print(res)
>>>None
res = re.search(r'\Bruns\B','dogrunsto cat')
print(res)
>>><re.Match object; span=(3,7), match='runs'>
7. Matching special characters and any character: \\ and .
# \\ : matches a literal backslash
res = re.search(r'runs\\',r'dog runs\ to cat')
print(res)
>>><re.Match object; span=(4,9), match='runs\\'>
# . : matches anything (except \n)
res = re.search(r'r.ns','dog r;ns to cat')
print(res)
>>><re.Match object; span=(4,8), match='r;ns'>
res = re.search(r'r.ns','dog r\nns to cat')
print(res)
>>>None
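If `.` should match newlines as well, the `re` module provides the `re.DOTALL` flag (alias `re.S`). The article itself does not use this flag, so the following is just a sketch that reuses the string from the last example.

```python
import re

text = 'dog r\nns to cat'

# Default: . does not match a newline, so there is no match.
print(re.search(r'r.ns', text))  # None

# With re.DOTALL (alias re.S), . also matches '\n'.
m = re.search(r'r.ns', text, flags=re.DOTALL)
print(repr(m.group()), m.span())  # 'r\nns' (4, 8)
```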
8. Matching the start and end of a line: ^ and $
# ^ : matches the beginning of a line
res = re.search(r'^runs','dog runs to cat')
print(res)
>>>None
res = re.search(r'^dog','dog runs to cat')
print(res)
>>><re.Match object; span=(0,3), match='dog'>
# $ : matches the end of a line
res = re.search(r'runs$','dog runs to cat')
print(res)
>>>None
res = re.search(r'cat$','dog runs to cat')
print(res)
>>><re.Match object; span=(12,15), match='cat'>
9. Optional match: ?
# ? : the preceding item may or may not occur
res = re.search(r'r(u)?ns','dog runs to cat')
print(res)
>>><re.Match object; span=(4,8), match='runs'>
res = re.search(r'r(u)?ns','dog rns to cat')
print(res)
>>><re.Match object; span=(4,7), match='rns'>
10. Multiline matching: re.M
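By default, `^` and `$` only anchor at the start and end of the whole string. With the `re.M` flag (`re.MULTILINE`), they also anchor at the start and end of each line. A minimal sketch, assuming a made-up two-line test string:

```python
import re

text = 'dog runs to cat\nI run to dog'

# Without re.M, ^ only matches at the very start of the string.
print(re.search(r'^I', text))  # None

# With re.M (re.MULTILINE), ^ also matches right after each newline.
m = re.search(r'^I', text, flags=re.M)
print(m.group(), m.span())  # I (16, 17)

# Likewise, $ matches just before each newline under re.M.
print(re.search(r'cat$', text, flags=re.M).span())  # (12, 15)
```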