A Detailed Guide to Python's re Module (688IT编程网)

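I. Special characters in regular expressions

Regular expression syntax

^ Matches the start of the string; with re.MULTILINE, the start of each line

$ Matches the end of the string (or just before a trailing newline); with re.MULTILINE, the end of each line

. Matches any single character except a newline

[] Matches any one of the characters listed inside the brackets

[^] Matches any one character that is not listed inside the brackets

[-] Matches any single character within the given range

? Matches the preceding item zero or one time

+ Matches the preceding item one or more times

* Matches the preceding item zero or more times

{n} Matches the preceding item exactly n times

{m,n} Matches the preceding item at least m and at most n times

{n,} Matches the preceding item at least n times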
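As a quick check of the list above, here is a minimal sketch that exercises several of these special characters with re.findall and re.search. The sample string is made up for illustration and is not from the original article.

```python
import re

# Sample text invented for this illustration.
text = "cat cot cut c4t ct caat"

# '.' matches exactly one character between 'c' and 't'.
dot = re.findall(r"c.t", text)
# [aeiou] matches one listed character; [^aeiou] one character not listed.
vowel = re.findall(r"c[aeiou]t", text)
non_vowel = re.findall(r"c[^aeiou]t", text)
# [0-9] is a range of characters.
digit = re.findall(r"c[0-9]t", text)
# '?', '+', '*' and {n} repeat the preceding item.
optional = re.findall(r"ca?t", text)
plus = re.findall(r"ca+t", text)
star = re.findall(r"ca*t", text)
exactly_two = re.findall(r"ca{2}t", text)
# '^' and '$' anchor the match to the start and end of the string.
starts = re.search(r"^cat", text) is not None
ends = re.search(r"caat$", text) is not None

print(dot, vowel, non_vowel, digit)
print(optional, plus, star, exactly_two, starts, ends)
```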

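II. Methods of the re module

1. Matching methods

a. The findall method

# findall searches a string for a pattern and returns all matching substrings as a list.
# If nothing in the text matches, it returns an empty list; if exactly one substring
# matches, it returns a list with one element. Either way, the result can be iterated
# over directly without raising an error, so there are fewer special cases to handle
# and the code stays simpler.

# re.findall() returns every substring that matches the pattern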

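import re

re_str = "hello this is python 2.7.13 and python 3.4.5"

# Raw strings (r"...") keep the backslashes in the pattern intact
pattern = r"python [0-9]\.[0-9]\.[0-9]"
res = re.findall(pattern=pattern, string=re_str)
print(res)
# ['python 2.7.1', 'python 3.4.5']

# {2,} requires at least two digits in the last position
pattern = r"python [0-9]\.[0-9]\.[0-9]{2,}"
res = re.findall(pattern=pattern, string=re_str)
print(res)
# ['python 2.7.13']

# There is no space after "python" in this pattern, so nothing matches
pattern = r"python[0-9]\.[0-9]\.[0-9]{2,}"
res = re.findall(pattern=pattern, string=re_str)
print(res)
# []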

# re.findall() returns a list: the matched substrings if there are any, otherwise an empty list

re_str = "hello this is python 2.7.13 and Python 3.4.5"
pattern = r"python [0-9]\.[0-9]\.[0-9]"
res = re.findall(pattern=pattern, string=re_str, flags=re.IGNORECASE)
print(res)
# ['python 2.7.1', 'Python 3.4.5']

# flags=re.IGNORECASE makes the match case-insensitive

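b. Using compiled regular expressions

# We usually compile the pattern before using it. When the same pattern is applied to
# a large amount of data, compiling it once can improve performance; readers can
# measure the difference themselves.

re_str = "hello this is python 2.7.13 and Python 3.4.5"
re_obj = re.compile(pattern=r"python [0-9]\.[0-9]\.[0-9]", flags=re.IGNORECASE)
res = re_obj.findall(re_str)
print(res)
# ['python 2.7.1', 'Python 3.4.5']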
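For readers who want to measure this, the following is a rough sketch using the standard timeit module. The helper names with_module_function and with_compiled_object are made up for this illustration. Keep in mind that the re module caches recently compiled patterns, so the gap may be smaller than expected, and the absolute numbers depend on the machine.

```python
import re
import timeit

re_str = "hello this is python 2.7.13 and Python 3.4.5"
pattern = r"python [0-9]\.[0-9]\.[0-9]"

# Compile once and reuse the pattern object.
re_obj = re.compile(pattern, flags=re.IGNORECASE)

def with_module_function():
    # Hypothetical helper: goes through re.findall on every call.
    return re.findall(pattern, re_str, flags=re.IGNORECASE)

def with_compiled_object():
    # Hypothetical helper: calls the method on the compiled object.
    return re_obj.findall(re_str)

# Both approaches return the same matches.
same_result = with_module_function() == with_compiled_object()

# Time each approach over the same number of calls.
t_module = timeit.timeit(with_module_function, number=10000)
t_compiled = timeit.timeit(with_compiled_object, number=10000)
print(f"re.findall: {t_module:.4f}s, compiled: {t_compiled:.4f}s")
```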

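c. The match method

# match is similar to the str.startswith method, but because it takes a regular
# expression it is more powerful and more expressive. match only matches at the
# beginning of the string. On success it returns a match object (printed as
# SRE_Match in the output below; Python 3.7 and later print it as re.Match); on
# failure it returns None. For a plain prefix test its use is almost identical to
# startswith. For example, to check whether a string starts with "what", or with digits: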

s_true = "what is a boy"
s_false = "What is a boy"
re_obj = re.compile("what")
print(re_obj.match(string=s_true))
# <_sre.SRE_Match object; span=(0, 4), match='what'>
print(re_obj.match(string=s_false))
# None

s_true = "123what is a boy"
s_false = "what is a boy"
re_obj = re.compile(r"\d+")
print(re_obj.match(s_true))
# <_sre.SRE_Match object; span=(0, 3), match='123'>
print(re_obj.match(s_true).start())
# 0
print(re_obj.match(s_true).end())
# 3
print(re_obj.match(s_true).string)
# 123what is a boy
print(re_obj.match(s_true).group())
# 123
print(re_obj.match(s_false))
# None
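To make the comparison with startswith concrete, here is a small sketch that runs the same prefix test both ways and then a prefix test that only a regular expression can express. The variable names are chosen for this illustration.

```python
import re

data = "what is a boy"

# Plain prefix test with the string method.
plain = data.startswith("what")
# The same test with re.match, which only matches at the start.
with_re = re.match(r"what", data) is not None

# A prefix that startswith cannot express: one or more digits.
digits = re.match(r"\d+", "123what is a boy")
leading_number = digits.group() if digits else None

# match returns None when the pattern only occurs later in the string.
not_at_start = re.match(r"boy", data)

print(plain, with_re, leading_number, not_at_start)
```

Because match returns None on failure, it is safer to check the result before calling group() on it, as done for leading_number above.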

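d. The search method

# search also returns a match object when the pattern matches. The difference is
# that match only matches at the start of the string, while search can match at any
# position. Both return a match object on success and None on failure. Note that
# search finds only the first match: even if the string contains several matches,
# only the first one is returned. To get all of them, the simplest way is findall;
# finditer also works.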
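The difference between match, search and findall can be seen side by side in a short sketch. It reuses the sample string from the other examples in this article; the variable names are chosen for this illustration.

```python
import re

re_str = "what is a different between python 2.7.14 and python 3.5.4"
re_obj = re.compile(r"\d+\.\d+\.\d+")

# match anchors at position 0, so it finds nothing here.
m = re_obj.match(re_str)
# search scans forward and stops at the first match.
s = re_obj.search(re_str)
first = s.group() if s else None
first_span = s.span() if s else None
# findall returns every match.
all_versions = re_obj.findall(re_str)

print(m, first, first_span, all_versions)
```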

e. The finditer method

# finditer returns an iterator; iterating over it yields one match object per match, as in the example below

re_str = "what is a different between python 2.7.14 and python 3.5.4"
re_obj = re.compile(r"\d{1,}\.\d{1,}\.\d{1,}")
for i in re_obj.finditer(re_str):
    print(i)
# <_sre.SRE_Match object; span=(35, 41), match='2.7.14'>
# <_sre.SRE_Match object; span=(53, 58), match='3.5.4'>
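Since each item is a match object, finditer is useful when both the matched text and its position are needed. A minimal sketch, with the list name found chosen for this illustration:

```python
import re

re_str = "what is a different between python 2.7.14 and python 3.5.4"
re_obj = re.compile(r"\d+\.\d+\.\d+")

# Collect the matched text together with its start and end offsets.
found = [(m.group(), m.start(), m.end()) for m in re_obj.finditer(re_str)]

for text, start, end in found:
    print(f"{text!r} at [{start}, {end})")
```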

2. Modifying methods

a. The sub method

# sub is similar to the str.replace method, except that it accepts a regular
# expression, so it applies to a wider range of cases

re_str = "what is a different between python 2.7.14 and python 3.5.4"
re_obj = re.compile(r"\d{1,}\.\d{1,}\.\d{1,}")

# count=1 replaces only the first match
print(re_obj.sub("a.b.c", re_str, count=1))
# what is a different between python a.b.c and python 3.5.4

print(re_obj.sub("a.b.c", re_str, count=2))
# what is a different between python a.b.c and python a.b.c

# Without count, every match is replaced
print(re_obj.sub("a.b.c", re_str))
# what is a different between python a.b.c and python a.b.c
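Beyond a fixed replacement string, sub also accepts backreferences to groups and a callable that computes each replacement. The sketch below shows both, plus subn, which additionally reports the number of replacements. The function bump_patch is a hypothetical helper written for this illustration.

```python
import re

re_str = "what is a different between python 2.7.14 and python 3.5.4"

# Backreferences: \1 and \2 refer to the first and second groups.
# Keep major.minor and mask the patch number.
masked = re.sub(r"(\d+)\.(\d+)\.\d+", r"\1.\2.x", re_str)

def bump_patch(match):
    """Return the matched version with its patch number increased by 1."""
    major, minor, patch = match.group().split(".")
    return f"{major}.{minor}.{int(patch) + 1}"

# A callable receives each match object and returns the replacement text.
bumped = re.sub(r"\d+\.\d+\.\d+", bump_patch, re_str)

# subn returns the new string and the number of replacements made.
text, count = re.subn(r"python", "Python", re_str)

print(masked)
print(bumped)
print(text, count)
```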

b. The split method

# re's split does the same job as the str.split method: both break a string into a
# list of substrings. The difference is that re's split accepts a regular expression.

# In the example below, the string is split on period, space, colon and exclamation mark, and the result is a list

re_str = "what is a different between python 2.7.14 and python 3.5.4 USA:NewYork!Zidan.FRA"
re_obj = re.compile("[. :!]")
print(re_obj.split(re_str))
# ['what', 'is', 'a', 'different', 'between', 'python', '2', '7', '14', 'and', 'python', '3', '5', '4', 'USA', 'NewYork', 'Zidan', 'FRA']
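A few variations on split are worth knowing. The sketch below, using short strings made up for this illustration, shows the maxsplit argument, what happens with adjacent delimiters, and how a capturing group keeps the delimiters in the result.

```python
import re

line = "USA:NewYork!Zidan.FRA"

# maxsplit limits the number of splits; the rest stays in the last item.
limited = re.split(r"[.:!]", line, maxsplit=2)

# Adjacent delimiters produce empty strings...
with_empty = re.split(r"[ ,]", "a, b,c")
# ...unless the pattern is allowed to consume a whole run of them.
no_empty = re.split(r"[ ,]+", "a, b,c")

# A capturing group keeps the delimiters in the result.
keep_delims = re.split(r"([.:!])", line)

print(limited, with_empty, no_empty, keep_delims)
```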

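c. Case-insensitive matching

# Pass flags=re.IGNORECASE when compiling, where pattern stands for your pattern string

# re.compile(pattern, flags=re.IGNORECASE)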
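As a small sketch of the options, the flag can be passed to re.compile, passed to a module-level function using the short alias re.I, or written inline as (?i) at the start of the pattern. The sample string is made up for this illustration.

```python
import re

re_str = "Python, python and PYTHON"

# Flag passed to compile.
by_flag = re.compile(r"python", flags=re.IGNORECASE).findall(re_str)
# re.I is a short alias for re.IGNORECASE.
by_alias = re.findall(r"python", re_str, flags=re.I)
# The inline form (?i) at the start of the pattern does the same.
by_inline = re.findall(r"(?i)python", re_str)
# Without the flag only the exact-case spelling matches.
case_sensitive = re.findall(r"python", re_str)

print(by_flag, by_alias, by_inline, case_sensitive)
```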

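d. Non-greedy matching

# Greedy matching always takes the longest possible string; non-greedy matching
# takes the shortest. To make a quantifier non-greedy, put a ? right after it.

# In the example below, note the two periods in the string

s = "Beautiful is better than ugly.Explicit is better than impliciy."

# Greedy: .* runs on to the last "y." in the string
re_obj = re.compile(r"Beautiful.*y\.")
print(re_obj.findall(s))
# ['Beautiful is better than ugly.Explicit is better than impliciy.']

# Non-greedy: .*? stops at the first period
re_obj = re.compile(r"Beautiful.*?\.")
print(re_obj.findall(s))
# ['Beautiful is better than ugly.']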
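The same contrast shows up clearly when the pattern itself is kept fixed and only the ? is added. A minimal sketch on an HTML-like string made up for this illustration:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .* runs to the last '>' in the string.
greedy = re.findall(r"<.*>", html)
# Non-greedy: .*? stops at the first '>' that lets the match succeed.
lazy = re.findall(r"<.*?>", html)
# The same suffix works on other quantifiers, for example +?.
shortest_digits = re.findall(r"\d+?", "2024")

print(greedy)
print(lazy)
print(shortest_digits)
```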

e. What effect do parentheses have in a pattern?

To match a literal parenthesis, escape it with a backslash. Look closely at the example below, and note that calling group(1) on the object returned by search raises an error.

import re

s = "=aa1239d&&& 0a ()--"
obj = re.compile(r"\(\)")

# search
rep = obj.search(s)
print(rep)
# <_sre.SRE_Match object; span=(15, 17), match='()'>
# print(rep.group(1))
# IndexError: no such group
print(rep.group())
# ()

# findall
rep = obj.findall(s)
print(rep)
# ['()']

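To return the text matched inside the parentheses, leave the parentheses unescaped. findall then returns the strings matched by the parenthesized group. group() returns the string matched by the whole pattern, group(1) the string matched by the first parenthesized group, group(2) the string matched by the second, and so on.

s = "=aa1239d&&& 0a ()--"
rep = re.compile(r"\w+(&+)")
print(rep.findall(s))
# ['&&&']
print(rep.search(s).group())
# aa1239d&&&
print(rep.search(s).group(1))
# &&&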
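With more than one group the same rules extend naturally. The sketch below, on a sample string shortened from the earlier examples, shows group numbering, findall returning tuples, a non-capturing group (?:...), and a named group (?P<name>...). The group names major and minor are chosen for this illustration.

```python
import re

re_str = "python 2.7.14 and python 3.5.4"
re_obj = re.compile(r"(\d+)\.(\d+)\.(\d+)")

m = re_obj.search(re_str)
whole = m.group()          # the whole match, same as m.group(0)
major = m.group(1)         # text matched by the first group
all_groups = m.groups()    # all groups as a tuple

# With several groups, findall returns one tuple per match.
tuples = re_obj.findall(re_str)

# (?:...) groups without capturing, so findall returns whole matches.
non_capturing = re.findall(r"(?:\d+\.){2}\d+", re_str)

# (?P<name>...) gives a group a name.
named = re.search(r"(?P<major>\d+)\.(?P<minor>\d+)", re_str)
named_dict = named.groupdict()

print(whole, major, all_groups)
print(tuples, non_capturing, named_dict)
```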

That is all for this article. I hope it helps with your studies, and thank you for your support.
