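Python's r string prefix does not change how \d behaves
I have just started learning Python and ran into a question about putting r in front of a string.
In theory, prefixing a string with r turns off escape-sequence processing inside that string.
Example: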
s = r'\tt'
print(s)
Output:
\tt
s = '\tt'
print(s)
Output:
        t    (a tab character followed by t)
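Because a tab is hard to see in printed output, a small sketch using repr() and len() may make the difference easier to check. The variable names raw and cooked are only illustrative.

```python
raw = r'\tt'     # three characters: backslash, 't', 't'
cooked = '\tt'   # two characters: a tab, then 't'

print(repr(raw), len(raw))        # '\\tt' 3
print(repr(cooked), len(cooked))  # '\tt' 2
```

repr() shows the backslash in the raw string doubled, which is just how Python displays a single literal backslash.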
But I found that the r prefix makes no difference for \d.
For example:
import re

def re_method():
    s = 'kjiabc5ty'
    print(re.search(r'abc\d', s).group())

if __name__ == '__main__':
    re_method()
It still matches abc5 and prints it.
I was completely baffled.
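One way to look into this is to list the characters that re.search actually receives. The sketch below reuses the pattern and input from the example above; the names pattern and match are only illustrative.

```python
import re

pattern = r'abc\d'
# The raw string keeps the backslash, so the regex engine receives
# the two characters '\' and 'd' and reads them as "one digit".
print(list(pattern))   # ['a', 'b', 'c', '\\', 'd']

match = re.search(pattern, 'kjiabc5ty')
print(match.group())   # abc5
```

So the r prefix did its job: the backslash survived, and it is the regex engine, not the Python string literal, that gives \d its meaning.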
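Later I found an answer through Google. The gist: \d is not a valid escape sequence in a Python string literal, so Python leaves it unchanged, which is why '\d' == r'\d' is True. \\, on the other hand, is a valid escape sequence and is turned into a single \, so you end up with '\d' == '\\d' == r'\d'. This is why string literals can sometimes be confusing.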
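The equality can be checked directly. This is only a sketch: it builds the non-raw '\d' literal through eval() so that the warning Python emits for unrecognized escapes can be captured rather than printed (a DeprecationWarning since Python 3.6 and a SyntaxWarning since 3.12; the documentation says this is planned to become an error eventually, at which point the eval() call would fail). The names plain and caught are illustrative.

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    plain = eval(r"'\d'")   # equivalent to writing '\d' in source code

print(plain == '\\d' == r'\d')   # True
print(len(plain))                # 2
# Which warning category shows up depends on the Python version.
print([w.category.__name__ for w in caught])
```

Given that warning, writing r'\d' (or '\\d') in regex patterns is the safer habit, even though '\d' currently produces the same string.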
Here is the original answer, quoted:
There is a distinction you have to make between the python interpreter and the re module.
In python, a backslash followed by a character can mean a special character if the string is not rawed. For instance, \n will mean a newline character, \r will mean a carriage return, \t will mean the tab character, \b represents a nondestructive backspace. By itself, \d in a python string does not mean anything special.
In regex however, there are a bunch of characters that would otherwise not always mean anything in python. But that's the catch, 'not always'. One of the things that can be misinterpreted is \b which in python is a backspace, in regex means a word boundary. What this implies is that if you pass on an unrawed \b to the regular expression part of a regex, this \b gets substituted by the backspace before it is passed to the regex function and it won't mean a thing there. So you have to absolutely pass the b with its backslash and to do that, you either escape the backslash, or raw the string.
Back to your question regarding \d, \d has no special meaning whatsoever in python, so it remains untouched. The same \d passed as a regular expression gets converted by the regex engine, which is a separate entity to the python interpreter.
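In my own words:
The Python interpreter and the re module have to be kept apart.
In a non-raw Python string literal, a backslash followed by certain characters stands for a special character: \n is a newline, \r a carriage return, \t a tab, and \b a backspace. \d is not one of them, so it means nothing special in a Python string.
The regex engine has its own set of backslash sequences, and some of them overlap with Python's. \b is the usual trap: Python reads it as a backspace, while regex reads it as a word boundary. If you write \b in a non-raw pattern, Python replaces it with a backspace before the re function ever sees it, and the word-boundary meaning is lost. To get the backslash and the b through intact, either escape the backslash (\\b) or use a raw string.
As for \d: Python leaves it untouched, and the regex engine, which is separate from the Python interpreter, is what interprets it as a digit.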
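The \b case from the quoted answer is the one where the r prefix really does change the result. A minimal sketch, with a made-up sample sentence and illustrative variable names:

```python
import re

text = 'a cat sat'
cooked = '\bcat\b'   # Python turns each '\b' into a backspace (0x08)
raw = r'\bcat\b'     # backslash + 'b' reaches the regex engine intact

print(re.search(cooked, text))               # None
print(re.search(raw, text).group())          # cat
print(re.search('\\bcat\\b', text).group())  # cat
```

The non-raw pattern looks for literal backspace characters around cat, which the text does not contain, so it finds nothing. The raw string and the doubled-backslash form both hand the regex engine a real word boundary.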
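In short, I still do not understand this completely, but at least I now know the behavior exists.
Writing it down here for future reference.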
