Java Regular Expression Tutorial with Examples -- 688IT编程网

This article is a translation.


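When I started my Java career, regular expressions were a nightmare for me. This tutorial aims to help you master Java regular expressions, and also to help me brush up on them.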

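What is a regular expression?

A regular expression defines a pattern for strings. Regular expressions can be used to search, edit, or manipulate text. They are not tied to any one language, but they differ slightly from language to language. Java regular expressions are most similar to Perl's.

The Java regular expression classes live in the java.util.regex package, which includes three classes: Pattern, Matcher, and PatternSyntaxException.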

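1. A Pattern object is the compiled version of a regular expression. It has no public constructor; we create a Pattern object by passing a regular expression to the public static method compile.

2. Matcher is the regex engine object that matches an input string against the Pattern object. This class has no public constructor either; we obtain a Matcher by calling the matcher method of a Pattern object with the input string as the argument. We then call the matches method, which returns a boolean indicating whether the input string matches the regular expression.

3. PatternSyntaxException is thrown if the regular expression syntax is incorrect.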

Let's see how these classes are used in a simple example.

package com.journaldev.util;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternExample {

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(".xx.");  // any character, "xx", any character
        Matcher matcher = pattern.matcher("MxxY");
        System.out.println("Input String matches regex - " + matcher.matches());
        // bad regular expression: '*' has nothing before it to repeat
        pattern = Pattern.compile("*xx*");
    }
}

The output of the program above is:

Input String matches regex - true
Exception in thread "main" java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
*xx*
^
    at java.util.regex.Pattern.error(Pattern.java:1924)
    at java.util.regex.Pattern.sequence(Pattern.java:2090)
    at java.util.regex.Pattern.expr(Pattern.java:1964)
    at java.util.regex.Pattern.compile(Pattern.java:1665)
    at java.util.regex.Pattern.<init>(Pattern.java:1337)
    at java.util.regex.Pattern.compile(Pattern.java:1022)
    at com.journaldev.util.PatternExample.main(PatternExample.java:13)
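The exception shown above can also be caught and inspected instead of being left to terminate the program. The following is a minimal sketch, assuming a hypothetical helper named describeError that is not part of the original example. It relies only on the getDescription(), getIndex(), and getPattern() accessors of PatternSyntaxException.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxErrorDemo {

    // Hypothetical helper: returns null if the regex compiles,
    // otherwise a short summary of the syntax error.
    static String describeError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription() + " at index " + e.getIndex()
                    + " in \"" + e.getPattern() + "\"";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeError(".xx.")); // valid regex, prints null
        System.out.println(describeError("*xx*")); // invalid regex, prints the summary
    }
}
```

Because PatternSyntaxException is an unchecked exception, the compiler does not force you to catch it; catching it is mainly useful when the regex comes from user input or configuration.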

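Since regular expressions always deal with strings, Java 1.4 extended the String class with a matches method for matching against a pattern. Internally it uses the Pattern and Matcher classes to do the work, but it clearly reduces the number of lines of code.

The Pattern class also has a matches method, which takes a regex and an input string as arguments and returns a boolean result.

The following code matches an input string against a regular expression.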

String str = "bbb";
System.out.println("Using String matches method: " + str.matches(".bb"));
System.out.println("Using Pattern matches method: " + Pattern.matches(".bb", str));

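So if all you need is to check whether an input string matches a pattern, you can save time by calling the String matches method. You need the Pattern and Matcher classes only when you want to manipulate the input string or reuse the pattern.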
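Reusing a pattern means compiling it once and applying it to many inputs, which avoids recompiling the regex on every call. The following is a minimal sketch of that idea; the class name, the countMatching helper, and the sample inputs are assumptions made for illustration and do not come from the original article.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class ReusePatternDemo {

    // Compiled once and reused; Pattern instances are immutable and safe to share.
    private static final Pattern ANY_CHAR_THEN_BB = Pattern.compile(".bb");

    // Hypothetical helper: counts the inputs that match the whole pattern.
    static int countMatching(List<String> inputs) {
        int count = 0;
        for (String input : inputs) {
            if (ANY_CHAR_THEN_BB.matcher(input).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "bbb" and "abb" match; "bb" is too short and "abbc" is too long.
        System.out.println(countMatching(Arrays.asList("bbb", "abb", "bb", "abbc"))); // prints 2
    }
}
```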

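Note that a pattern defined by a regex is applied to the input from left to right, and once a source character has been used in a match it is not reused.

For example, the regex "121" matches the string "31212142121" only twice, like this: "_121____121".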
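The left-to-right, non-overlapping behaviour can be checked with repeated calls to Matcher.find(). This is a small sketch; the matchStarts helper is a hypothetical name introduced here for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonOverlappingDemo {

    // Hypothetical helper: start index of every match found by repeated find().
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // The candidate "121" starting at index 3 overlaps the first match
        // (indices 1 to 3) and is therefore skipped.
        System.out.println(matchStarts("121", "31212142121")); // prints [1, 8]
    }
}
```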

Common matching symbols in regular expressions

Regular expression  Description                                                       Examples (regex, input)
.                   Matches any single character                                      ("..", "a%") -> true; ("..", ".a") -> true; ("..", "a") -> false
^xxx                Matches the regex xxx at the start of the input                   ("^a.c.", "abcd") -> true; ("^a", "a") -> true; ("^a", "ac") -> false
xxx$                Matches the regex xxx at the end of the input                     ("..cd$", "abcd") -> true; ("a$", "a") -> true; ("a$", "aca") -> false
[abc]               Matches a, b, or c; [] denotes a character class                  ("^[abc]d.", "ad9") -> true; ("[ab].d$", "bad") -> true; ("[ab]x", "cx") -> false
[abc][12]           Matches a, b, or c followed by 1 or 2                             ("[ab][12].", "a2#") -> true; ("[ab]..[12]", "acd2") -> true; ("[ab][12]", "c2") -> false
[^abc]              Matches any character except a, b, or c (a leading ^ negates [])  ("[^ab][^12].", "c3#") -> true; ("[^ab]..[^12]", "xcd3") -> true; ("[^ab][^12]", "c2") -> false
[a-e1-8]            Matches any character in the range a to e or 1 to 8               ("[a-e1-3].", "d#") -> true; ("[a-e1-3]", "2") -> true; ("[a-e1-3]", "f2") -> false
xx|yy               Matches the regex xx or the regex yy                              ("x.|y", "xa") -> true; ("x.|y", "y") -> true; ("x.|y", "yz") -> false

Note: by default . does not match line terminators; it does so only when the Pattern.DOTALL flag is set.
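The results in the table are consistent with whole-input matching, as performed by Pattern.matches; that reading is an assumption, since the table itself does not name the method. The sketch below checks a few rows and contrasts them with Matcher.find(), which searches for the regex anywhere in the input. The helper names wholeInput and anywhere are hypothetical.

```java
import java.util.regex.Pattern;

class MatchingSymbolsDemo {

    // Hypothetical helper: true only if the entire input matches the regex.
    static boolean wholeInput(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Hypothetical helper: true if the regex matches somewhere in the input.
    static boolean anywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeInput("..", "a%"));         // true
        System.out.println(wholeInput("^a", "ac"));         // false: "c" is left over
        System.out.println(wholeInput("[ab].d$", "bad"));   // true
        System.out.println(wholeInput("[^ab][^12]", "c2")); // false: "2" is excluded
        System.out.println(wholeInput("[a-e1-3].", "d#"));  // true
        System.out.println(wholeInput("x.|y", "yz"));       // false: "z" is left over

        // With find(), the same regex succeeds because "a" is found at the start.
        System.out.println(anywhere("^a", "ac"));           // true
    }
}
```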

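Java regular expression metacharacters

Regular expression  Description
\d                  Any digit, equivalent to [0-9]
\D                  Any non-digit, equivalent to [^0-9]
\s                  Any whitespace character, equivalent to [ \t\n\x0B\f\r]
\S                  Any non-whitespace character, equivalent to [^\s]
\w                  Any word character, equivalent to [a-zA-Z_0-9]
\W                  Any non-word character, equivalent to [^\w]
\b                  A word boundary
\B                  A non-word boundary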
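One way to see what each of these metacharacters matches is to replace every match with an underscore. This is a minimal sketch; the mask helper and the sample input "a1 b_2" are assumptions chosen for illustration. Note that each backslash has to be doubled inside a Java string literal.

```java
import java.util.regex.Pattern;

class PredefinedClassesDemo {

    // Hypothetical helper: replaces every match of regex in input with "_".
    static String mask(String regex, String input) {
        return Pattern.compile(regex).matcher(input).replaceAll("_");
    }

    public static void main(String[] args) {
        String input = "a1 b_2";
        System.out.println(mask("\\d", input)); // a_ b__      (digits replaced)
        System.out.println(mask("\\s", input)); // a1_b_2      (the space replaced)
        System.out.println(mask("\\w", input)); // __ ___      (letters, digits, underscore)
        System.out.println(mask("\\W", input)); // a1_b_2      (only the space is a non-word character)
        System.out.println(mask("\\b", input)); // _a1_ _b_2_  (zero-width word boundaries)
    }
}
```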

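There are two ways to use a metacharacter as an ordinary character in a regular expression.

1. Precede the metacharacter with a backslash (\).

2. Place the metacharacter between \Q (start of quote) and \E (end of quote).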
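Both escaping styles can be sketched as follows. The example also uses Pattern.quote, a standard library method that returns a literal pattern string for arbitrary text; the matchesLiteral helper and the sample strings are assumptions for illustration.

```java
import java.util.regex.Pattern;

class EscapeDemo {

    // Hypothetical helper: treats every character of literal as ordinary text.
    static boolean matchesLiteral(String literal, String input) {
        return Pattern.matches(Pattern.quote(literal), input);
    }

    public static void main(String[] args) {
        // Unescaped, the dot matches any character.
        System.out.println(Pattern.matches("a.b", "axb"));       // true
        // A backslash makes the dot literal (doubled inside a Java string).
        System.out.println(Pattern.matches("a\\.b", "axb"));     // false
        System.out.println(Pattern.matches("a\\.b", "a.b"));     // true
        // Everything between \Q and \E is taken literally.
        System.out.println(Pattern.matches("\\Q1*2\\E", "1*2")); // true
        System.out.println(Pattern.matches("\\Q1*2\\E", "112")); // false
        // Pattern.quote has the same effect for a whole string.
        System.out.println(matchesLiteral("1*2", "1*2"));        // true
    }
}
```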

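Regular expression quantifiers

Quantifiers specify how many times a match may occur.

Regular expression  Description
X?                  X occurs once or not at all
X*                  X occurs zero or more times
X+                  X occurs one or more times
X{n}                X occurs exactly n times
X{n,}               X occurs n or more times
X{n,m}              X occurs at least n but not more than m times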

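Quantifiers can be used with character classes and capturing groups.

For example, [abc]+ means a, b, or c occurring one or more times.

(abc)+ means the capturing group "abc" occurring one or more times. We will discuss capturing groups shortly.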
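The quantifiers above can be tried out with whole-input matching. This is a small sketch with inputs chosen for illustration; the fullMatch helper is a hypothetical name.

```java
import java.util.regex.Pattern;

class QuantifierDemo {

    // Hypothetical helper: true only if the entire input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("ab?", "a"));         // true: b is optional
        System.out.println(fullMatch("ab*", "abbb"));      // true: b zero or more times
        System.out.println(fullMatch("ab+", "a"));         // false: at least one b required
        System.out.println(fullMatch("a{2}", "aa"));       // true: exactly two
        System.out.println(fullMatch("a{2,}", "a"));       // false: fewer than two
        System.out.println(fullMatch("a{2,3}", "aaaa"));   // false: more than three
        System.out.println(fullMatch("[abc]+", "cab"));    // true: class repeated
        System.out.println(fullMatch("(abc)+", "abcabc")); // true: group repeated
        System.out.println(fullMatch("(abc)+", "abcab"));  // false: last group incomplete
    }
}
```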

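Regular expression capturing groups

Capturing groups are used to treat multiple characters as a single unit. You create a group by using (). The portion of the input string that matches a capturing group is saved in memory and can be recalled using a backreference.

You can use the matcher.groupCount method to find the number of capturing groups in a regex pattern. For example, ((a)(bc)) contains 3 capturing groups: ((a)(bc)), (a), and (bc).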
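The group count for ((a)(bc)) can be confirmed with a short sketch. The countGroups helper and the input "abc" are assumptions for illustration; groupCount() and group(int) are standard Matcher methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCountDemo {

    // Hypothetical helper: number of capturing groups in the regex.
    static int countGroups(String regex) {
        // groupCount() depends only on the pattern, so any input will do.
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(countGroups("((a)(bc))")); // prints 3

        Matcher matcher = Pattern.compile("((a)(bc))").matcher("abc");
        if (matcher.matches()) {
            // Groups are numbered by the position of their opening parenthesis.
            // Group 0 is the whole match and is not included in groupCount().
            for (int i = 0; i <= matcher.groupCount(); i++) {
                System.out.println(i + ": " + matcher.group(i));
            }
        }
    }
}
```

The loop prints abc for groups 0 and 1, a for group 2, and bc for group 3.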

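You can use a backreference in a regular expression by writing a backslash (\) followed by the number of the group to recall.

Capturing groups and backreferences can be confusing, so let's work through an example.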

System.out.println(Pattern.matches("(\\w\\d)\\1", "a2a2")); // true
System.out.println(Pattern.matches("(\\w\\d)\\1", "a2b2")); // false
System.out.println(Pattern.matches("(AB)(B\\d)\\2\\1", "ABB2B2AB")); // true
System.out.println(Pattern.matches("(AB)(B\\d)\\2\\1", "ABB2B3AB")); // false

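In the first example, the first capturing group at runtime is (\w\d); when matched against the input string "a2a2" it captures "a2" and saves it in memory. So \1 refers to "a2", and the call returns true. For the same reason, the second statement prints false.

Try to work out the third and fourth statements yourself. :)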
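If you want to check your reasoning for the third and fourth statements, one option is to print what each group captured. This is a minimal sketch; the capturedGroups helper is a hypothetical name introduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BackreferenceDemo {

    // Hypothetical helper: the text captured by each group if the whole
    // input matches, or an empty list if it does not match.
    static List<String> capturedGroups(String regex, String input) {
        List<String> groups = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        if (matcher.matches()) {
            for (int i = 1; i <= matcher.groupCount(); i++) {
                groups.add(matcher.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        String regex = "(AB)(B\\d)\\2\\1";
        // Group 1 captures "AB" and group 2 captures "B2";
        // \2 then repeats "B2" and \1 repeats "AB".
        System.out.println(capturedGroups(regex, "ABB2B2AB")); // prints [AB, B2]
        // Group 2 captures "B2" again, but the input continues with "B3".
        System.out.println(capturedGroups(regex, "ABB2B3AB")); // prints []
    }
}
```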

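Now let's look at some important methods of the Pattern and Matcher classes.

We can create a Pattern object with flags. For example, Pattern.CASE_INSENSITIVE enables case-insensitive matching. The Pattern class also provides a split(String) method similar to the one in the String class.

The toString() method of the Pattern class returns the regular expression string from which the pattern was compiled.

The Matcher class has the index methods start() and end(), which show exactly where in the input string a match was found.

The Matcher class also provides the string manipulation methods replaceAll(String replacement) and replaceFirst(String replacement).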
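A few of these Pattern methods can be sketched in isolation. The example below is an illustration under assumptions: the splitOnNonWord helper and the sample strings are hypothetical, and the combination of flags with the bitwise OR operator goes slightly beyond what the text above describes.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PatternMethodsDemo {

    // Hypothetical helper: splits the input around non-word characters.
    static String[] splitOnNonWord(String input) {
        return Pattern.compile("\\W").split(input);
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("ab", Pattern.CASE_INSENSITIVE);
        // toString() gives back the regex the pattern was compiled from.
        System.out.println(pattern.toString()); // ab
        // flags() reports the flags the pattern was compiled with.
        System.out.println((pattern.flags() & Pattern.CASE_INSENSITIVE) != 0); // true

        // Several flags can be combined with the bitwise OR operator.
        Pattern multi = Pattern.compile("^ab$", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        System.out.println(multi.matcher("xy\nAB").find()); // true: matches the second line

        // Pattern.split and String.split give the same result for the same regex.
        String input = "one@two#three";
        System.out.println(Arrays.equals(splitOnNonWord(input), input.split("\\W"))); // true
    }
}
```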

Now let's see how these methods are used in a simple Java class.

package com.journaldev.util;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexExamples {

    public static void main(String[] args) {
        // using pattern with flags
        Pattern pattern = Pattern.compile("ab", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher("ABcabdAb");
        // using Matcher find(), group(), start() and end() methods
        while (matcher.find()) {
            System.out.println("Found the text \"" + matcher.group()
                    + "\" starting at " + matcher.start()
                    + " index and ending at index " + matcher.end());
        }

        // using Pattern split() method
        pattern = Pattern.compile("\\W");
        String[] words = pattern.split("one@two#three:four$five");
        for (String s : words) {
            System.out.println("Split using Pattern.split(): " + s);
        }

        // using Matcher replaceFirst() and replaceAll() methods
        pattern = Pattern.compile("1*2");
        matcher = pattern.matcher("11234512678");
        System.out.println("Using replaceAll: " + matcher.replaceAll("_"));
        System.out.println("Using replaceFirst: " + matcher.replaceFirst("_"));
    }
}

The output of the program above:

Found the text "AB" starting at 0 index and ending at index 2
Found the text "ab" starting at 3 index and ending at index 5
Found the text "Ab" starting at 6 index and ending at index 8
Split using Pattern.split(): one
Split using Pattern.split(): two
Split using Pattern.split(): three
Split using Pattern.split(): four
Split using Pattern.split(): five
Using replaceAll: _345_678
Using replaceFirst: _34512678
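To see why replaceAll produced _345_678, it helps to list what the regex 1*2 actually matched: 1* greedily takes as many 1s as it can before the required 2. The sketch below is an illustration; the allMatches helper is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyStarDemo {

    // Hypothetical helper: the text of every match found by repeated find().
    static List<String> allMatches(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // "112" (indices 0 to 2) and "12" (indices 6 to 7) are each replaced by "_".
        System.out.println(allMatches("1*2", "11234512678")); // prints [112, 12]
    }
}
```

replaceFirst replaces only the first of these matches, "112", which gives _34512678.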

