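【Posted】:2015-11-17 15:46:59
【Question】:
Last night I got some help coming up with a regex that captures the smallest possible group. I need to take a string of song lyrics and find a search phrase in it. The problem is that I can't get the regex to match across multiple lines.
I have a text file of lyrics that I read in; this is just part of the song. (The brackets are not in the text file; I'm only using them to show the group I want to capture.)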
The first [time we fall in love.
Love can be exciting, it can be a bloody bore.
Love can be a pleasure or nothing but a chore.
Love can be like a dull routine,
It can run you around until you're out of steam.
It can treat you well, it can treat you mean,
Love can mess you around,
Love can pick you up, it can bring you down].
But they'll never know The feelings we show
The phrase I'm running the regex with is:
time can bring you down
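I use a StringBuilder to build the lyrics string, so the lyrics contain \n characters. I tried a replaceAll to strip them out, but it still didn't work. If I edit the text file and put "time can bring you down" on a single line, it works, but if I split it across two lines it doesn't.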
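For reference, here is a minimal sketch of how Java treats those \n characters. In java.util.regex, MULTILINE (the m in (?im)) only changes where ^ and $ match; it is DOTALL (or (?s)) that lets . match a line terminator. The names NewlineDemo, spans and flatten are made up for illustration, and flatten is only one assumed way to turn line breaks into spaces instead of deleting them.

```java
import java.util.regex.Pattern;

public class NewlineDemo {
    // Collapse every run of whitespace (including \n and \r\n) into one space.
    static String flatten(String lyrics) {
        return lyrics.replaceAll("\\s+", " ").trim();
    }

    // True if "time" is followed later by "down", using the given flags.
    static boolean spans(String lyrics, int flags) {
        return Pattern.compile("\\btime\\b.*\\bdown\\b", flags).matcher(lyrics).find();
    }

    public static void main(String[] args) {
        String lyrics = "time can\nbring you down";
        int base = Pattern.CASE_INSENSITIVE;
        System.out.println(spans(lyrics, base | Pattern.MULTILINE)); // false: '.' stops at \n
        System.out.println(spans(lyrics, base | Pattern.DOTALL));    // true: '.' crosses \n
        System.out.println(flatten(lyrics));                         // time can bring you down
    }
}
```

Deleting \n outright, rather than replacing it with a space, would glue the last word of one line to the first word of the next ("can\nbring" becomes "canbring"), and \b would no longer find a boundary there.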
I tried using \n in my regex, but it ended up capturing most of the song, because "time" is the second word. This is the regex I'm currently trying:
(?is)(\bTime\b)(?:(?!\n\b(?:time|can|bring|you|down)\b\n).)*(\bcan\b)(?:(?!\b(?:time|can|bring|you|down)\b).)*(\bbring\b)(?:(?!\b(?:time|can|bring|you|down)\b).)*(\byou\b)(?:(?!\b(?:time|can|bring|you|down)\b).)*(\bdown\b)
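A small sketch of what the (?:(?!\b(?:...)\b).)* gaps in that pattern do, cut down to three words to keep it readable. The class name TemperedGapDemo and the sample strings are made up for illustration. Each gap refuses to step over any of the phrase words, so a repeated "can" between two matched words makes the whole match fail, whether or not a newline is involved.

```java
import java.util.regex.Pattern;

public class TemperedGapDemo {
    // A gap that may not step over any of the phrase words.
    static final String GAP = "(?:(?!\\b(?:time|can|down)\\b).)*";

    static boolean matches(String text) {
        String regex = "(?is)\\btime\\b" + GAP + "\\bcan\\b" + GAP + "\\bdown\\b";
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // No phrase word is repeated inside a gap, so the newline is harmless.
        System.out.println(matches("time can\nbring you down"));                // true
        // The second "can" sits in the gap between "can" and "down".
        System.out.println(matches("time can be fun,\nit can bring you down")); // false
    }
}
```

The bracketed lyrics above contain several "can" and "you" between "time" and "down", which is the same situation as the second call.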
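I'm trying to capture what is inside the brackets in the lyrics. Here is the method I'm using; it takes the lyrics and the search phrase and returns the length of the string it found.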
static int rankPhrase(String lyrics, String lyricsPhrase){
    //This takes in song lyrics and the phrase we are searching for
    //Split the phrase up into separate words
    String[] phrase = lyricsPhrase.split("[^a-zA-Z]+");
    //Helper string for regex so we can get smallest grouping
    String regexHelper = lyricsPhrase.replaceAll(" ","|").toLowerCase();
    //Start to build the regex
    StringBuilder regex = new StringBuilder("(?im)(\\b" + phrase[0] + "\\b)");
    //loop through each word in the phrase
    for(int i = 1; i < phrase.length; i++){
        //add this to the regex we will search for
        regex.append("(?:(?!\\b(?:" + regexHelper + ")\\b).)*(\\b" + phrase[i] + "\\b)");
    }
    //Create the pattern
    Pattern p = Pattern.compile(regex.toString(), Pattern.DOTALL);
    Matcher m = p.matcher(lyrics);
    //string for regex match found
    String regexMatch = "";
    while(m.find()){
        regexMatch = m.group();
        System.out.println(regexMatch);
    }
    return regexMatch.length();
}
I'll keep experimenting and trying to figure it out. I think the problem is in how the regex is working, but I'm not 100% sure. Thanks!
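One possible direction, offered only as a sketch and not as the accepted fix: swap the guarded gaps for lazy .*? gaps and try every occurrence of the first word, keeping the shortest match. This assumes that the goal is the shortest stretch containing the phrase words in order, and that repeats of those words may appear in between. The class and method names (ShortestSpan, shortestSpan) are hypothetical, and the lyrics in main are an abridged copy of the excerpt above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ShortestSpan {
    // Returns the shortest stretch of lyrics containing the phrase words in
    // order, or "" if there is none.
    static String shortestSpan(String lyrics, String lyricsPhrase) {
        List<String> words = new ArrayList<>();
        for (String w : lyricsPhrase.split("[^a-zA-Z]+")) {
            if (!w.isEmpty()) {
                words.add("\\b" + Pattern.quote(w) + "\\b");
            }
        }
        if (words.isEmpty()) {
            return "";
        }
        // Lazy gaps: from a fixed start, each word matches at its earliest position.
        String regex = String.join(".*?", words);
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.DOTALL)
                .matcher(lyrics);

        String best = "";
        int from = 0;
        // Restart just after each match start so later, tighter spans are tried too.
        while (from <= lyrics.length() && m.find(from)) {
            if (best.isEmpty() || m.group().length() < best.length()) {
                best = m.group();
            }
            from = m.start() + 1;
        }
        return best;
    }

    public static void main(String[] args) {
        String lyrics = "The first time we fall in love.\n"
                + "Love can be exciting, it can be a bloody bore.\n"
                + "Love can pick you up, it can bring you down.\n"
                + "But they'll never know The feelings we show";
        String span = shortestSpan(lyrics, "time can bring you down");
        System.out.println(span);
        System.out.println(span.length());
    }
}
```

Chained lazy gaps can backtrack heavily on long inputs that almost match, so this is meant for lyric-sized text.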
【Comments】:
- Still can't get this regex to work; any help?