【Question title】: Javascript RegEx Remove Multiple words from string
【Posted】: 2018-04-04 15:37:31
【Question】:

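Using Javascript. (Note: there is a similar post, but its OP asked for Java; this one is for Javascript.)

I am trying to remove a list of words from an entire string without looping (preferably using a regular expression).

This is what I have so far. It removes some of the words, but not all of them. Can someone help identify what I am doing wrong with my RegEx function?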

   //Remove all instances of the words in the array
  var removeUselessWords = function(txt) {

	var uselessWordsArray = 
        [
          "a", "at", "be", "can", "cant", "could", "couldnt", 
          "do", "does", "how", "i", "in", "is", "many", "much", "of", 
          "on", "or", "should", "shouldnt", "so", "such", "the", 
          "them", "they", "to", "us",  "we", "what", "who", "why", 
          "with", "wont", "would", "wouldnt", "you"
        ];
			
	var expStr = uselessWordsArray.join(" | ");
	return txt.replace(new RegExp(expStr, 'gi'), ' ');
  }

  var str = "The person is going on a walk in the park. The person told us to do what we need to do in the park";
  
  console.log(removeUselessWords(str));

//The result should be: "person going walk park. person told need park."

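【Comments】:

  • First, get rid of the whitespace around the |.
  • If I do that, the function removes matching characters inside words instead of whole words (e.g. "walk" becomes "wlk").
  • @Jared Smith, I take back my statement above, since RomanPerekhrest made use of your comment.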
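
The two failure modes discussed in these comments can be reproduced with a small sketch. The word list and sample text below are hypothetical and shortened for illustration; they are not the asker's data.

```javascript
// Hypothetical short word list and sample text, for illustration only.
var demoWords = ["a", "in", "on", "the"];
var demoText = "walk in the park";

// Attempt 1: join(" | ") produces the pattern "a | in | on | the".
// Each alternative now includes literal spaces, and a match consumes them,
// so the word that follows has no leading space left to match against.
var spacedPattern = new RegExp(demoWords.join(" | "), "gi");
var spacedResult = demoText.replace(spacedPattern, " ");
console.log(spacedResult); // "walk the park" ("the" survives)

// Attempt 2: join("|") with no word boundaries matches the letters
// anywhere, including inside longer words.
var barePattern = new RegExp(demoWords.join("|"), "gi");
var bareResult = demoText.replace(barePattern, "");
console.log(bareResult); // "wlk   prk" (the "a" in "walk" and "park" is gone)
```

Neither variant treats the list entries as whole words, which is the gap the word-boundary assertion `\b` is meant to close.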

Tags: javascript regex string


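【Solution 1】:

Three points:

  • join the array items with | and no surrounding spaces
  • enclose the alternation in a parenthesized group (...|...)
  • add word boundaries \b so that only whole words match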

var removeUselessWords = function(txt) {
    var uselessWordsArray = 
        [
          "a", "at", "be", "can", "cant", "could", "couldnt", 
          "do", "does", "how", "i", "in", "is", "many", "much", "of", 
          "on", "or", "should", "shouldnt", "so", "such", "the", 
          "them", "they", "to", "us",  "we", "what", "who", "why", 
          "with", "wont", "would", "wouldnt", "you"
        ];
			
	  var expStr = uselessWordsArray.join("|");
	  return txt.replace(new RegExp('\\b(' + expStr + ')\\b', 'gi'), ' ')
                    .replace(/\s{2,}/g, ' ');
  }

var str = "The person is going on a walk in the park. The person told us to do what we need to do in the park";
  
console.log(removeUselessWords(str));

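【Comments】:

  • Wow. That works. Thank you. What does \\b mean?
  • That's better!
  • You will also want to add a .replace(/\s+\.(\s|$)/g, '.$1') at the end of all that to clean up any spaces left in front of periods.
【Solution 2】: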
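
As a partial answer to the `\\b` question above: inside a JavaScript string literal, `"\\b"` is a backslash followed by `b`, which the `RegExp` constructor reads as the word-boundary assertion `\b`. A single `"\b"` in a string would be a backspace character instead. The sketch below also shows the suggested period cleanup in a small helper; the name `tidy`, its sample input, and the final `trim()` are assumptions added here for illustration and are not part of the answer.

```javascript
// "\\b" in a string literal becomes \b once the RegExp is built,
// so both forms below describe the same pattern.
var fromString = new RegExp("\\bthe\\b", "gi");
var fromLiteral = /\bthe\b/gi;
var samePattern = fromString.source === fromLiteral.source;
console.log(samePattern); // true

// \b matches between a word character and a non-word character,
// so "the" is removed only where it stands alone.
var boundaryResult = "the other theme".replace(fromString, "");
console.log(boundaryResult); // " other theme"

// Hypothetical helper: collapse runs of whitespace, drop whitespace that
// ends up in front of a period, then trim the ends (trim is an addition).
function tidy(text) {
  return text
    .replace(/\s{2,}/g, " ")
    .replace(/\s+\.(\s|$)/g, ".$1")
    .trim();
}

var tidyResult = tidy(" person going walk  . person told need park ");
console.log(tidyResult); // "person going walk. person told need park"
```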

Maybe this is what you want:

   //Remove all instances of the words in the array
  var removeUselessWords = function(txt) {

	var uselessWordsArray = 
        [
          "a", "at", "be", "can", "cant", "could", "couldnt", 
          "do", "does", "how", "i", "in", "is", "many", "much", "of", 
          "on", "or", "should", "shouldnt", "so", "such", "the", 
          "them", "they", "to", "us",  "we", "what", "who", "why", 
          "with", "wont", "would", "wouldnt", "you"
        ];
			
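	var expStr = uselessWordsArray.join("\\b|\\b");
	// Wrap the joined pattern in \b so the first and last words are also matched as whole words only
	return txt.replace(new RegExp('\\b' + expStr + '\\b', 'gi'), '').trim().replace(/ +/g, ' ');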
  }

  var str = "The person is going on a walk in the park. The person told us to do what we need to do in the park";
  
  console.log(removeUselessWords(str));

//The result should be: "person going walk park. person told need park."
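
One detail worth checking with the `join("\\b|\\b")` approach: the separator only places boundaries between items, so the first word has no leading `\b` and the last has no trailing one unless they are added around the joined string. A minimal sketch with a hypothetical two-word list:

```javascript
// Hypothetical two-word list, for illustration only.
var edgeWords = ["a", "you"];
var joined = edgeWords.join("\\b|\\b"); // pattern text: a\b|\byou

// Without outer boundaries, "a" matches at the end of any word and
// "you" matches at the start of any word.
var loosePattern = new RegExp(joined, "gi");
var looseResult = "pizza young".replace(loosePattern, "");
console.log(looseResult); // "pizz ng"

// Adding \b on both sides anchors the first and last alternatives too.
var strictPattern = new RegExp("\\b" + joined + "\\b", "gi");
var strictResult = "pizza young".replace(strictPattern, "");
console.log(strictResult); // "pizza young"
```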

【Comments】:
