[Posted]: 2020-01-21 05:25:38
[Question]:
string emailBody = "sample text for NewFinancial History:\"xyz\" text NewFinancial History:\"abc\" NewEBTDI$:\"abc\" ds \"NewFinancial History:pqr\" test";

private Dictionary<string, List<string>> ExtractFieldValuesForDynamicListObject(string emailBody)
{
    Dictionary<string, List<string>> paramValueList = new Dictionary<string, List<string>>();
    try
    {
        emailBody = ReplaceIncompatableQuotes(emailBody);
        emailBody = string.Join(" ", Regex.Split(emailBody.Trim(), @"(?:\r\n|\n|\r)"));
        var keys = Regex.Matches(emailBody, @"\bNew\B(.+?):", RegexOptions.Singleline).OfType<Match>().Select(m => m.Groups[0].Value.Replace(":", "")).Distinct().ToArray();
        foreach (string key in keys)
        {
            List<string> valueList = new List<string>();
            string regex = "" + Regex.Escape(key) + ":" + "\"(?<" + Regex.Escape(GetCleanKey(key)) + ">[^\"]*)\"";
            var matches = Regex.Matches(emailBody, regex, RegexOptions.Singleline);
            foreach (Match match in matches)
            {
                if (match.Success)
                {
                    string value = match.Groups[Regex.Escape(GetCleanKey(key))].Value;
                    if (!valueList.Contains(value.Trim()))
                    {
                        valueList.Add(value.Trim());
                    }
                }
            }
            valueList = valueList.Distinct().ToList();
            string listName = key.Replace("New", "");
            paramValueList.Add(listName.Trim(), valueList);
        }
    }
    catch (Exception ex)
    {
        DCULSLogger.LogError(ex);
    }
    return paramValueList;
}
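My goal is to scan an email body and identify strings that follow the NewListName:"Value" convention. The regex and method above work fine for that. My client has now changed the convention from NewListName:"Value" to "NewListName:Value", so the whole entry, including the New keyword and the colon, now sits between the double quotes. That means I need to look for an opening quote immediately followed by New ("New) and then the closing quote. Can anyone help me modify the regex above so that it scans the email body and returns a list of all such values between double quotes? In the example above, I want \"NewFinancial History:pqr\" to appear in my result. Any help would be appreciated.
[Comments]: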
-
Try
var keys = Regex.Matches(emailBody, @"""New[^"":]+:[^""]+""", RegexOptions.Singleline).OfType<Match>().Select(m => m.Value).Distinct().ToArray();
-
That worked. Thank you!!! Could you explain the regex?
-
Great, please check the answer below; I posted an explanation there.
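
For readers following the thread, here is a minimal sketch of how the pattern from the first comment might be used to fill the same Dictionary<string, List<string>> shape as the method in the question. The pattern reads as: an opening double quote, the literal New, one or more characters that are neither a quote nor a colon (the list name), a colon, one or more characters that are not a quote (the value), and a closing double quote. The named groups key and value, the class name QuotedFieldExtractor, and the method name Extract are assumptions added for illustration; they are not part of the original comment or the asker's code, and the question's ReplaceIncompatableQuotes and logging steps are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not the asker's class: extracts "New<key>:<value>" entries.
public static class QuotedFieldExtractor
{
    // Same pattern as in the comment, with named groups added (an assumption):
    //   "            opening double quote
    //   New          literal keyword
    //   [^":]+       list name: no quotes, no colons
    //   :            separator
    //   [^"]+        value: everything up to the closing quote
    //   "            closing double quote
    private static readonly Regex QuotedField = new Regex(
        @"""New(?<key>[^"":]+):(?<value>[^""]+)""",
        RegexOptions.Singleline);

    public static Dictionary<string, List<string>> Extract(string emailBody)
    {
        var result = new Dictionary<string, List<string>>();
        foreach (Match match in QuotedField.Matches(emailBody))
        {
            string key = match.Groups["key"].Value.Trim();
            string value = match.Groups["value"].Value.Trim();

            // Create the list for a key the first time that key is seen.
            if (!result.TryGetValue(key, out List<string> values))
            {
                values = new List<string>();
                result.Add(key, values);
            }

            // Keep each value only once per key.
            if (!values.Contains(value))
            {
                values.Add(value);
            }
        }
        return result;
    }
}
```

Called with the sample emailBody from the question, Extract should return a single entry, Financial History with the one value pqr, because only "NewFinancial History:pqr" has the New keyword directly after an opening quote. The old-style entries such as NewFinancial History:"xyz" are skipped, since their quotes wrap only the value.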