【Posted】: 2020-08-02 05:25:57
【Question】:
I have an .XML file (a log produced by my program) containing the following text:
<?xml version="1.0" encoding="utf-8"?>
<PsnRecords>
<PsnRecord>
<Names></Names>
<PsnUrl>http://gs2.ww.prod.dl.playstation.net/gs2/ppkgo/prod/CUSA05330_00/108/f_acb1a312a982305e284718898b3dade6afb395e6718d836b1d7b1e1aa1873800/f/EP0953-CUSA05330_00-BRAWLHALLAEUROPE-A0403-V0100-DP.pkg</PsnUrl>
<LocalUrl>C:\Users\Betrisa\Desktop\Shared\EP0953-CUSA05330_00-BRAWLHALLAEUROPE-A0403-V0100-DP.pkg</LocalUrl>
<isLixian>false</isLixian>
<LixianUrl></LixianUrl>
</PsnRecord>
<PsnRecord>
<Names></Names>
<PsnUrl>http://gs2.ww.prod.dl.playstation.net/gs2/ppkgo/prod/CUSA05330_00/108/f_acb1a312a982305e284718898b3dade6afb395e6718d836b1d7b1e1aa1873800/f/EP0953-CUSA05330_00-BRAWLHALLAEUROPE-A0403-V0100.pkg?downloadId=0000015b&du=000000000000015b00e26bd28904ee7f&product=0187&serverIpAddr=192.168.137.1&r=00000000</PsnUrl>
<LocalUrl></LocalUrl>
<isLixian>false</isLixian>
<LixianUrl></LixianUrl>
</PsnRecord>
<PsnRecord>
<Names></Names>
<PsnUrl>http://ic.97f46e00.060798.gs2.sonycoment.loris-e.llnwd.net/gs2/ppkgo/prod/CUSA05330_00/108/f_acb1a312a982305e284718898b3dade6afb395e6718d836b1d7b1e1aa1873800/f/EP0953-CUSA05330_00-BRAWLHALLAEUROPE-A0403-V0100.pkg?downloadId=0000015b&du=000000000000015b00e26bd28904ee7f&product=0187&serverIpAddr=192.168.137.1&r=00000001</PsnUrl>
<LocalUrl></LocalUrl>
<isLixian>false</isLixian>
<LixianUrl></LixianUrl>
</PsnRecord>
</PsnRecords>
I want to extract all the URLs and save them to a .TXT file. I tried two approaches, but neither worked:
Approach 1: using Split (result: Url)
private void button1_Click(object sender, EventArgs e)
{
    string paths = Application.StartupPath + @"\DataFiles\DataHistory.xml";
    string resPaths = Application.StartupPath + @"\DataFiles\Links.txt";
    StreamWriter urlsWrite = File.CreateText(resPaths);
    var text = System.IO.File.ReadAllText(paths);
    var links = text.Split("\t\n ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries).Where(s => s.StartsWith("<PsnUrl>http://") || s.StartsWith("<PsnUrl>https://"));
    foreach (string s in links)
    {
        urlsWrite.WriteLine(s);
    }
}
Approach 2: using a regular expression (result: nothing at all!!)
private void button1_Click(object sender, EventArgs e)
{
    string paths = Application.StartupPath + @"\DataFiles\DataHistory.xml";
    string resPaths = Application.StartupPath + @"\DataFiles\Links.txt";
    StreamWriter urlsWrite = File.CreateText(resPaths);
    var text = System.IO.File.ReadAllText(paths);
    var regex = new Regex(@"\b(?:http?://|www\.)\S+\b", RegexOptions.Compiled | RegexOptions.IgnoreCase);
    MatchCollection matches = regex.Matches(text);
    foreach (Match match in matches)
    {
        text = text.Replace(match.Value, "<PsnUrl>" + match.Value + "</PsnUrl>");
        urlsWrite.WriteLine(match.Value);
    }
}
I want a .TXT file containing clean URLs, for example:
https://xxxxxxxxxxxxxx
http://xxxxxxxxxxxxxx
https://xxxxxxxxxxxxxx
https://xxxxxxxxxxxxxx
https://xxxxxxxxxxxxxx
https://xxxxxxxxxxxxxx
What am I doing wrong?
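A minimal sketch of how the regex attempt could be repaired, offered as one possible reading of what went wrong rather than a definitive fix. Three details in the attempts above appear to matter: the Split filter only checks how each token starts and then writes the whole token, tags included; in `http?://` the `?` applies to the `p` rather than to an optional `s`, so the scheme part of the pattern does not match `https://`; and the `StreamWriter` is never closed, so buffered output may never reach the file. The class name `PsnUrlRegexSketch` and its method names are hypothetical, and the pattern assumes each URL sits directly inside a `<PsnUrl>` element, as in the sample log.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper; the name is not from the original post.
public static class PsnUrlRegexSketch
{
    // Captures the text between <PsnUrl> and </PsnUrl>.
    // "https?" makes the "s" optional; the lazy ".*?" stops at the first closing tag.
    private static readonly Regex PsnUrlPattern = new Regex(
        @"<PsnUrl>\s*(https?://.*?)\s*</PsnUrl>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static List<string> ExtractUrls(string text)
    {
        var urls = new List<string>();
        foreach (Match m in PsnUrlPattern.Matches(text))
        {
            // Group 1 is the URL without the surrounding tags.
            // HtmlDecode turns entities such as &amp; back into plain characters.
            urls.Add(WebUtility.HtmlDecode(m.Groups[1].Value));
        }
        return urls;
    }

    public static void SaveUrls(string xmlPath, string txtPath)
    {
        // WriteAllLines opens, writes, and closes the file, so nothing stays buffered.
        File.WriteAllLines(txtPath, ExtractUrls(File.ReadAllText(xmlPath)));
    }
}
```

Because this works on the raw text, it also copes with a log whose query strings contain a bare `&`, which a strict XML parser would reject. Empty `<PsnUrl></PsnUrl>` elements are skipped because the pattern requires a scheme.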
【Comments】:
-
Use a proper method to parse the XML. Take a look here to get started.
-
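When you asked this question earlier today, I suggested you look into XPath. As others have suggested, treat XML as XML. It was designed to be easy for an XML parser to parse.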
-
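@Flydog57 I'm new to this site! A moderator closed my post because of the rules! So thank you and everyone else for the help. You're right: parsing the XML is the easiest way.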
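The suggestion in the comments to treat the XML as XML could look roughly like the sketch below, which uses LINQ to XML. It assumes the log is well-formed XML, meaning any `&` inside a URL is stored as `&amp;`; if the file contains a bare `&`, `XDocument.Load` throws an `XmlException`. The class name `PsnUrlExtractor` and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper; the name is not from the original post.
public static class PsnUrlExtractor
{
    // Returns the text of every non-empty <PsnUrl> element, in document order.
    public static List<string> ExtractUrls(XDocument doc)
    {
        return doc.Descendants("PsnUrl")
                  .Select(e => e.Value.Trim())   // Value is already entity-decoded
                  .Where(url => url.Length > 0)  // skip empty <PsnUrl></PsnUrl>
                  .ToList();
    }

    public static void SaveUrls(string xmlPath, string txtPath)
    {
        XDocument doc = XDocument.Load(xmlPath);
        // WriteAllLines opens, writes, and closes the file in one call.
        File.WriteAllLines(txtPath, ExtractUrls(doc));
    }
}
```

From the button handler this might be called as `PsnUrlExtractor.SaveUrls(Path.Combine(Application.StartupPath, "DataFiles", "DataHistory.xml"), Path.Combine(Application.StartupPath, "DataFiles", "Links.txt"))`. For the XPath route mentioned above, `doc.XPathSelectElements("//PsnUrl")` from the `System.Xml.XPath` namespace selects the same elements.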
Tags: c#