C# parsing special characters with HttpWebResponse

【Question title】: C# parsing special characters with HttpWebResponse
【Posted】: 2013-05-23 22:36:57
【Question】:

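I am parsing an HTML site to use its data in a C# client. Unfortunately, my HTTP response mangles all special characters (such as those in French names) and replaces them with question marks "?". What can I do to fix this?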

Here is my code:

private void LoadData()
{
    String strBaseURL = @"http://here_goes_the_url.com/";
    StringBuilder sb = new StringBuilder();
    byte[] buf = new byte[8192];
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(strBaseURL);
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    Stream resStream = response.GetResponseStream();

    string tempString = null;
    int count = 0;

    do
    {
        count = resStream.Read(buf, 0, buf.Length);
        if (count != 0)
        {
            tempString = Encoding.ASCII.GetString(buf, 0, count);
            sb.Append(tempString);
        }
    }
    while (count > 0);
    result = sb.ToString();
}

I tried changing the encoding, but it made no difference :(

Thanks!

【Question comments】:

Tags: c# html parsing special-characters

【Solution 1】:

You have to use something other than ASCII.

Try this, for example:

tempString = Encoding.UTF8.GetString(buf, 0, count);

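The reason is that ASCII is a 7-bit encoding that covers only 128 characters (code points 0 to 127), whereas UTF-8 can encode every character in Unicode. Any byte above 127 has no ASCII meaning, so the decoder substitutes "?". Note that UTF-8 is the right choice only when the page is actually served as UTF-8, which is common but not guaranteed.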
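One caveat with decoding each buffer separately: a multi-byte UTF-8 character can straddle two reads, and `Encoding.UTF8.GetString` on each half then yields replacement characters. A minimal sketch of one way around this, wrapping the stream in a `StreamReader`, is shown below. The class name `ResponseDecoding` and the helper `ReadAllText` are hypothetical, and a `MemoryStream` stands in for the response stream so the example runs without a network call.

```csharp
using System;
using System.IO;
using System.Text;

public static class ResponseDecoding
{
    // Hypothetical helper: reads the whole stream as text in the given encoding.
    // StreamReader keeps decoder state between internal reads, so a multi-byte
    // character split across two buffer fills is still decoded as one character.
    public static string ReadAllText(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // "Zoé" as UTF-8 bytes: 'é' (U+00E9) is encoded as two bytes, 0xC3 0xA9.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes("Zo\u00E9");

        // Decoding as ASCII replaces each byte above 0x7F with '?'.
        string asAscii = Encoding.ASCII.GetString(utf8Bytes);

        // Decoding as UTF-8 through a StreamReader recovers the original text.
        string asUtf8 = ReadAllText(new MemoryStream(utf8Bytes), Encoding.UTF8);

        Console.WriteLine(asAscii.Contains("?"));   // ASCII decoding lost the accent
        Console.WriteLine(asUtf8 == "Zo\u00E9");    // UTF-8 decoding preserved it
    }
}
```

In the question's code, the same helper could be called as `ReadAllText(response.GetResponseStream(), Encoding.UTF8)` in place of the manual read loop, assuming the page is UTF-8.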

【Comments】:

It worked! Thank you very much. I tried Unicode and UTF32 first, but those did not help; with UTF8 I can see all the characters :)
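The comment above fits a page served as UTF-8: `Encoding.Unicode` (UTF-16) and `Encoding.UTF32` read the bytes in 2-byte and 4-byte units, so UTF-8 data decodes to garbage under either. Rather than guessing, the charset the server declares can be used when it is present. The sketch below is one possible approach, not part of the original answer; `CharsetResolver` and `Resolve` are hypothetical names, and the UTF-8 fallback is an assumption. `HttpWebResponse.CharacterSet` is the property that would supply the charset name in the question's code.

```csharp
using System;
using System.Text;

public static class CharsetResolver
{
    // Hypothetical helper: maps a charset name (for example the value of
    // HttpWebResponse.CharacterSet) to an Encoding. Falls back to UTF-8
    // when the name is missing or not recognized (an assumed default).
    public static Encoding Resolve(string charset)
    {
        if (string.IsNullOrWhiteSpace(charset))
        {
            return Encoding.UTF8;
        }

        try
        {
            return Encoding.GetEncoding(charset.Trim());
        }
        catch (ArgumentException)
        {
            // Encoding.GetEncoding throws ArgumentException for unknown names.
            return Encoding.UTF8;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Resolve("utf-8").WebName);
        Console.WriteLine(Resolve("iso-8859-1").WebName);
        Console.WriteLine(Resolve("no-such-charset").WebName);
        Console.WriteLine(Resolve(null).WebName);
    }
}
```

With this in place, the decoding line in the question could use `CharsetResolver.Resolve(response.CharacterSet)` instead of a hard-coded encoding. Pages that declare their charset only in an HTML `meta` tag would still need separate handling.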