【Posted】:2015-08-14 22:30:32
【Question】:
I am having trouble reading non-ASCII characters from a text file through an OleDbConnection. Any ideas?
This is the test method I use to reproduce the problem:
[TestMethod]
public void TestMethod1()
{
    var arquivo = new FileInfo(@"P:\import.txt");
    string connectionString = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=\"{0}\\\";Extended Properties=\"Text;IMEX=1;FMT=Delimited\"", arquivo.DirectoryName);
    var conexaoFonteDados = new OleDbConnection(connectionString);
    conexaoFonteDados.Open();
    string instrucaoSql = "SELECT * FROM [" + arquivo.Name + "]";
    var com = new OleDbCommand(instrucaoSql, conexaoFonteDados);
    if (com.Connection.State != ConnectionState.Open)
    {
        com.Connection.Open();
    }
    var drDadosImportacao = com.ExecuteReader(CommandBehavior.CloseConnection);
    while (drDadosImportacao != null && drDadosImportacao.Read())
    {
        object valorImportado = drDadosImportacao["Column"];
        Console.WriteLine(valorImportado);
    }
}
This is the content of import.txt:
Column
a
b
ç
á
This is the console output:
a
b
?
?
Solution
As described here, you can use a method similar to the following to convert the string to the correct encoding:
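using System.Linq;
using System.Text;

public static class MyStringExtensions
{
    private static readonly Encoding Iso = Encoding.GetEncoding("ISO-8859-1");

    // Recovers the raw bytes of a string that was decoded as ISO-8859-1 and,
    // if the UTF-8 lead byte 0xC3 (195) is present, decodes them again as UTF-8.
    public static string RepairUtf8(this string value)
    {
        byte[] bytes = Iso.GetBytes(value);
        return bytes.Any(b => b == 0xC3) ? Encoding.UTF8.GetString(bytes) : value;
    }
}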
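To see why this repair can work, here is a minimal, self-contained sketch. It rests on an assumption the thread does not confirm: that import.txt is saved as UTF-8 and the provider decodes it with a single-byte code page. The names RepairUtf8Demo, MisdecodeAsLatin1 and Repair are made up for illustration; Repair repeats the logic of RepairUtf8 above so that the sketch compiles on its own.

```csharp
using System;
using System.Linq;
using System.Text;

public static class RepairUtf8Demo
{
    // ISO-8859-1 maps each byte 0x00-0xFF to the code point with the same value,
    // so GetBytes gives back the raw bytes of a string that was decoded with it.
    private static readonly Encoding Iso = Encoding.GetEncoding("ISO-8859-1");

    // Simulates the assumed fault: UTF-8 bytes decoded with a single-byte code page.
    public static string MisdecodeAsLatin1(string original)
    {
        return Iso.GetString(Encoding.UTF8.GetBytes(original));
    }

    // Same logic as RepairUtf8: decode again as UTF-8 only when byte 0xC3 is present.
    public static string Repair(string value)
    {
        byte[] bytes = Iso.GetBytes(value);
        return bytes.Any(b => b == 0xC3) ? Encoding.UTF8.GetString(bytes) : value;
    }

    public static void Main()
    {
        // The data rows of import.txt from the question.
        foreach (string original in new[] { "a", "b", "ç", "á" })
        {
            string garbled = MisdecodeAsLatin1(original);
            string repaired = Repair(garbled);
            Console.WriteLine("{0} -> {1} -> {2}", original, garbled, repaired);
        }
    }
}
```

The check for byte 0xC3 only covers characters in the range U+00C0 to U+00FF, which is enough for "ç" and "á" but not for characters whose UTF-8 form starts with a different lead byte. It can also misfire on text that was decoded correctly and legitimately contains "Ã" (U+00C3), so treat it as a workaround rather than a general fix.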
【Discussion】: