【Question title】:Convert DataTable columns to typed IEnumerable[]
【Posted】:2016-11-29 23:03:27
【Question】:

How can I convert the columns of a DataTable into the IEnumerable[] that R.NET needs in order to create a data frame?

I have the following code:

DataTable dt = CreateDateTable();
REngine e = REngine.GetInstance();                       
IEnumerable[] columns = new IEnumerable[dt.Columns.Count];                
string[] columnNames = dt.Columns.Cast<DataColumn>()
                       .Select(x => x.ColumnName)
                       .ToArray();

for(int i=0; i<dt.Columns.Count; i++)
    //This is the place where I am stuck. How to convert column to base type array instead of object array
    columns[i] = dt.Rows.Cast<DataRow>().Select(row => row[i]).ToArray();

DataFrame df = e.CreateDataFrame(columns: columns, 
 columnNames: columnNames, 
 stringsAsFactors: false);

I get the following exception:

Test 'XXX.ReadResultsTest' failed: System.NotSupportedException : Cannot convert type System.Object[] to an R vector
       w RDotNet.REngineExtension.ToVector(REngine engine, IEnumerable values)
       w System.Array.ConvertAll[TInput,TOutput](TInput[] array, Converter`2 converter)
       w RDotNet.REngineExtension.CreateDataFrame(REngine engine, IEnumerable[] columns, String[] columnNames, String[] rowNames, Boolean checkRows, Boolean checkNames, Boolean stringsAsFactors)

The DataTable has columns of different types, and I don't know in advance what those types are, so I can't do what this double-only example does:

for (int i = 0; i < dt.Columns.Count; i++)
        columns[i] = dt.Rows.Cast<DataRow>().Select(row => row.Field<double>(i)).ToArray();

Update

So far I have an ugly solution. Can it be done better?

for (int i = 0; i < dt.Columns.Count; i++)
{
    switch (Type.GetTypeCode(dt.Columns[i].DataType))
    {
        case TypeCode.String:
            columns[i] = dt.Rows.Cast<DataRow>().Select(row => row.Field<string>(i)).ToArray();
            break;

        case TypeCode.Double:
            columns[i] = dt.Rows.Cast<DataRow>().Select(row => row.Field<double>(i)).ToArray();
            break;

        case TypeCode.Int32:
            columns[i] = dt.Rows.Cast<DataRow>().Select(row => row.Field<int>(i)).ToArray();
            break;

        case TypeCode.Int64:
            columns[i] = dt.Rows.Cast<DataRow>().Select(row => row.Field<long>(i)).ToArray();
            break;

        default:
            //columns[i] = dt.Rows.Cast<DataRow>().Select(row => row[i]).ToArray();
            throw new InvalidOperationException(String.Format("Type {0} is not supported", dt.Columns[i].DataType.Name));
    }                
}

【Question comments】:

  • It's ugly, but it works. +1 just for having something that works!

Tags: c# datatable casting ienumerable r.net


【Solution 1】:
public DataFrame DataTableToDataFrame(string name, DataTable dt)
{
    DataFrame dataFrame = null;

    IEnumerable[] columns = new IEnumerable[dt.Columns.Count];
    string[] columnNames = dt.Columns.Cast<DataColumn>()
                           .Select(x => x.ColumnName)
                           .ToArray();

    for (int i = 0; i < dt.Columns.Count; i++)
    {
        switch (Type.GetTypeCode(dt.Columns[i].DataType))
        {
            case TypeCode.String:
                columns[i] = dt.Rows.Cast<DataRow>().Select(row => row.Field<string>(i)).ToArray();
                break;

            case TypeCode.Double:
                columns[i] = dt.Rows.Cast<DataRow>().Select(row => row.Field<double>(i)).ToArray();
                break;

            case TypeCode.Int32:
                columns[i] = dt.Rows.Cast<DataRow>().Select(row => row.Field<int>(i)).ToArray();
                break;

            case TypeCode.Int64:
            case TypeCode.Decimal:
                IEnumerable array = dt.Rows.Cast<DataRow>().Select(row => row.Field<object>(i)).ToArray();

                //columns[i] = dt.Rows.Cast<DataRow>().Select(row => row.Field<long>(i)).ToArray();
                //columns[i] = dt.Rows.Cast<DataRow>().Select(row => row.Field<decimal>(i)).ToArray();

                columns[i] = ListToIenumerable(array);
                break;

            default:
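                // Caution: this yields object[], which R.NET rejects with the
                // NotSupportedException shown in the question.
                columns[i] = dt.Rows.Cast<DataRow>().Select(row => row[i]).ToArray();
                //throw new InvalidOperationException(String.Format("Type {0} is not supported", dt.Columns[i].DataType.Name));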
                break;
        }
    }

    dataFrame = REngine.CreateDataFrame(columns: columns, columnNames: columnNames, stringsAsFactors: false);
    REngine.SetSymbol(name, dataFrame);

    return dataFrame;
}

Something like this?
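The per-type switch above can also be replaced with a reflection-based loop. The following is a minimal sketch, not part of R.NET: `ToTypedColumns` is a hypothetical helper name, and leaving DBNull cells at the element type's default value is an assumption you may want to change. It relies on `Array.CreateInstance` to build an array whose runtime type matches the column's `DataType` (for example `double[]` rather than `object[]`). R.NET may still reject element types it has no vector mapping for, as the separate Int64/Decimal branch above suggests.

```csharp
using System;
using System.Collections;
using System.Data;

public static class DataTableColumnExtensions
{
    // Hypothetical helper (not part of R.NET): copies each column into an array
    // whose element type is the column's DataType, e.g. double[] or string[].
    public static IEnumerable[] ToTypedColumns(this DataTable dt)
    {
        var columns = new IEnumerable[dt.Columns.Count];

        for (int c = 0; c < dt.Columns.Count; c++)
        {
            Type type = dt.Columns[c].DataType;

            // Array.CreateInstance builds a T[] when T is only known at run time.
            Array values = Array.CreateInstance(type, dt.Rows.Count);

            for (int r = 0; r < dt.Rows.Count; r++)
            {
                object cell = dt.Rows[r][c];

                // DBNull cannot be stored in a typed array, so the slot keeps its
                // default (0, null, ...). This handling is an assumption.
                if (!(cell is DBNull))
                {
                    values.SetValue(cell, r);
                }
            }

            columns[c] = values;
        }

        return columns;
    }
}
```

With this in place, the loop in the method above would reduce to `IEnumerable[] columns = dt.ToTypedColumns();`, with any unsupported element types converted afterwards.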

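【Comments】:

【Solution 2】:

Here is how to use dynamic typing and an extension method to put the rows of a DataTable into a List (which implements IEnumerable) of objects that mirror the table's column structure: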

class Program
{
    static void Main()
    {
        var dt = new DataTable();
        //populate dt...

        List<dynamic> dataTableList= dt.DataTableToList();
    }
}

public static class DataTableExtensions
{
    public static List<dynamic> DataTableToList(this DataTable dt)
    {
        var list= new List<dynamic>();
        foreach (DataRow row in dt.Rows)
        {
            dynamic d = new ExpandoObject();
            list.Add(d);
            foreach (DataColumn column in dt.Columns)
            {
                var dic = (IDictionary<string, object>)d;
                dic[column.ColumnName] = row[column];
            }
        }

        return list;
    }
}

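【Comments】:

  • Thanks for the answer, but this solution only works when the structure of the DataTable is known. I need a generic solution that does not know the DataTable's structure.
  • Ah, I understand now. I have edited my answer to show how to achieve this.
  • Note that a list of dynamic objects is not the same as an IEnumerable[], i.e. an array of arrays, which is what RDotNet needs.
【Solution 3】: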
public IEnumerable<int> ListToIenumerable(IEnumerable enumerable)
{
    List<int> list = new List<int>();

    foreach (object obj in enumerable)
    {
        list.Add(Convert.ToInt32(obj.ToString()));
    }

    IEnumerable<int> returnValue = list.ToArray();

    return returnValue;
}
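One caveat with the helper above: `Convert.ToInt32(obj.ToString())` throws an `OverflowException` for Int64 values outside the Int32 range and a `FormatException` for Decimal values with a fractional part such as 1.5. A possible alternative, sketched below, widens every value to `double` instead. `ToDoubleArray` is a hypothetical name, and mapping null or DBNull to `NaN` is an assumption. Very large Int64 values and high-precision Decimal values can lose precision when converted to `double`.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Globalization;

public static class NumericColumnConverter
{
    // Hypothetical alternative to ListToIenumerable: widens each value to double.
    public static double[] ToDoubleArray(IEnumerable values)
    {
        var list = new List<double>();

        foreach (object value in values)
        {
            // Treat null and DBNull as NaN; this mapping is an assumption.
            if (value == null || value is DBNull)
            {
                list.Add(double.NaN);
            }
            else
            {
                // Converts the boxed number directly, without a string round trip.
                list.Add(Convert.ToDouble(value, CultureInfo.InvariantCulture));
            }
        }

        return list.ToArray();
    }
}
```

In the Int64/Decimal branch of Solution 1, this would be called as `columns[i] = NumericColumnConverter.ToDoubleArray(array);`.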

【Comments】:
