【问题标题】:Convert nested JSON to CSV将嵌套 JSON 转换为 CSV
【发布时间】:2015-10-06 03:02:51
【问题描述】:

我正在将一个超过 10 个级别的嵌套 JSON 对象转换为 C# .NET 中的 CSV 文件。

我一直在使用JavaScriptSerializer().Deserialize<ObjectA>(json)XmlNode xml = (XmlDocument)JsonConvert.DeserializeXmlNode(json) 来分解对象。有了这些对象,我可以进一步写入 CSV 文件。但是现在 JSON 对象进一步扩展。大多数数据并没有真正使用,所以我更喜欢原始数据转储。

我可以在不声明结构的情况下将数据转储为 csv 格式吗?

示例 JSON

{
"F1":1,
"F2":2,
"F3":[
    {
        "E1":3,
        "E2":4
    },
    {
        "E1":5,
        "E2":6
    },  
    {
        "E1":7,
        "E2":8,
        "E3":[
            {
                "D1":9,
                "D2":10
            }
        ]
    },      
]
}   

我预期的 CSV 输出是

F1,F2,E1,E2,D1,D2
1,2
1,2,3,4
1,2,5,6
1,2,7,8,9,10

【问题讨论】:

  • 你能不能把数据转换成数据表..然后从那里转换成CSV
  • @MethodMan 这正是我在处理前几个级别时所做的。毕竟我发现把它们放到结构中太费时间了,所以我正在检查是否有任何自动化。
  • 数据的来源是哪里......它是否存储在数据库中......?
  • @MethodMan 来自HttpContext,我已经把它变成了字符串。
  • 我已经尝试过,在这里查看:dotnetfiddle.net/7MBHUc 它不起作用,但我能够遍历。将尽可能下次更新。

标签: c# json csv


【解决方案1】:

您的请求存在不一致之处:您希望为具有子对象的根对象生成一行,但不希望为也具有子对象的"F3[2]" 对象生成一行。所以听起来你的规则是,“为具有至少一个原始值属性的对象打印一行,只要该对象是根对象或没有具有至少一个原始值属性的后代对象” .这有点棘手,但可以通过LINQ to JSON 完成

        var obj = JObject.Parse(json);

        // Collect column titles: all property names whose values are of type JValue, distinct, in order of encountering them.
        var values = obj.DescendantsAndSelf()
            .OfType<JProperty>()
            .Where(p => p.Value is JValue)
            .GroupBy(p => p.Name)
            .ToList();

        var columns = values.Select(g => g.Key).ToArray();

        // Filter JObjects that have child objects that have values.
        var parentsWithChildren = values.SelectMany(g => g).SelectMany(v => v.AncestorsAndSelf().OfType<JObject>().Skip(1)).ToHashSet();

        // Collect all data rows: for every object, go through the column titles and get the value of that property in the closest ancestor or self that has a value of that name.
        var rows = obj
            .DescendantsAndSelf()
            .OfType<JObject>()
            .Where(o => o.PropertyValues().OfType<JValue>().Any())
            .Where(o => o == obj || !parentsWithChildren.Contains(o)) // Show a row for the root object + objects that have no children.
            .Select(o => columns.Select(c => o.AncestorsAndSelf()
                .OfType<JObject>()
                .Select(parent => parent[c])
                .Where(v => v is JValue)
                .Select(v => (string)v)
                .FirstOrDefault())
                .Reverse() // Trim trailing nulls
                .SkipWhile(s => s == null)
                .Reverse());

        // Convert to CSV
        var csvRows = new[] { columns }.Concat(rows).Select(r => string.Join(",", r));
        var csv = string.Join("\n", csvRows);

        Console.WriteLine(csv);

使用

public static class EnumerableExtensions
{
    // http://stackoverflow.com/questions/3471899/how-to-convert-linq-results-to-hashset-or-hashedset
    public static HashSet<T> ToHashSet<T>(this IEnumerable<T> source)
    {
        return new HashSet<T>(source);
    }
}

哪些输出:

F1,F2,E1,E2,D1,D2
1,2
1,2,3,4
1,2,5,6
1,2,7,8,9,10

【讨论】:

  • 很遗憾这没有被标记为解决方案或至少没有评论。东西....如果我最终尝试一下,我会报告。
  • 出色的解决方案。
【解决方案2】:

我写了这个,它对我有用 在这里,我们将对象树的所有面包屑保存在标题中,格式为 prop_prop 并将 jarray 属性对象以 prop1 格式保存在标头中

    public Dictionary<string, string> ComplexJsonToDictionary(JObject jObject, Dictionary<string, string> result, string field)
    {
        foreach (var property in jObject.Properties())
        {
            var endField = field + (string.IsNullOrEmpty(field) ? "" : "_") + property.Name;

            var innerDictionary = new Dictionary<string, string>();
            try
            {
                var innerValue = JObject.Parse(Convert.ToString(property.Value));


                result.AddOrOverride(ComplexJsonToDictionary(innerValue, innerDictionary, endField));
            }
            catch (Exception)
            {
                try
                {
                    var innerValues = JArray.Parse(Convert.ToString(property.Value));
                    try
                    {
                        var i = 0;
                        foreach (var token in innerValues)
                        {
                            var innerValue = JObject.Parse(Convert.ToString(token));

                            result.AddOrOverride(ComplexJsonToDictionary(innerValue, innerDictionary, endField+i++));
                        }
                    }
                    catch (Exception)
                    {
                        result.Add(endField, string.Join(",", innerValues.Values<string>()));
                    }
                }
                catch (Exception)
                {
                    result.Add(endField, property.Value.ToString());
                }
            }
        }
        return result;
    }

感谢您的关注,如果合适,请写评论。

【讨论】:

    【解决方案3】:

    将 json 解析的 jobject 传递给静态扩展,然后它将返回带有 flatten 的 jobjects 数组,然后转换为 csv。部分源码取自其他栈溢出等资源,我没有参考。

     public static IEnumerable<string> JsonToCsvRowsWithHierarchyHeaders(string jsonData)
        {
            if (jsonData.Trim().StartsWith("[") && jsonData.Trim().EndsWith("]"))
            {
                var startString = "{\"appendRoot\":";
                var endString = "}";
                jsonData = $"{startString}{jsonData}{endString}";
            }
    
            var jObject = JsonConvert.DeserializeObject<JObject>(jsonData);
            var flattenJObjects = JsonParserExtensions.FlattenJsonGetJObjects(jObject).ToList();
      
            var csvRows = new List<string>();
            if (flattenJObjects.Any())
            {
                var firstRow = flattenJObjects.First().Children<JProperty>().Select(x => x.Name).ToList();
                var header = string.Join(delimeter, firstRow).Replace("appendRoot_", "").ToLower();
                csvRows.Add(header);
                foreach (var flattenJObject in flattenJObjects)
                {
                    var tokens = flattenJObject.Children<JProperty>();
    
                    if (firstRow.Count() != tokens.Count())
                    {
                        var missMatchPropertyValues = firstRow.Select(cell => tokens.FirstOrDefault(x => x.Name == cell))
                            .Select(value => value == null
                                ? string.Empty
                                : CheckAndUpdateRowCellValueTextQualifier(JsonConvert.DeserializeObject<string>(value.Value.ToString(Newtonsoft.Json.Formatting.None))))
                            .ToList();
                        var rowString = string.Join(delimeter, missMatchPropertyValues);
                        csvRows.Add(rowString);
                    }
                    else
                    {
                        var rowValue = tokens.Select(token =>
                            CheckAndUpdateRowCellValueTextQualifier(
                                JsonConvert.DeserializeObject<string>(
                                    token.Value.ToString(Newtonsoft.Json.Formatting.None))));
                        var rowString = string.Join(delimeter, rowValue);
                        csvRows.Add(rowString);
                    }
                }
            }
            return csvRows;
        }
    
     
        private static string CheckAndUpdateRowCellValueTextQualifier(string value)
        {
            const string q = @"""";
            if (!string.IsNullOrEmpty(value) && value.Contains('\"'))
            {
                return value;
            }
    
            if (!string.IsNullOrEmpty(value) && (value.Contains(',') ||
                                             value.Contains('"') ||
                                             value.Contains('\n') || value.Contains('\r')))
            {
                return $"{q}{value}{q}";
            }
            
            return value;
        }
    }
    
    public static class JsonParserExtensions
    {
        public static IEnumerable<JObject> FlattenJsonGetJObjects(JObject jObject, string parentName = null)
        {
            if (!(parentName is null))
                jObject = RenamePropertiesByHierarchyName(jObject, parentName);
    
            var fields = jObject.Properties().Where(p => p.Value.GetType().Name == "JValue").ToList();
            var objects = jObject.Properties().Where(p => p.Value.GetType().Name == "JObject").ToList();
            var arrays = jObject.Properties().Where(p => p.Value.GetType().Name == "JArray").ToList();
            var objectsArray = arrays.Where(array => array.Value.All(elements => elements.GetType().Name == "JObject")).ToList();
            var valuesArray = arrays.Where(array => array.Value.All(elements => elements.GetType().Name == "JValue")).ToList();
    
            var nestedObjects = ProcessNestedObjects(objects);
    
            var joinedInnerObjects = nestedObjects.Any()
                ? nestedObjects.Select(innerObject => JoinJObject(new JObject(fields), innerObject))
                : new List<JObject> { new JObject(fields) };
    
            var arraysObjectList = GetJObjectsFromArrayOfJProperties(objectsArray);
            var arraysValueList = GetJObjectsFromArrayOfValues(valuesArray);
            var joinedAll = joinedInnerObjects.SelectMany(inner => JoinMultipleJObjects(arraysObjectList, arraysValueList, inner));
            return joinedAll;
        }
    
        public static List<JObject> ProcessNestedObjects(List<JProperty> jObjects)
        {
            var processNestedObjects = new List<JObject>();
            var renamedJObjects = jObjects?.Select(obj => RenamePropertiesByHierarchyName(obj.Value.ToObject<JObject>(), obj.Name)).ToList();
    
            if (!(renamedJObjects?.Count > 0)) return processNestedObjects;
            var renamed = renamedJObjects.Aggregate((acc, next) => JoinJObject(acc, next));
    
            var nestedObjects = renamed.Properties().Where(p => p.Value.GetType().Name == "JObject").ToList();
            var nestedArrays = renamed.Properties().Where(p => p.Value.GetType().Name == "JArray").ToList();
            var nestedObjectsArray = nestedArrays.Where(array => array.Value.All(elements => elements.GetType().Name == "JObject")).ToList();
            var nestedValuesArray = nestedArrays.Where(array => array.Value.All(elements => elements.GetType().Name == "JValue")).ToList();
    
            nestedArrays.ForEach(p => renamed.Remove(p.Name));
            nestedObjects.ForEach(p => renamed.Remove(p.Name));
    
            var nestedObjectList = new List<JObject>();
            var nestedMultipleObjectList = new List<JObject>();
            foreach (var listJObjects in nestedObjects.Select(innerObject => FlattenJsonGetJObjects(innerObject.Value.ToObject<JObject>(), innerObject.Name)).ToList())
            {
                if (listJObjects.Count() > 1)
                    nestedMultipleObjectList.AddRange(listJObjects);
                else
                    nestedObjectList.Add(listJObjects.First());
            }
    
            var jObjectsFromArrayOfJProperties = GetJObjectsFromArrayOfJProperties(nestedObjectsArray);
            var jObjectsFromArrayOfValues = GetJObjectsFromArrayOfValues(nestedValuesArray);
    
            var aggregate = nestedObjectList.Aggregate(renamed, (acc, next) => JoinJObject(acc, next));
            var groupedNestedObjects = (nestedMultipleObjectList.Any()) ? nestedMultipleObjectList.Select(nested => JoinJObject(aggregate, nested))
                : new List<JObject> { aggregate };
            var groupedNestedObjectsList = groupedNestedObjects.Select(groupedNested => JoinMultipleJObjects(jObjectsFromArrayOfJProperties, jObjectsFromArrayOfValues, groupedNested));
            processNestedObjects.AddRange(groupedNestedObjectsList.SelectMany(e => e));
            return processNestedObjects;
        }
    
        public static List<JObject> JoinMultipleJObjects(List<JObject> nestedArraysObjectList, List<JObject> nestedArraysValueList, JObject groupedNestedObjects)
        {
            var result = new List<JObject>();
            var joined = new List<JObject>();
            if (!nestedArraysObjectList.Any())
                joined.Add(groupedNestedObjects);
            else
                nestedArraysObjectList.ForEach(e => joined.Add(JoinJObject(groupedNestedObjects, e)));
    
            result.AddRange(nestedArraysValueList.Any()
                ? nestedArraysValueList
                    .SelectMany(value => joined, (value, joinedItem) => JoinJObject(joinedItem, value)).ToList()
                : joined);
            return result;
        }
    
        public static List<JObject> GetJObjectsFromArrayOfJProperties(List<JProperty> nestedJProperties)
        {
            var fromArrayOfJProperties = new List<JObject>();
            foreach (var jProperty in nestedJProperties)
            {
                var nestedArraysObjectList = new List<JObject>();
                var name = jProperty.Name;
                var jPropertyValue = jProperty.Value;
                var renamedObjects = jPropertyValue?.Select(obj => RenamePropertiesByHierarchyName(obj.ToObject<JObject>(), name)).ToList();
                foreach (var jObjects in renamedObjects.Select(innerObject => FlattenJsonGetJObjects(innerObject.ToObject<JObject>())))
                {
                    nestedArraysObjectList.AddRange(jObjects);
                }
    
                if (fromArrayOfJProperties.Any() && nestedArraysObjectList.Any())
                    fromArrayOfJProperties = nestedArraysObjectList
                        .SelectMany(nested => fromArrayOfJProperties, (current, joined) => JoinJObject(joined, current)).ToList();
    
                if (!fromArrayOfJProperties.Any())
                    fromArrayOfJProperties.AddRange(nestedArraysObjectList);
            }
            return fromArrayOfJProperties;
        }
    
        public static List<JObject> GetJObjectsFromArrayOfValues(List<JProperty> nestedValuesArray)
        {
            return (from innerArray in nestedValuesArray let name = innerArray.Name let values = innerArray.Value from innerValue in values select new JObject(new JProperty(name, innerValue.ToObject<JValue>()))).ToList();
        }
    
        public static JObject RenamePropertiesByHierarchyName(JObject jObject, string hierarchyName)
        {
            var properties = jObject.Properties().ToList().Select(p => new JProperty($"{hierarchyName}_{p.Name}", p.Value));
            return new JObject(properties);
        }
    
        public static JObject JoinJObject(JObject parentJObject, JObject innerObject)
        {
            var joinJObject = new JObject
                {
                    parentJObject.Properties(),
                    innerObject.Properties()
                };
            return joinJObject;
        }
    }
    

    【讨论】:

    • 如果您有新问题,请点击 按钮提出问题。如果有助于提供上下文,请包含指向此问题的链接。 - From Review
    猜你喜欢
    • 2018-01-07
    • 2020-04-02
    • 2020-10-28
    • 2022-01-07
    • 2018-10-27
    • 2021-05-17
    • 2021-01-08
    • 1970-01-01
    相关资源
    最近更新 更多