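【Question title】: Segregate XML contents based on node value using C#
【Posted】: 2016-06-10 06:48:52
【Question description】:

I have an XML file like the one shown below. I want to extract all the XML elements found between each pair of #NEWPAGE# markers and save each group separately to a SQL Server database. Please suggest an approach.

Originally I had a txt file containing the details below, and I converted each line of that file into an XML element. Now I have an XML file, but I cannot extract parts of it based on the node value #NEWPAGE#.


The XML content is as follows: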

<?xml version="1.0" encoding="utf-8"?>
    <root>
      <Line>#HEADINGBEGIN#</Line>
       <Line></Line>
      <Line>Employee: 16062      Name: MERZLAK,BRIAN         Base: MSP  Eqpt: E70  Pos: CA</Line>
      <Line></Line>
      <Line>       Daily    On    Off          Daily  Daily                     Jr    Accum</Line>
      <Line>Date   Assign  Duty   Duty   TAFB  Block  Credit   Trip Guarantee   Man  Credit</Line>
      <Line>-----  ------ -----  -----  -----  -----  ------  ---------------  ----  ------</Line>
      <Line>#HEADINGEND#</Line>
      <Line>11/01  M2100A  0:01          0:00   4:35   0:00    0:00            0:00    0:00 </Line>
      <Line>11/02    "                   0:00   7:17   0:00    0:00            0:00    0:00 </Line>
      <Line>11/03    "           19:12  67:12   6:51  20:14    0:00            0:00   20:14 </Line>
      <Line>#GROUPNOBREAK#</Line>
      <Line>#GROUPBEGIN#</Line>
      <Line></Line>
      <Line>             Taxable TAFB    0:00                                              </Line>
      <Line>         Non-Taxable TAFB  178:00                                              </Line>
      <Line>               Total TAFB  178:00                                              </Line>
      <Line>#GROUPEND#</Line>
      <Line>#NEWPAGE#</Line>
      <Line>#HEADINGBEGIN#</Line>
      <Line></Line>
      <Line>Employee: 19814      Name: GRAYSON,MONIQUE       Base: LAX  Eqpt: E70  Pos: CA</Line>
      <Line></Line>
      <Line>       Daily    On    Off          Daily  Daily                     Jr    Accum</Line>
      <Line>Date   Assign  Duty   Duty   TAFB  Block  Credit   Trip Guarantee   Man  Credit</Line>
      <Line>-----  ------ -----  -----  -----  -----  ------  ---------------  ----  ------</Line>
      <Line>#HEADINGEND#</Line>
      <Line>11/01  OFF                   0:00   0:00   0:00    0:00            0:00    0:00 </Line>
      <Line>11/02  OFF                   0:00   0:00   0:00    0:00            0:00    0:00 </Line>
      <Line>11/03  L2488  13:30          0:00   7:10   0:00    0:00            0:00    0:00 </Line>
      <Line>11/04    "                   0:00   4:25   0:00    0:00            0:00    0:00 </Line>
      <Line>#GROUPNOBREAK#</Line>
      <Line>#GROUPBEGIN#</Line>
      <Line></Line>
      <Line>             Taxable TAFB    0:00                              Over Guar: 17:08</Line>
      <Line>         Non-Taxable TAFB  327:29                                              </Line>
      <Line>               Total TAFB  327:29                                              </Line>
      <Line>#GROUPEND#</Line>
      <Line>#NEWPAGE#</Line>
      <Line>#HEADINGBEGIN#</Line>
      <Line></Line>
      <Line>Employee: 20730      Name: ZAHN,GEOFFREY         Base: SEA  Eqpt: E70  Pos: CA</Line>
      <Line></Line>
      <Line>       Daily    On    Off          Daily  Daily                     Jr    Accum</Line>
      <Line>Date   Assign  Duty   Duty   TAFB  Block  Credit   Trip Guarantee   Man  Credit</Line>
      <Line>-----  ------ -----  -----  -----  -----  ------  ---------------  ----  ------</Line>
      <Line>#HEADINGEND#</Line>
      <Line>11/01  OFF                   0:00   0:00   0:00    0:00            0:00    0:00 </Line>
      <Line>11/02  OFF                   0:00   0:00   0:00    0:00            0:00    0:00 </Line>
      <Line>11/03  S2088  10:02          0:00   6:47   0:00    0:00            0:00    0:00 </Line>

      <Line>#GROUPNOBREAK#</Line>
      <Line>#GROUPBEGIN#</Line>
      <Line></Line>
      <Line>             Taxable TAFB    9:25                              Over Guar:  0:53</Line>
      <Line>         Non-Taxable TAFB  122:30                                              </Line>
      <Line>               Total TAFB  131:55                                              </Line>
      <Line>#GROUPEND#</Line>
    </root>

【Comments】:

  • Great, Mr. Roy. But we don't write code for free. Show us your research and the code you have written so far.
  • Which version of SQL Server?

Tags: c# sql-server xml


【Solution 1】:

As you said, you also have the text file. In that case you can use this simple approach:

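using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        string filePath = @"C:\yourTextFile.txt";
        string input = File.ReadAllText(filePath);
        // Lazily match from each #HEADINGBEGIN# to the nearest #GROUPEND#.
        // Singleline lets '.' match newlines, so one match spans a whole page.
        string pattern = @"#HEADINGBEGIN#.*?#GROUPEND#";
        MatchCollection matches = Regex.Matches(input, pattern, RegexOptions.Singleline);
        List<string> list = new List<string>();
        foreach (Match match in matches)
        {
            list.Add(match.Value);
        }
        // Now save this list wherever you want.
    }
}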

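This yields each employee's data block, from #HEADINGBEGIN# through #GROUPEND#. In the file these blocks are separated by #NEWPAGE#, so each match corresponds to one page.

【Comments】:

    【Solution 2】:

    You can use LINQ to XML to achieve this.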

    // Requires: using System.IO; using System.Linq; using System.Xml.Linq;
    XDocument doc = XDocument.Load(filepath);            // TODO: provide the input file name

    var result = doc.Descendants("Line")                 // All <Line> elements in document order
            .SkipWhile(x => x.Value == "#NEWPAGE#")      // Skip any leading #NEWPAGE# markers
            .TakeWhile(x => x.Value != "#NEWPAGE#")      // Take lines up to the next marker (the first page only)
            .ToList();

    // Write to file
    File.WriteAllLines(newfile, result.Select(x => x.ToString())); // TODO: provide the output file name
    

    See the Demo
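
The query in this answer stops at the first #NEWPAGE# marker, so it returns one page only. A minimal sketch of how the same idea might be extended to collect every page is shown here. `PageSplitter`, `SplitPages` and `ToXml` are hypothetical names introduced for illustration, and the sketch assumes each marker sits alone in its own `<Line>` element, as in the question's XML.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class PageSplitter
{
    // Hypothetical helper: walks the <Line> elements in document order and
    // starts a new page whenever a line's trimmed value equals the marker.
    public static List<List<XElement>> SplitPages(XDocument doc, string marker = "#NEWPAGE#")
    {
        var pages = new List<List<XElement>>();
        var current = new List<XElement>();
        foreach (XElement line in doc.Descendants("Line"))
        {
            if (line.Value.Trim() == marker)
            {
                // Marker reached: close the current page (the marker itself is not stored).
                if (current.Count > 0) pages.Add(current);
                current = new List<XElement>();
            }
            else
            {
                current.Add(line);
            }
        }
        // Lines after the last marker form the final page.
        if (current.Count > 0) pages.Add(current);
        return pages;
    }

    // Wraps one page in a <root> element and serializes it,
    // e.g. to store each page in its own database row.
    public static string ToXml(IEnumerable<XElement> page)
    {
        return new XElement("root", page).ToString();
    }
}
```

Each page could then be saved separately, for example by passing `PageSplitter.ToXml(page)` as a parameter to an INSERT command, one call per page.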

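    【Comments】:

      【Solution 3】:

      The whole approach looks weak... I'm sure there is a better concept for doing this. But to answer your question: you can number your lines, find the #NEWPAGE# markers, and use them to split your result set:

      Note: this uses LEAD, which is available since SQL Server 2012.

      Update: I would pass the XML as-is as a parameter to a stored procedure and do the parsing there...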

      DECLARE @xml XML='Your XML here';
      WITH AllLines AS
      (
          SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RowNr
                ,Line.value('.','nvarchar(max)') AS Content
          FROM @xml.nodes('root/Line') AS One(Line)
      )
      ,NewPages AS
      (
          SELECT 0 AS NewpageStart --the very first line has no #NEWPAGE#...
          UNION ALL
          SELECT RowNr FROM AllLines WHERE Content='#NEWPAGE#'
          UNION ALL 
          SELECT 999999 --Needs a final mark too...
      )
      ,PageBorders AS
      (
          SELECT ROW_NUMBER() OVER(ORDER BY NewpageStart) AS PageNr --number the pages in document order
                ,NewpageStart+1 AS NewPageStart
                ,LEAD(NewPageStart) OVER(ORDER BY NewPageStart)-1  AS NewPageEnd
          FROM NewPages
      )
      SELECT PageNr
            ,ROW_NUMBER() OVER(PARTITION BY PageNr ORDER BY RowNr) AS PageRowNr
            ,AllLines.*
      FROM PageBorders
      INNER JOIN AllLines ON AllLines.RowNr BETWEEN PageBorders.NewPageStart AND PageBorders.NewPageEnd
      

      【Comments】:

        【Solution 4】:

        You can start with this:

        using System;
        using System.Collections.Generic;
        using System.Linq;
        using System.Text;
        using System.Xml;
        using System.Xml.Linq;
        using System.Text.RegularExpressions;
        
        namespace ConsoleApplication102
        {
            class Program
            {
                enum State
                {
                    FIND_HEADINGBEGIN,
                    HEADINGBEGIN,
                    EMPLOYEE,
                    FIND_GROUPBEGIN,
                    GROUP
                }
                const string FILENAME = @"c:\temp\test.xml";
                static void Main(string[] args)
                {
                    List<Employee> employees = ParseXml(FILENAME);
                }
                static List<Employee> ParseXml(string filename)
                {
                    string employeePattern = @"Employee:\s*(?'employee'\d*)\s+Name:\s*(?'name'[^\s]*)\s*Base:\s+(?'base'[^\s]*)\s*Eqpt:\s+(?'eqpt'[^\s]*)\s+Pos:\s+(?'pos'[^\s]*)";
                    List<Employee> employees = new List<Employee>();
                    Employee newEmployee = null;
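                    // Widths of the first nine fixed-width columns, read off the dashed header row (Date through Jr Man)
                    List<int> assmentColumnWidths = new List<int>() { 7, 7, 7, 7, 7, 7, 8, 17, 6 };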
                    int lineNo = 0;
                    State state = State.FIND_HEADINGBEGIN;
                    XDocument doc = XDocument.Load(filename);
                    foreach (XElement xLine in doc.Descendants("Line"))
                    {
                        string line = ((string)xLine).Trim();
                        if (line.Length > 0)
                        {
                            switch (state)
                            {
                                case State.FIND_HEADINGBEGIN:
                                    if (line.StartsWith("#HEADINGBEGIN#"))
                                    {
                                        state = State.HEADINGBEGIN;
                                        lineNo = 0;
                                    }
                                    break;
                                case State.HEADINGBEGIN:
                                    if (line.StartsWith("#HEADINGEND#"))
                                    {
                                        state = State.EMPLOYEE;
                                    }
                                    else
                                    {
                                        if (lineNo++ == 0)
                                        {
                                            newEmployee = new Employee();
                                            employees.Add(newEmployee);
                                            Match expr = Regex.Match(line, employeePattern);
                                            newEmployee.id = expr.Groups["employee"].Value;
                                            newEmployee.name = expr.Groups["name"].Value;
                                            newEmployee._base = expr.Groups["base"].Value;
                                            newEmployee.eqpt = expr.Groups["eqpt"].Value;
                                            newEmployee.pos = expr.Groups["pos"].Value;
                                        }
                                    }
                                    break;
                                case State.EMPLOYEE:
                                    if (line.StartsWith("#GROUPNOBREAK#"))
                                    {
                                        state = State.FIND_GROUPBEGIN;
                                        lineNo = 0;
                                    }
                                    else
                                    {
                                        List<string> assignmentData = GetFixedWidth(line, assmentColumnWidths);
                                        Assignment assignment = new Assignment();
                                        if (newEmployee.assignments == null) newEmployee.assignments = new List<Assignment>();
                                        newEmployee.assignments.Add(assignment);
                                        assignment.date = assignmentData[0];
                                        assignment.name = (assignmentData[1] == "\"") ? newEmployee.assignments[newEmployee.assignments.Count - 2].name : assignmentData[1];
                                        assignment.onDuty = assignmentData[2];
                                        assignment.offDuty = assignmentData[3];
                                        assignment.tafb = assignmentData[4];
                                        assignment.dailyBlock = assignmentData[5];
                                        assignment.dailyCredit = assignmentData[6];
                                        assignment.tripGuarantee = assignmentData[7];
                                        assignment.jrMan = assignmentData[8];
        
                                    }
                                    break;
                                case State.FIND_GROUPBEGIN:
                                    if (line.StartsWith("#GROUPBEGIN#"))
                                    {
                                        state = State.GROUP;
                                        newEmployee.total = new Total();
                                    }
                                    break;
                                case State.GROUP:
                                    if (line.StartsWith("#GROUPEND#"))
                                    {
                                        state = State.FIND_HEADINGBEGIN;
                                    }
                                    else
                                    {
                                        string[] splitLine = line.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
                                        switch (++lineNo)
                                        {
                                            case 1 :
                                                newEmployee.total.taxable = splitLine[2];
                                                break;
                                            case 2:
                                                newEmployee.total.nonTaxable = splitLine[2];
                                                break;
                                            case 3:
                                                newEmployee.total.total = splitLine[2];
                                                break;
                                        }
        
                                    }
                                    break;
        
                            }
        
                        }
                    }
                    return employees;
                }
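                // Cuts the input into one trimmed string per column width.
                // Columns that start beyond the end of the input are returned as empty strings,
                // so the result always has exactly columns.Count entries.
                static List<string> GetFixedWidth(string input, List<int> columns)
                {
                    List<string> output = new List<string>();
                    int startPos = 0;
                    foreach (int width in columns)
                    {
                        if (startPos >= input.Length)
                        {
                            output.Add(string.Empty);
                        }
                        else
                        {
                            // The last column on a short line may be narrower than its nominal width.
                            int length = Math.Min(width, input.Length - startPos);
                            output.Add(input.Substring(startPos, length).Trim());
                        }
                        startPos += width;
                    }
                    return output;
                }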
            }
            public class Employee
            {
                public string id { get; set; }
                public string name { get; set; }
                public string _base { get; set; }
                public string eqpt { get; set; }
                public string pos { get; set; }
                public List<Assignment> assignments { get; set; }
                public Total total { get; set; }
        
            }
            public class Assignment
            {
                public string date { get; set; }
                public string name { get; set; }
                public string onDuty { get; set; }
                public string offDuty { get; set; }
                public string tafb { get; set; }
                public string dailyBlock { get; set; }
                public string dailyCredit { get; set; }
                public string tripGuarantee { get; set; }
                public string jrMan { get; set; }
            }
            public class Total
            {
                public string taxable { get; set; }
                public string nonTaxable { get; set; }
                public string total { get; set; }
            }
        }
        

        【Comments】:
