【Question title】: PowerShell - Convert Complex XML to CSV
【Posted】: 2019-12-11 16:37:36
【Question】:

I want to convert the XML below to CSV, even though the header row information is contained in the XML file itself.

<?xml version="1.0" encoding="UTF-8"?>
<myfile>
  <updatetime>2019-07-30 08:30:30</updatetime>
  <process code="PRS1234" name="PROCESS1234" />
  <equipment code="EQP1234" name="EQUIPMENT1234" />
  <product type="equipment" planned="300" time="36000" cycletime="20" />
  <shift code="1" timestart="2019-07-30 02:00:00">
    <order index="1" goodproduct="500" defectproduct="5" time="2019-07-30 02:00:00" />
    <order index="2" goodproduct="980" defectproduct="7" time="2019-07-30 03:00:00" />
    <order index="3" goodproduct="1200" defectproduct="12" time="2019-07-30 04:00:00" />
    <order index="4" goodproduct="1800" defectproduct="15" time="2019-07-30 05:00:00" />
    <order index="5" goodproduct="2500" defectproduct="15" time="2019-07-30 06:00:00" />
  <shift>
  <shift code="2" timestart="2019-07-30 07:00:00">
    <order index="1" goodproduct="600" defectproduct="5" time="2019-07-30 07:00:00" />
    <order index="2" goodproduct="980" defectproduct="7" time="2019-07-30 08:00:00" />
    <order index="3" goodproduct="1500" defectproduct="8" time="2019-07-30 09:00:00" />
    <order index="4" goodproduct="1700" defectproduct="11" time="2019-07-30 10:00:00" />
    <order index="5" goodproduct="3000" defectproduct="15" time="2019-07-30 11:00:00" />
  </shift>
</myfile>

I can get the values of the nodes I need. This is what I did.

[xml]$inputFile = Get-Content "Q:\XML\FileComplex.xml"
$inputFile.myfile.product | Select-Object -Property type,planned,time,cycletime | ConvertTo-Csv -NoTypeInformation -Delimiter ";" | Set-Content -Path "Q:\XML\FileComplex.csv" -Encoding UTF8

What I want to achieve is to combine all the required information into a single record of the CSV file, like this:

updatetime          | code(process) | name(process) | code(equipment) | name(equipment) | type(product) | planned(product) | time(product) | cycletime(product) | goodproduct(shift(code) is 1 and index is max) | defectproduct(shift(code) is 1 and index is max) | goodproduct(shift(code) is 2 and index is max) | defectproduct(shift(code) is 2 and index is max)
2019-07-30 08:30:30 | PRS1234       | PROCESS1234   | EQP1234         | EQUIPMENT1234   | equipment     | 300              | 36000         | 20                 | 2500                                           | 15                                               | 3000                                           | 15

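Your support is much appreciated!

Thanks in advance, Natasha

【Question comments】:

  • Please post a sample of your actual XML, or fix the errors in the current sample so that it can actually be loaded.
  • Sorry, I have fixed them @Tomalak

Tags: xml powershell csv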


【Solution 1】:

The OP's XML appears to be invalid. Assuming the XML is...

<?xml version="1.0" encoding="UTF-8"?>
<myfile>
    <updatetime>2019-07-30 08:30:30</updatetime>
    <process code="PRS1234" name="PROCESS1234" />
    <equipment code="EQP1234" name="EQUIPMENT1234" />
    <product type="equipment" planned="300" time="36000" cycletime="20" />
    <shift code="1" timestart="2019-07-30 02:00:00">
        <order index="1" goodproduct="500" defectproduct="5" time="2019-07-30 02:00:00" />
        <order index="2" goodproduct="980" defectproduct="7" time="2019-07-30 03:00:00" />
        <order index="3" goodproduct="1200" defectproduct="12" time="2019-07-30 04:00:00" />
        <order index="4" goodproduct="1800" defectproduct="15" time="2019-07-30 05:00:00" />
        <order index="5" goodproduct="2500" defectproduct="15" time="2019-07-30 06:00:00" />
    </shift>
    <shift code="2" timestart="2019-07-30 07:00:00">
        <order index="1" goodproduct="600" defectproduct="5" time="2019-07-30 07:00:00" />
        <order index="2" goodproduct="980" defectproduct="7" time="2019-07-30 08:00:00" />
        <order index="3" goodproduct="1500" defectproduct="8" time="2019-07-30 09:00:00" />
        <order index="4" goodproduct="1700" defectproduct="11" time="2019-07-30 10:00:00" />
        <order index="5" goodproduct="3000" defectproduct="15" time="2019-07-30 11:00:00" />
    </shift>
</myfile>

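Because the XPath 1.0 available in PowerShell does not support the max() function, the code below assumes that the order elements are sorted by index in ascending order and simply selects the last one. If you cannot guarantee the ordering by index, you will need to come up with your own max() solution...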

[xml]$xml = Get-Content "FileComplex.xml"

# PowerShell uses XPath 1.0, which doesn't support max().
# If it did, you could select: //shift[@code=1]/order[@index = max(//shift[@code=1]/order/@index)]
$order1 = $xml.SelectSingleNode("//shift[@code=1]/order[last()]")
$order2 = $xml.SelectSingleNode("//shift[@code=2]/order[last()]")

# Using Add-Member like this is messy, but guarantees the order of fields on the $result object...
$result = New-Object PSObject
$result | Add-Member NoteProperty "updatetime" $xml.SelectSingleNode("//updatetime").InnerText
$result | Add-Member NoteProperty "code(process)" $xml.SelectSingleNode("//process").code
$result | Add-Member NoteProperty "name(process)" $xml.SelectSingleNode("//process").name
$result | Add-Member NoteProperty "code(equipment)" $xml.SelectSingleNode("//equipment").code
$result | Add-Member NoteProperty "name(equipment)" $xml.SelectSingleNode("//equipment").name
$result | Add-Member NoteProperty "type(product)" $xml.SelectSingleNode("//product").type
$result | Add-Member NoteProperty "planned(product)" $xml.SelectSingleNode("//product").planned
$result | Add-Member NoteProperty "time(product)" $xml.SelectSingleNode("//product").time
$result | Add-Member NoteProperty "cycletime(product)" $xml.SelectSingleNode("//product").cycletime
$result | Add-Member NoteProperty "goodproduct(shift(code) is 1 and index is max)" $order1.goodproduct
$result | Add-Member NoteProperty "defectproduct(shift(code) is 1 and index is max)" $order1.defectproduct
$result | Add-Member NoteProperty "goodproduct(shift(code) is 2 and index is max)" $order2.goodproduct
$result | Add-Member NoteProperty "defectproduct(shift(code) is 2 and index is max)" $order2.defectproduct

$result | Export-Csv -Path "FileComplex.csv" -Delimiter ';' -Encoding utf8 -NoTypeInformation

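【Comments】:

  • I don't think this satisfies the "index is max" condition...? It can be done with XPath 1.0, though.
  • You're right, it isn't max(). I said as much in my answer.
  • ($xml.SelectNodes("//shift[@code=1]/order") | Sort index | Select -Last 1) would be one alternative.
  • Yes, that works perfectly well. I have never understood why Microsoft did not go beyond XPath 1.0 in their XMLDocument implementation. It really is very limited.
  • Neither have I. At some point they simply stopped investing in XML.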
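
A note on the Sort alternative from the comments: XML attribute values reach PowerShell as strings, so sorting on index directly compares text, which matters once an index has two digits. The following is only a minimal sketch under that assumption; the sample XML and the variable names are made up for illustration and are not part of the original answer.

```powershell
# Hypothetical sample: order elements deliberately out of order, with a two-digit index.
[xml]$xml = @'
<myfile>
  <shift code="1">
    <order index="9" goodproduct="900" defectproduct="9" />
    <order index="10" goodproduct="1000" defectproduct="10" />
    <order index="2" goodproduct="200" defectproduct="2" />
  </shift>
</myfile>
'@

$orders = $xml.SelectNodes("//shift[@code=1]/order")

# Attribute values are strings, so a plain "Sort index" would compare "10" and "9" as text.
# Casting inside a script block makes the comparison numeric.
$maxOrder = $orders | Sort-Object { [int]$_.index } | Select-Object -Last 1

$maxOrder.goodproduct
```

With the cast in place, the element with index 10 is selected regardless of where it appears in the document.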
【Solution 2】:

I would use SelectSingleNode() and XPath to solve this.

$data = New-Object xml;
$data.load(".\myfile.xml")

$record = [pscustomobject]@{
    "updatetime"             = $data.SelectSingleNode("/*/updatetime")."#text"
    "code(process)"          = $data.SelectSingleNode("/*/process").code
    "name(process)"          = $data.SelectSingleNode("/*/process").name
    "code(equipment)"        = $data.SelectSingleNode("/*/equipment").code
    "name(equipment)"        = $data.SelectSingleNode("/*/equipment").name
    "type(product)"          = $data.SelectSingleNode("/*/product").type
    "planned(product)"       = $data.SelectSingleNode("/*/product").planned
    "time(product)"          = $data.SelectSingleNode("/*/product").time
    "cycletime(product)"     = $data.SelectSingleNode("/*/product").cycletime
    "goodproduct(shift 1)"   = $data.SelectSingleNode("/*/shift[@code = 1]/order[not(@index < ../order/@index)]").goodproduct
    "defectproduct(shift 1)" = $data.SelectSingleNode("/*/shift[@code = 1]/order[not(@index < ../order/@index)]").defectproduct
    "goodproduct(shift 2)"   = $data.SelectSingleNode("/*/shift[@code = 2]/order[not(@index < ../order/@index)]").goodproduct
    "defectproduct(shift 2)" = $data.SelectSingleNode("/*/shift[@code = 2]/order[not(@index < ../order/@index)]").defectproduct
}

$record
#$record | ConvertTo-Csv -NoTypeInformation

Explanation of order[not(@index < ../order/@index)]: "any <order> whose index is not less than the index of any other <order> next to it." The only <order> for which this condition is true is the one with the maximum index.
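
To see the difference between this predicate and a positional last(), it may help to run both against a document in which the highest index is not the last element. This is only an illustrative sketch; the shuffled sample XML and the variable names are assumptions, not data from the question.

```powershell
# Hypothetical sample: the order with the highest index is not the last element.
$doc = New-Object xml
$doc.LoadXml(@'
<myfile>
  <shift code="1">
    <order index="3" goodproduct="1200" />
    <order index="5" goodproduct="2500" />
    <order index="1" goodproduct="500" />
  </shift>
</myfile>
'@)

# Positional selection: returns whichever element happens to come last.
$byPosition = $doc.SelectSingleNode("/*/shift[@code = 1]/order[last()]")

# Value-based selection: keeps the order for which no sibling has a larger index.
$byValue = $doc.SelectSingleNode("/*/shift[@code = 1]/order[not(@index < ../order/@index)]")

"last() picked index $($byPosition.index), goodproduct $($byPosition.goodproduct)"
"not(<) picked index $($byValue.index), goodproduct $($byValue.goodproduct)"
```

XPath converts both sides of < to numbers, so the comparison is numeric even though the attributes are text. If several orders shared the maximum index, the predicate would match all of them and SelectSingleNode() would return the first in document order.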

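The output looks like this (before conversion to CSV):

updatetime             : 2019-07-30 08:30:30
code(process)          : PRS1234
name(process)          : PROCESS1234
code(equipment)        : EQP1234
name(equipment)        : EQUIPMENT1234
type(product)          : equipment
planned(product)       : 300
time(product)          : 36000
cycletime(product)     : 20
goodproduct(shift 1)   : 2500
defectproduct(shift 1) : 15
goodproduct(shift 2)   : 3000
defectproduct(shift 2) : 15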

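【Comments】:

  • One more question: can you tell me how to save this file as date(updatetime)-code(equipment).csv, something like 2019-07-30-EQP1234.csv?
  • You can see in my code how updatetime and equipment are selected from the XML. You should be able to build the file name from those two things. Use the Export-Csv cmdlet to write the CSV file.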
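
One possible way to build the file name requested in the comment above is sketched here. The two input values are hard-coded stand-ins for what would be read from the XML, and the variable names are hypothetical; the Export-Csv call is left commented out because it needs the $record object from the answer.

```powershell
# Hypothetical stand-ins for the values selected from the XML.
$updatetime    = "2019-07-30 08:30:30"
$equipmentCode = "EQP1234"

# Parse with an explicit format so the result does not depend on the machine's culture.
$invariant = [cultureinfo]::InvariantCulture
$date      = [datetime]::ParseExact($updatetime, "yyyy-MM-dd HH:mm:ss", $invariant)
$datePart  = $date.ToString("yyyy-MM-dd", $invariant)

# Build "<date>-<equipment code>.csv".
$fileName = "${datePart}-${equipmentCode}.csv"
$fileName   # 2019-07-30-EQP1234.csv

# With the $record object from the answer, the file could then be written like this:
# $record | Export-Csv -Path $fileName -NoTypeInformation -Encoding UTF8
```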
【Solution 3】:

This is what I ended up with:

$data = New-Object xml;
$data.load("Q:\XML\Myfile.xml")
$record = [pscustomobject]@{
    "updatetime"             = $data.SelectSingleNode("/*/updatetime")."#text"
    "code(process)"          = $data.SelectSingleNode("/*/process").code
    "name(process)"          = $data.SelectSingleNode("/*/process").name
    "code(equipment)"        = $data.SelectSingleNode("/*/equipment").code
    "name(equipment)"        = $data.SelectSingleNode("/*/equipment").name
    "type(product)"          = $data.SelectSingleNode("/*/product").type
    "planned(product)"       = $data.SelectSingleNode("/*/product").planned
    "time(product)"          = $data.SelectSingleNode("/*/product").time
    "cycletime(product)"     = $data.SelectSingleNode("/*/product").cycletime
    "goodproduct(shift 1)"   = $data.SelectSingleNode("/*/shift[@code = 1]/order[not(@index < ../order/@index)]").goodproduct
    "defectproduct(shift 1)" = $data.SelectSingleNode("/*/shift[@code = 1]/order[not(@index < ../order/@index)]").defectproduct
    "goodproduct(shift 2)"   = $data.SelectSingleNode("/*/shift[@code = 2]/order[not(@index < ../order/@index)]").goodproduct
    "defectproduct(shift 2)" = $data.SelectSingleNode("/*/shift[@code = 2]/order[not(@index < ../order/@index)]").defectproduct
}
$equipment = $record.'code(equipment)'
$date = Get-Date -Date $record.updatetime
$dateformat = $date.ToString("yyyy-MM-dd")
$record | ConvertTo-Csv -NoTypeInformation -Delimiter ',' | Set-Content -Path "Q:\XML\${equipment}_$dateformat.csv" -Encoding utf8

【Comments】:

  • Thanks for sharing your final solution!