【Question title】: Powershell: Advanced insert into XML files only if VATMODE tag is missing
【Posted】: 2017-08-25 01:52:54
【Question description】:

Complete solution

# Description: Adds <VATMODE>X</VATMODE> XML tags to files arriving from server, underneath each RECORD CODE line.
# Script tested and works using:
#   - Powershell v5.1 on Windows 10 Pro
#   - Powershell v4.0 on Windows Server 2008 R2.
#   - Does NOT work on Powershell v2.0

# References
# My own question: https://stackoverflow.com/questions/45639945/powershell-advanced-insert-into-xml-files-only-if-vatmode-tag-is-missing
# https://stackoverflow.com/questions/31678072/insert-content-into-specific-place-in-text-file-in-powershell
# https://stackoverflow.com/questions/1875617/insert-content-into-text-file-in-powershell
# https://social.technet.microsoft.com/wiki/contents/articles/4310.powershell-working-with-regular-expressions-regex.aspx
# http://blog.danskingdom.com/fix-problem-where-windows-powershell-cannot-run-script-whose-path-contains-spaces/
# https://community.spiceworks.com/topic/857690-automatically-and-silently-bypass-execution-policy-for-a-powershell-script
# http://leelusoft.blogspot.com.ng/p/watch-4-folder-25.html
# References

# Assign the directory where the XML files arrive from the server
$xmlFilesLocation = "C:\XML_dumping\"

# Change directory. Without this, the script will run in the same directory that the script is located at, and that's wrong
cd $xmlFilesLocation

# Show the directory so we can easily look at what's going on. Comment this out if it becomes annoying.
Invoke-Item $xmlFilesLocation

# Regular expression to match RECORD CODE lines
$regEx = "(\W\w{6}\s\w{4}\W.+)"

# A String variable which contains the VATMODE XML tag
$vatModeExists = "<VATMODE>X</VATMODE>"

# Assign the VATMODE tag, preceding it with three tabs for proper indentation
$vatModeTag = "`t`t`t<VATMODE>X</VATMODE>"

# Get all XML file names in the directory
$files = Get-ChildItem -Path $xmlFilesLocation -Filter *.xml

# Count the number of all XML files in the directory
$numberOfFiles = (Get-ChildItem -Path $xmlFilesLocation -Filter *.xml | Measure-Object).Count

# First pass (read-only): detect files that already contain <VATMODE>X</VATMODE>.
# Nothing is modified here; the second loop below repeats the check before changing any file.
for ($i=1; $i -le $numberOfFiles; $i++) {

    # Scan the contents of each file
    $content = (Get-Content $files[$i - 1] -raw)

    # If <VATMODE>X</VATMODE> is detected in the file...
    if ($content -match $vatModeExists) {
        # ...then skip it and move on to the next file (break would end the whole loop)
        continue
    }
}

# Then, loop through all files (again) separately to check if <VATMODE>X</VATMODE> is missing, and process if true
for ($j=1; $j -le $numberOfFiles; $j++) {

    # Scan the contents of each file
    $content = (Get-Content $files[$j - 1] -raw)

    # If <VATMODE>X</VATMODE> is missing in the file...
    if ($content -notmatch $vatModeExists) {

        # ...then replace in $content the regular expression with $vatModeTag and insert it directly underneath RECORD CODE line
        $content= [regex]::replace($content, $regEx, ('$1'+"`n"+"$vatModeTag"))

        # Save the file that now has the new $vatModeTag and output it
        $content | Out-File -encoding utf8 $files[$j - 1]
    }
}
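
One detail of the save step may matter to the importing system. In Windows PowerShell 5.1, Out-File -encoding utf8 writes a UTF-8 byte order mark and appends a line break to the piped string. The post does not say whether the accounting system cares about either. If it does, a minimal sketch like the one below writes the string exactly as it is. The path and content are hypothetical and only serve the illustration.

```powershell
# Hypothetical path and content, for illustration only.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'vatmode-demo.xml'
$content = "<EXPORT>`r`n`t<RECORD CODE=`"NX0100096`">`r`n`t`t`t<VATMODE>X</VATMODE>`r`n</EXPORT>"

# $false = do not emit a byte order mark.
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)

# WriteAllText writes the string as-is: no BOM, no extra trailing line break.
[System.IO.File]::WriteAllText($path, $content, $utf8NoBom)

# Read the raw bytes back to inspect what was written.
$bytes = [System.IO.File]::ReadAllBytes($path)
```

[System.IO.File]::WriteAllText needs a full path, so inside the loop it would take the file's FullName rather than the bare file name.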

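Problem statement

I'm trying to achieve something similar to this, but with added complexity. These are XML files that arrive from a server every day and are dumped into a single folder for import into an accounting system. The accounting system will not import a file unless every RECORD CODE parent has a child <VATMODE>X</VATMODE>. The XML files can arrive in two ways: one at a time, or in batches. They have different names, with ever-increasing numbers and varying prefixes, for example: NX1000060.xml, NX1000061.xml, ABN000028.xml, and so on.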
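
The import rule above (every RECORD parent needs a VATMODE child) can also be checked with an XML-aware query rather than a regular expression. This is only a sketch of an alternative technique, not what the post uses. The sample document is hypothetical: its closing tags are an assumption, because the excerpt shown in the question is truncated.

```powershell
# Hypothetical sample; the closing tags are assumed, the real files are longer.
$sample = @'
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<EXPORT>
    <CUSTORDERS>
        <RECORD CODE="NX0100096">
            <VATMODE>X</VATMODE>
        </RECORD>
        <RECORD CODE="NX0100097">
            <INPUTDATE>19/07/2017</INPUTDATE>
        </RECORD>
    </CUSTORDERS>
</EXPORT>
'@

# Parse the text into an XmlDocument.
$doc = [xml]$sample

# XPath: every RECORD element that has no VATMODE child.
$missing = @($doc.SelectNodes('//RECORD[not(VATMODE)]'))

# Collect the CODE attribute of each record that still needs the tag.
$missingCodes = $missing | ForEach-Object { $_.GetAttribute('CODE') }
```

A file would be safe to import when $missing is empty.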

Powershell script

# Regex to match RECORD CODE lines
$regEx = "\W\w{6}\s\w{4}\W.+"

#Regex to match exactly <VATMODE>X</VATMODE>
$vatModeExists = "\W\w{7}.\w\W{2}\w{7}."

# Assign the VATMODE tag, preceding it with three tabs for proper indentation
$vatModeTag = "`t`t`t<VATMODE>X</VATMODE>"

# Get all XML files in the directory
$files = Get-ChildItem -Path "C:\XML_dumping" -Filter *.xml

# Get the number of XML files in the directory
$numberOfFiles = (Get-ChildItem -Path "C:\XML_dumping" -Filter *.xml | Measure-Object).Count

for ($i=1; $i -lt $numberOfFiles; $i++) { # Loop through each file separately
    $content = (Get-Content $files[$i - 1]) # Scan the contents of each file
    if ($content -match $vatModeExists) { # If <VATMODE>X</VATMODE> is detected in the file...
        break # ...then do not process the file (skip it)
    }

    # Get the matched RECORD CODE lines
    $found = $content -match $regEx
    for ($j=0; $j -lt $found.Length; $j++ ) { # Loop through each matched RECORD CODE line
        echo  $found[$j] $vatModeTag # Insert <VATMODE>X</VATMODE> right under RECORD CODE line
        # save the files that now have VATMODE inserted into them, but how?
    }
}

The script above is supposed to append the VATMODE tag under each RECORD CODE line, as in the output below.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<EXPORT>
    <IMPORTMODEL>NEX</IMPORTMODEL>
    <SESSION>1000060</SESSION>
    <CUSTORDERS>
        <RECORD CODE="NX0100096">
        <VATMODE>X</VATMODE>
        <INPUTDATE>19/07/2017</INPUTDATE>
        <!--...and so on...-->

In Powershell ISE the script runs fine with echo (used for visual inspection), but how do I actually insert VATMODE and save the files it was added to?

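Pseudocode

  1. Assign the regular expression
  2. Assign the VATMODE tag
  3. Get the list of files
  4. Get the number of files
  5. Get the contents of each file separately
  6. Check whether VATMODE already exists and, if so, break
  7. Otherwise append VATMODE
  8. Save the file that received the new VATMODE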
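
The steps above can be sketched as a single loop. This is a minimal sketch under stated assumptions, not the script from the post: the function name Add-VatModeTag, its -Directory parameter, and the simplified patterns are hypothetical, and the files are assumed to use CRLF line endings. Step 4 (counting the files) is not needed with foreach, and step 6 uses continue so that only the current file is skipped.

```powershell
function Add-VatModeTag {
    param([string]$Directory)   # hypothetical helper name and parameter

    # Steps 1-2: pattern for RECORD CODE lines (stops before the line break) and the tag.
    $recordPattern = '(<RECORD CODE=[^\r\n]+)'
    $vatModeTag    = "`t`t`t<VATMODE>X</VATMODE>"

    # Steps 3-5: take the files one at a time and read each as a single string.
    foreach ($file in Get-ChildItem -Path $Directory -Filter *.xml) {
        $content = Get-Content -Path $file.FullName -Raw

        # Step 6: the tag is already there, so skip this file only.
        if ($content -match '<VATMODE>') { continue }

        # Step 7: re-emit each RECORD CODE line, followed by a CRLF and the tag.
        $content = [regex]::Replace($content, $recordPattern, '$1' + "`r`n" + $vatModeTag)

        # Step 8: overwrite the file (Set-Content appends a final line break).
        Set-Content -Path $file.FullName -Value $content -Encoding UTF8
    }
}
```

It would be called as Add-VatModeTag -Directory "C:\XML_dumping". Running it a second time changes nothing, because every processed file then contains the tag.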

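【Comments】:

  • So looking at what you have, it doesn't make much sense to me. 1. Get all XML files 2. Get the contents of all those files 3. Loop over each file 4. Try to display the contents? 5. Save all the contents of all 3 files back into each file?
  • @ArcSet That is exactly what it does, and it seems wrong. What I need is: 1. Take the first file. 2. Read it and see whether the VATMODE tag exists. 3a. If it exists, break. 3b. Otherwise, append the VATMODE tag. 4. Save the file. 5. Take file n+1 and repeat steps 1 to 4 until there are no files left to process.

Tags: xml powershell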


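【Solution 1】:

I used [regex]::replace and it works for me. The parentheses in the regular expression capture the matched text so that it can be retrieved as $1. I also replaced -lt with -le in your for loop, since -lt left the last file unprocessed.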
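
To see the capture group at work in isolation, here is a small sketch that runs the same pattern against an in-memory string instead of a file. The sample text is hypothetical and uses LF line breaks only.

```powershell
# Pattern from the answer: the parentheses capture the whole RECORD CODE line as group 1.
$regExParen = "(\W\w{6}\s\w{4}\W.+)"

# Hypothetical three-line sample with LF line breaks.
$sample = "<CUSTORDERS>`n`t<RECORD CODE=`"NX0100096`">`n`t<INPUTDATE>19/07/2017</INPUTDATE>"

# '$1' is single-quoted so that the regex engine, not PowerShell, expands it.
# The replacement re-emits the matched line, then a line break and the tag.
$result = [regex]::Replace($sample, $regExParen, '$1' + "`n" + "`t<VATMODE>X</VATMODE>")
```

Only the RECORD CODE line matches. .+ stops at the line break, so the rest of the text is left untouched and the tag lands between the RECORD CODE and INPUTDATE lines.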

# Regex to match RECORD CODE lines
$regEx = "\W\w{6}\s\w{4}\W.+"
$regExParen = "(\W\w{6}\s\w{4}\W.+)"
#Regex to match exactly <VATMODE>X</VATMODE>
$vatModeExists = "\W\w{7}.\w\W{2}\w{7}."

# Assign the VATMODE tag, preceding it with three tabs for proper indentation
$vatModeTag = "`t`t`t<VATMODE>X</VATMODE>"

# Get all XML files in the directory
$files = Get-ChildItem -Path "C:\Users\user1\Documents\XML_dumping" -Filter *.xml

# Get the number of XML files in the directory
$numberOfFiles = (Get-ChildItem -Path "C:\Users\user1\Documents\XML_dumping" -Filter *.xml | Measure-Object).Count
for ($i=1; $i -le $numberOfFiles; $i++) { # Loop through each file separately
    $content = (Get-Content $files[$i - 1].FullName -raw) # Scan the contents of each file

    if ($content -match $vatModeExists) {
        # If <VATMODE>X</VATMODE> is detected in the file...
        echo "<VATMODE>X</VATMODE>"
        continue # ...then do not process the file (skip it); break would also abandon the remaining files
    }

    # Replace each RECORD CODE match in $content with itself followed by VATNUMBER
    # (a placeholder for the tag to insert right under the RECORD CODE line)
    $content = [regex]::replace($content, $regExParen, ('$1'+"`r`n"+"VATNUMBER"+"`r`n"))
    echo $content
    # Save the file that now has the new line inserted
    $content | Out-File -encoding utf8 $files[$i - 1].FullName
}

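【Comments】:

  • I don't want to be too hasty with my conclusions, but I have tested your script rigorously and it works. I still have some tests to do (I'm still testing), but it runs fine for now. I replaced "VATNUMBER" with $vatModeTag and removed +"`r`n" because it added an unwanted extra line. Those are only cosmetic changes, though; the logic is flawless. I will report my findings, accept your answer and post the whole script as soon as possible. Thank you very much! :)
  • As promised, I used your solution, tweaked it a little, and posted the complete script at the top of the post. Thanks again! :)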
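
The extra line mentioned in the first comment comes from the file's own line break, which is still there after the match. A small sketch with a hypothetical LF-only sample shows the difference between the two replacement strings:

```powershell
$pattern = "(\W\w{6}\s\w{4}\W.+)"
$tag     = "`t`t`t<VATMODE>X</VATMODE>"

# Hypothetical two-line sample with an LF line break.
$sample = "`t`t<RECORD CODE=`"NX0100096`">`n`t`t`t<INPUTDATE>19/07/2017</INPUTDATE>"

# Line break after the tag: together with the original one it leaves an empty line.
$withExtra = [regex]::Replace($sample, $pattern, '$1' + "`n" + $tag + "`n")

# No line break after the tag: the original line break ends the VATMODE line.
$withoutExtra = [regex]::Replace($sample, $pattern, '$1' + "`n" + $tag)

($withExtra -split "`n").Count      # 4 (includes an empty line)
($withoutExtra -split "`n").Count   # 3
```

With CRLF files read via -raw, .+ also swallows the carriage return, so the same reasoning applies but the inserted line ends up with a different line ending than the rest of the file.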
【Solution 2】:

So I tried to come up with something. It is a bit different from what you asked for, though.

# Wildcard for the XML files dumped by the server (resolved against the current directory)
$fileName = "*.xml"

# Assign the regular expression patterns
$regEx = "\W\w{6}\s\w{4}\W.+"
$vatModeExists = "\W\w{7}.\w\W{2}\w{7}."

# Assign the VATMODE tag, preceding it with a line break and three tabs for proper indentation
$vatModeTag = "`n`t`t`t<VATMODE>X</VATMODE>"

$output = @()
foreach ($file in Get-ChildItem -Path $fileName) {
    $lines = Get-Content $file.FullName
    # -match on an array returns the matching lines, so an empty result means no VATMODE tag yet
    if (-not ($lines -match $vatModeExists)) {
        foreach ($line in $lines) {
            if ($line -match $regEx) { # if RECORD CODE line is found
                $line += $vatModeTag # append VATMODE after each RECORD CODE line
            }
            $output += $line
        }
    }
}
# Note: everything is written to one combined file, not back to the original files
Set-Content -Path SomeFile.xml -Value $output

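【Comments】:

  • I will try your proposed solution and report my findings. Thanks.
  • I have rewritten the code from scratch, but I can't save the modified files.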