[Question title]: XPath query to find (look up) another element by ID
[Posted]: 2017-10-11 12:29:39
[Question]:

I am writing a class that parses XML files using XPath queries. The XML may look a bit like this:

<?xml version="1.0" encoding="UTF-8"?>
<Doc>
    <Name id="aa">Alice</Name>
    <Name id="bb">Bob</Name>
    <Name id="cc">Candice</Name>
    <Person nameid="aa"></Person>
    <Person nameid="bb"></Person>
    <Person nameid="aa"></Person>
</Doc>

The desired output is:

Alice
Bob
Alice

I am using C# to parse the Person elements:

// These are dynamically defined elsewhere.
const string personXPath = "/Doc/Person";
const string nameXPath = "/Doc/Name[@id=current()/@nameid]"; // <== modify this line

void ParseXDocument(XDocument doc)
{
    foreach (var personElement in doc.XPathSelectElements(personXPath))
    {
        var nameElement = personElement.XPathSelectElement(nameXPath);
        Console.WriteLine(nameElement.Value);
    }
}

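Is this possible by modifying only the nameXPath variable? (My software should not "know" the XML structure; the only thing that maps the XML to my own classes is the set of XPath expressions, and those are configurable.)

Another example: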

[TestMethod]
public void TestLibrary()
{
    string xmlFromMessage = @"<Library>
        <Writer ID=""writer1""><Name>Shakespeare</Name></Writer>
        <Writer ID=""writer2""><Name>Tolkien</Name></Writer>
        <Book><WriterRef REFID=""writer1"" /><Title>Sonnet 18</Title></Book>
        <Book><WriterRef REFID=""writer2"" /><Title>The Hobbit</Title></Book>
        <Book><WriterRef REFID=""writer2"" /><Title>Lord of the Rings</Title></Book>
         </Library>"; 

    var titleXPathFromConfigurationFile = "./Title"; 
    var writerXPathFromConfigurationFile = "??? what to put here ???";

    var library = ExtractBooks(xmlFromMessage, titleXPathFromConfigurationFile, writerXPathFromConfigurationFile).ToDictionary(b => b.Key, b => b.Value);

    Assert.AreEqual("Shakespeare", library["Sonnet 18"]);
    Assert.AreEqual("Tolkien", library["The Hobbit"]);
    Assert.AreEqual("Tolkien", library["Lord of the Rings"]);
}

public IEnumerable<KeyValuePair<string,string>> ExtractBooks(string xml, string titleXPath,  string writerXPath)
{
    var library = XDocument.Parse(xml);
    foreach(var book in library.Descendants().Where(d => d.Name == "Book"))
    {
        var title = book.XPathSelectElement(titleXPath).Value;
        var writer = book.XPathSelectElement(writerXPath).Value;
        yield return new KeyValuePair<string, string>(title, writer);
    }
}

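[Comments]:

  • I don't think what you want is possible. You make two separate XPathSelectElements calls, so there are two separate contexts. You therefore need to pass the value along, as I showed.

Tags: c# xml xpath xml-parsing lookup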


[Solution 1]:

Take the value you get from the first XPath and substitute it into the second expression.

const string personXPath = "/Doc/Person";
const string nameXPath = "/Doc/Name[@id='{0}']";


foreach (var personElement in doc.XPathSelectElements(personXPath))
{
    var nameid = personElement.Attribute("nameid").Value;
    var nameElement = doc.XPathSelectElement(string.Format(nameXPath, nameid));
    Console.WriteLine(nameElement.Value);
}
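
Applied to the Library example from the question, the same idea might look like the sketch below. It assumes the configuration may hold two expressions instead of one: a key XPath evaluated relative to each Book, and a lookup template with a {0} placeholder. The names LibraryLookup, WriterKeyXPath and WriterLookupTemplate are made up for illustration, and the sketch assumes every lookup finds a match and that IDs contain no apostrophes.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;
using System.Xml.XPath;

public static class LibraryLookup
{
    // Hypothetical configuration values; only these strings know the XML layout.
    const string BookXPath = "/Library/Book";
    const string TitleXPath = "./Title";
    const string WriterKeyXPath = "string(./WriterRef/@REFID)";
    const string WriterLookupTemplate = "/Library/Writer[@ID='{0}']/Name";

    public static IEnumerable<KeyValuePair<string, string>> ExtractBooks(string xml)
    {
        var library = XDocument.Parse(xml);
        foreach (var book in library.XPathSelectElements(BookXPath))
        {
            var title = book.XPathSelectElement(TitleXPath).Value;

            // Wrapping the path in string(...) makes XPathEvaluate return a plain string.
            var key = (string)book.XPathEvaluate(WriterKeyXPath);

            // Second query runs against the whole document with the key filled in.
            var writerXPath = string.Format(WriterLookupTemplate, key);
            var writer = library.XPathSelectElement(writerXPath).Value;

            yield return new KeyValuePair<string, string>(title, writer);
        }
    }
}
```

Calling LibraryLookup.ExtractBooks(xmlFromMessage) with the XML from the question yields one title/writer pair per Book, which can be turned into a dictionary with ToDictionary just like in the test method above.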

[Comments]:

    [Solution 2]:

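    New answer; my old answer is further down.

    Somebody correctly pointed out:

    • .NET does not support XPath 2.0, period.
    • The data model and the query language are separate.

    So I solved this by using a third-party XPath 2 library, the XPath2 NuGet package (namespace Wmhelp.XPath2). It allows expressions like this:
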
    for $c in . return ../Writer[@ID=$c/WriterRef/@REFID]/Name
    

    Note that I had to use a relative path from the book to the writer. This does not work:

    # does not work due to the absolute path
    for $c in . return /Library/Writer[@ID=$c/WriterRef/@REFID]/Name
    

    For future reference: this code works once the NuGet package is installed:

    using Microsoft.VisualStudio.TestTools.UnitTesting;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml.Linq;
    using Wmhelp.XPath2;
    
    namespace My.Library
    {
        [TestClass]
        public class WmhelpTests // public so that the MSTest runner can discover it
        {
            [TestMethod]
            public void LibraryTest()
            {
                string xmlFromMessage = @"<Library>
                    <Writer ID=""writer1""><Name>Shakespeare</Name></Writer>
                    <Writer ID=""writer2""><Name>Tolkien</Name></Writer>
                    <Book><WriterRef REFID=""writer1"" /><Title>King Lear</Title></Book>
                    <Book><WriterRef REFID=""writer2"" /><Title>The Hobbit</Title></Book>
                    <Book><WriterRef REFID=""writer2"" /><Title>Lord of the Rings</Title></Book>
                </Library>";
    
                var titleXPathFromConfigurationFile = "./Title";
                var writerXPathFromConfigurationFile = "for $curr in . return ../Writer[@ID=$curr/WriterRef/@REFID]/Name";
    
                var library = ExtractBooks(xmlFromMessage, titleXPathFromConfigurationFile, writerXPathFromConfigurationFile).ToDictionary(b => b.Key, b => b.Value);
    
                Assert.AreEqual("Shakespeare", library["King Lear"]);
                Assert.AreEqual("Tolkien", library["The Hobbit"]);
                Assert.AreEqual("Tolkien", library["Lord of the Rings"]);
            }
    
            public IEnumerable<KeyValuePair<string, string>> ExtractBooks(string xml, string titleXPath, string writerXPath)
            {
                var library = XDocument.Parse(xml);
                foreach (var book in library.Descendants().Where(d => d.Name == "Book"))
                {
                    var title = book.XPath2SelectElement(titleXPath).Value;
                    var writer = book.XPath2SelectElement(writerXPath).Value;
                    yield return new KeyValuePair<string, string>(title, writer);
                }
            }
        }
    }
    

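    My old answer follows.

    I used a dirty fix: in my XPath I replace "current()" with the actual value. That way the current() function behaves much as it does in the XSLT standard.

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Globalization;
    using System.Linq;
    using System.Text;
    using System.Xml.Linq;
    using System.Xml.XPath;

    class MyClass
    {
    
        // These are dynamically defined elsewhere.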
        const string personXPath = "/Doc/Person";
        const string nameXPath = "/Doc/Name[@id=current()/@nameid]"; 
        XElement _node;
    
        void ParseXDocument(XDocument doc)
        {
            foreach (var personElement in doc.XPathSelectElements(personXPath))
            {
                _node = personElement; // my actual code is a bit cleaner
                var nameElement = personElement.XPathSelectElement(PreParse(nameXPath));
                Console.WriteLine(nameElement.Value);
            }
        }
    
        /// <summary>
        /// Pre-evaluates calls to current()
        /// </summary>
        /// <param name="xpath"></param>
        /// <returns></returns>
        private string PreParse(string xpath)
        {
            var sb = new StringBuilder();
            foreach (var part in Tokenize(xpath))
            {
                if (part.Trim().StartsWith("current()"))
                {
                    var query = part.Replace("current()", ".");
                    sb.Append("'")
                        .Append(EvaluateXPath(query))
                        .Append("'");
                }
                else
                {
                    sb.Append(part);
                }
            }
            return sb.ToString();
        }
    
        private IEnumerable<string> Tokenize(string path)
        {
            var begin = 0;
            for (var i = 0; i < path.Length; i++)
            {
                if ("[=]".Contains(path[i]))
                {
                    yield return path.Substring(begin, i - begin);
                    yield return path[i].ToString();
                    begin = i + 1;
                }
            }
            yield return path.Substring(begin);
        }
    
        private string EvaluateXPath(string xpath)
        {
            var result = _node.XPathEvaluate(xpath);
            if (result is IEnumerable && !(result is string)) // a string result must not be iterated char by char
                foreach (var node in (IEnumerable)result)
                    return (node as XElement)?.Value ?? (node as XAttribute).Value;
            return string.Format(CultureInfo.InvariantCulture, "{0}", result);
        }
    }
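
Both PreParse above and the string.Format call in Solution 1 wrap the substituted value in single quotes, so a value that itself contains an apostrophe would produce an invalid expression. XPath 1.0 string literals have no escape character; a common workaround is to build the literal with concat(). The helper below is a minimal sketch of that workaround; XPathLiteral.Quote is a hypothetical name, not part of .NET or of the XPath2 package.

```csharp
using System;
using System.Linq;

public static class XPathLiteral
{
    // Returns an XPath 1.0 expression that evaluates to the given string.
    public static string Quote(string value)
    {
        // No apostrophe: a single-quoted literal is safe.
        if (!value.Contains("'")) return "'" + value + "'";

        // Apostrophes but no double quotes: use a double-quoted literal.
        if (!value.Contains("\"")) return "\"" + value + "\"";

        // Both kinds of quote: split on apostrophes and rejoin the pieces
        // with concat(), inserting each apostrophe as the literal "'".
        var parts = value.Split('\'').Select(p => "'" + p + "'");
        return "concat(" + string.Join(", \"'\", ", parts) + ")";
    }
}
```

In PreParse, the three Append calls that add the quotes and the evaluated value could then be replaced by a single sb.Append(XPathLiteral.Quote(EvaluateXPath(query))).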
    

    [Comments]:
