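【Published】:2021-12-22 06:35:18
【Question】:
I am building a .NET 5 application that scrapes RSS feeds, and I want to avoid custom string-parsing logic. Instead, I want to deserialize the XML directly into C# objects. I have done this once before: I used xsd.exe to generate a schema file and then generated the .cs file from it. This time, however, that approach does not work. Here is what I am trying to scrape: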
<rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<item>
<title>Fire kills four newborn babies at children's hospital in India</title>
<link>http://news.sky.com/story/india-fire-kills-four-newborn-babies-at-childrens-hospital-in-madhya-pradesh-12464344</link>
<description>Four newborn babies have died after a fire broke out at a children's hospital in India, officials said.</description>
<pubDate>Tue, 09 Nov 2021 07:51:00 +0000</pubDate>
<guid>http://news.sky.com/story/india-fire-kills-four-newborn-babies-at-childrens-hospital-in-madhya-pradesh-12464344</guid>
<enclosure url="https://e3.365dm.com/21/11/70x70/skynews-india-fire-childrens-hospital_5577072.jpg?20211109081515" length="0" type="image/jpeg" />
<media:description type="html">A man carries a child out from the Kamla Nehru Children’s Hospital after a fire in the newborn care unit of the hospital killed four infants, in Bhopal, India, Monday, Nov. 8, 2021. There were 40 children in total in the unit, out of which 36 have been rescued, said Medical Education Minister Vishwas Sarang. (AP Photo) </media:description>
<media:thumbnail url="https://e3.365dm.com/21/11/70x70/skynews-india-fire-childrens-hospital_5577072.jpg?20211109081515" width="70" height="70" />
<media:content type="image/jpeg" url="https://e3.365dm.com/21/11/70x70/skynews-india-fire-childrens-hospital_5577072.jpg?20211109081515" />
...
</item>
</channel>
</rss>
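So far I have tried xsd.exe and this online tool: https://xmltocsharp.azurewebsites.net/. Both run into trouble with the <description> and <media:description> tags. Each tries to create a second "description" element inside item and fails:
- xsd.exe fails and generates no classes unless I delete one of the two elements.
- The online tool generates classes, but they fail when I try to instantiate an XmlSerializer with them.
I can see that there are two description tags, but one of them is defined in the media namespace. As far as xsd and .NET are concerned, these tags seem to map to the same property, which is obviously a problem. Is this invalid XML, or is there some limitation in these tools that prevents a successful mapping? Is there any workaround other than string parsing?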
【Discussion】:
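The XML itself is well-formed: <description> has no namespace, while <media:description> belongs to http://search.yahoo.com/mrss/, so they are two different element names. One possible workaround, offered only as a minimal sketch, is to write the classes by hand and set the Namespace on the XmlElement attribute of the media property. The class names (Rss, Channel, Item, MediaDescription, Ns) are hypothetical, and the inline XML is a trimmed stand-in for the feed in the question, not the real response.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical holder for the Media RSS namespace URI taken from the feed's xmlns:media.
public static class Ns
{
    public const string Media = "http://search.yahoo.com/mrss/";
}

[XmlRoot("rss")]
public class Rss
{
    [XmlAttribute("version")] public string Version { get; set; }
    [XmlElement("channel")] public Channel Channel { get; set; }
}

public class Channel
{
    // Repeated <item> elements map straight onto the list, with no wrapper element.
    [XmlElement("item")] public List<Item> Items { get; set; }
}

public class Item
{
    [XmlElement("title")] public string Title { get; set; }

    // Plain RSS <description>: no namespace.
    [XmlElement("description")] public string Description { get; set; }

    // <media:description>: same local name, but qualified with the media namespace,
    // so it needs its own property name and an explicit Namespace.
    [XmlElement("description", Namespace = Ns.Media)]
    public MediaDescription MediaDescription { get; set; }
}

public class MediaDescription
{
    [XmlAttribute("type")] public string Type { get; set; }
    [XmlText] public string Text { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Trimmed, made-up sample that mirrors the structure of the feed in the question.
        const string xml = @"<rss xmlns:media=""http://search.yahoo.com/mrss/"" version=""2.0"">
  <channel>
    <item>
      <title>Example title</title>
      <description>Plain description</description>
      <media:description type=""html"">Media description</media:description>
    </item>
  </channel>
</rss>";

        var serializer = new XmlSerializer(typeof(Rss));
        using var reader = new StringReader(xml);
        var rss = (Rss)serializer.Deserialize(reader);

        var item = rss.Channel.Items[0];
        Console.WriteLine(item.Description);           // value of <description>
        Console.WriteLine(item.MediaDescription.Text); // value of <media:description>
    }
}
```

The remaining elements (link, pubDate, guid, enclosure, media:thumbnail, media:content) could presumably be mapped the same way, with Namespace = Ns.Media on each media: element.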