【发布时间】:2020-07-20 22:37:13
【问题描述】:
我目前有以下 JSON 字符串:
'{"ECommerce ":{" Shopify ":," Magento ":," WooCommerce ":," Squarespace ":},"Tools ":{" Grunt ":," Gulp ":," Vagrant ":},"Containers ":{" LXC ":," Docker ":," Rocket ":},"Digital ":{" SEO ":," Email Marketing ":," Inbound Marketing ":," Crowdfunding ":," Content Distribution ":," Display Advertising ":," Ad Planning and Buying ":," Article Writing ":," SEM ":," Customer Relationship Management ":," Viral Marketing ":," Market Research ":," Social Media ":," Affiliate Marketing ":," Lead Generation ":},"Performance ":{" LoadStorm ":," httperf ":," JMeter ":," LoadUI ":," Blazemeter ":," LoadImpact ":," Nouvola ":," LoadRunner ":," Soasta CloudTest ":},
它以某种方式在 {} 中混合了分号、引号和一个额外的大括号。我想摆脱这些,所以我可以将它转换为 Python 字典,我的问题是,有没有办法使用正则表达式来摆脱无关字符(所以 ": 和 { 个字符)在这些括号 {} 中找到(以便在第一个键“ECommerce”之后留下第一个分号)。
我已将我认为会引发 JSONDecodeError 的字符加粗:
'{"电子商务":{" Shopify ":," Magento ":," WooCommerce ":," Squarespace ":}
如果这不可能,我可以使用哪些其他方法来解决这个问题?
谢谢!
【问题讨论】:
-
为什么不修复源代码以便它生成有效的 JSON。你为清理它所做的任何事情都不可靠。
-
请为您的示例字符串显示所需的结果。如果可能,请在保留其结构的同时将示例字符串的长度减至最少。
-
好吧,只要看看那个 JSON,我就可以告诉你,像
" Shopify ":,这样的东西需要一个冒号所暗示的值,而这个值不存在。你会在那里得到一个错误。我会按照@Barmar 的建议去做,这首先解决了为什么它会给出错误的 JSON。 -
看一眼就很容易做到,因为人类擅长模式匹配。但是要对其进行编程,您必须提出与您想要修复的所有内容相匹配的具体规则,并且与应该单独处理的内容不匹配。这很难。