【问题标题】:Using Regex to Clean up JSON String使用正则表达式清理 JSON 字符串
【发布时间】:2020-07-20 22:37:13
【问题描述】:

我目前有以下 JSON 字符串:

'{"ECommerce ":{" Shopify ":," Magento ":," WooCommerce ":," Squarespace ":},"Tools ":{" Grunt ":," Gulp ":," Vagrant ":},"Containers ":{" LXC ":," Docker ":," Rocket ":},"Digital ":{" SEO ":," Email Marketing ":," Inbound Marketing ":," Crowdfunding ":," Content Distribution ":," Display Advertising ":," Ad Planning and Buying ":," Article Writing ":," SEM ":," Customer Relationship Management ":," Viral Marketing ":," Market Research ":," Social Media ":," Affiliate Marketing ":," Lead Generation ":},"Performance ":{" LoadStorm ":," httperf ":," JMeter ":," LoadUI ":," Blazemeter ":," LoadImpact ":," Nouvola ":," LoadRunner ":," Soasta CloudTest ":},

它以某种方式在 {} 中混合了分号、引号和一个额外的大括号。我想摆脱这些,所以我可以将它转换为 Python 字典,我的问题是,有没有办法使用正则表达式来摆脱无关字符(所以 ": 和 { 个字符)在这些括号 {} 中找到(以便在第一个键“ECommerce”之后留下第一个分号)。

我已将我认为会引发 JSONDecodeError 的字符加粗:

'{"电子商务":{" Shopify ":," Magento ":," WooCommerce ":," Squarespace ":}

如果这不可能,我可以使用哪些其他方法来解决这个问题?

谢谢!

【问题讨论】:

  • 为什么不修复源代码以便它生成有效的 JSON。你为清理它所做的任何事情都不可靠。
  • 请为您的示例字符串显示所需的结果。如果可能,请在保留其结构的同时将示例字符串的长度减至最少。
  • 好吧,只要看看那个 JSON,我就可以告诉你,像 " Shopify ":, 这样的东西需要一个冒号所暗示的值,而这个值不存在。你会在那里得到一个错误。我会按照@Barmar 的建议去做,这首先解决了为什么它会给出错误的 JSON。
  • 看一眼就很容易做到,因为人类擅长模式匹配。但是要对其进行编程,您必须提出与您想要修复的所有内容相匹配的具体规则,并且与应该单独处理的内容不匹配。这很难。

标签: json regex


【解决方案1】:

const string = '{"ECommerce ":{" Shopify ":," Magento ":," WooCommerce ":," Squarespace ":},"Tools ":{" Grunt ":," Gulp ":," Vagrant ":},"Containers ":{" LXC ":," Docker ":," Rocket ":},"Digital ":{" SEO ":," Email Marketing ":," Inbound Marketing ":," Crowdfunding ":," Content Distribution ":," Display Advertising ":," Ad Planning and Buying ":," Article Writing ":," SEM ":," Customer Relationship Management ":," Viral Marketing ":," Market Research ":," Social Media ":," Affiliate Marketing ":," Lead Generation ":},"Performance ":{" LoadStorm ":," httperf ":," JMeter ":," LoadUI ":," Blazemeter ":," LoadImpact ":," Nouvola ":," LoadRunner ":," Soasta CloudTest ":},';

const json = string
  .replace(/ /g, '') // remove excess spaces
  .replace(/(?!^){/g, '[') // replace braces (except the first) with brackets
  .replace(/}/g, ']') // replace closing braces with brackets
  .replace(/:]/g, ']') // remove erroneous colons before brackets
  .replace(/:,/g, ',') // remove erroneous colons before commas
  .replace(/.$/, '}'); // replace last comma with bracket

console.log(JSON.parse(json));

【讨论】:

    猜你喜欢
    • 1970-01-01
    • 2018-04-19
    • 2011-01-15
    • 2020-11-06
    • 1970-01-01
    • 1970-01-01
    • 2018-04-06
    • 2015-11-16
    • 1970-01-01
    相关资源
    最近更新 更多