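[Question title]: Complex assignments with comma separator
[Posted]: 2021-08-30 20:12:04
[Question description]:

I have a series of strings that will be passed to a function, and the function must return an array. Each string is a list of variables to be exported in bash, and some of the values may be JSON. Here are the possible strings as examples, with the expected results: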

| string | return | desc |
| --- | --- | --- |
| `ONE=one` | `[ "ONE=one" ]` | Array of one element |
| `ONE="{}"` | `[ 'ONE="{}"' ]` | Array of one element with quoted value |
| `ONE='{}'` | `[ "ONE='{}'" ]` | Array of one element with single-quoted value |
| `ONE='{attr: \"value\"}'` | `[ "ONE='{attr: \\"value\\"}'" ]` | Array of one element |
| `ONE='{attr1: \"value\", attr2:\"value attr 2\"}'` | `[ "ONE='{attr1: \\"value\\", attr2:\\"value attr 2\\"}'" ]` | Array of one element with JSON inside holding multiple values |
| `ONE=one,TWO=two` | `[ "ONE=one", "TWO=two" ]` | Array of two elements |
| `ONE=one, TWO=two` | `[ "ONE=one", "TWO=two" ]` | Array of two elements (ignoring the space after the comma) |
| `ONE='{}', TWO=two` | `[ "ONE='{}'", "TWO=two" ]` | Array of two elements, one quoted |
| `ONE='{}',TWO='{}',THREE='{}'` | `[ "ONE='{}'", "TWO='{}'", "THREE='{}'" ]` | Array of three elements |
| `ONE='{}', TWO=two, THREE=three` | `[ "ONE='{}'", "TWO=two", "THREE=three" ]` | Array of three elements, one quoted |

How can I get the right regex or procedure to obtain the expected result for each of these?

This is what I have:

    function parseVars(envString) {
        let matches = envString.matchAll(/([A-Za-z][A-Za-z0-9]+=(["']?)((?:\\\2|(?:(?!\2)).)*)(\2))(\,\s?)?/g);
        let ret = [];
        for (const match of matches) {
            ret.push(match[1].trim())
        }
        return ret;
    }

And the tests:

    describe("parseVars function", () => {
        it("should be one simple variable", () => {
            expect(parseVars("ONE=one")).toMatchObject([
                "ONE=one"
            ]);
        });
        it("should be two simple variable", () => {
            expect(parseVars("ONE=one,TWO=two")).toMatchObject([
                "ONE=one",
                "TWO=two"
            ]);
        });
        it("should be two simple variable (Trim space)", () => {
            expect(parseVars("ONE=one, TWO=two")).toMatchObject([
                "ONE=one",
                "TWO=two"
            ]);
        });
        it("should be simple json", () => {
            expect(parseVars("ONE='{}'")).toMatchObject([
                "ONE='{}'",
            ]);
        });
        it("should be three simple json", () => {
            expect(parseVars("ONE='{}',TWO='{}',THREE='{}'")).toMatchObject([
                "ONE='{}'",
                "TWO='{}'",
                "THREE='{}'",
            ]);
        });
        it("should be three simple json (Simple quote)", () => {
            expect(parseVars("ONE='{}'")).toMatchObject([
                "ONE='{}'",
            ]);
        });
        it("should be three simple json with attribute", () => {
            expect(parseVars("ONE='{attr: \"value\"}'")).toMatchObject([
                "ONE='{attr: \"value\"}'",
            ]);
        });
        it("should be complex json with multiple attributes", () => {
            expect(parseVars("ONE='{attr1: \"value\", attr2:\"value attr 2\"}'")).toMatchObject([
                "ONE='{attr1: \"value\", attr2:\"value attr 2\"}'",
            ]);
        });
    
        it("should be one json and one simple var", () => {
            expect(parseVars("ONE='{}', TWO=two")).toMatchObject([
                "ONE='{}'",
                "TWO=two",
            ]);
        });
        it("should be one json and two simple vars", () => {
            expect(parseVars("ONE='{}', TWO=two, THREE=three")).toMatchObject([
                "ONE='{}'",
                "TWO=two",
                "THREE=three",
            ]);
        });
    });

Results:

parseVars function
    ✕ should be one simple variable (4ms)
    ✕ should be two simple variable (1ms)
    ✕ should be two simple variable (Trim space)
    ✓ should be simple json (1ms)
    ✓ should be three simple json
    ✓ should be three simple json (Simple quote)
    ✓ should be three simple json with attribute
    ✓ should be complex json with multiple attributes
    ✕ should be one json and one simple var (1ms)
    ✕ should be one json and two simple vars (1ms)

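[Comments]:

  • It may be easier to start with several regular expressions, each able to detect and handle one row of the table you provided. With that approach you can fold similar tests together as commonalities emerge. In the worst case, you can join the various expressions together.
  • envString.split(/,\s*(?=[A-Z]+=)/) works for your current test cases, but it is not safe for input such as A=1, B=', C=3'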
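The split-based suggestion in the last comment can be tried out directly. The snippet below is only a sketch: `splitVars` is a hypothetical helper name, not something from the question, and the second call reproduces the unsafe input mentioned in the comment.

```javascript
// Hypothetical helper: split on commas that are followed by an upper-case NAME=
function splitVars(envString) {
    return envString.split(/,\s*(?=[A-Z]+=)/);
}

// A case from the question: the commas sit between assignments, so this works.
console.log(splitVars("ONE='{}', TWO=two, THREE=three"));
// [ "ONE='{}'", 'TWO=two', 'THREE=three' ]

// The unsafe case: the quoted value of B contains ", C=", so it is cut in two.
console.log(splitVars("A=1, B=', C=3'"));
// [ 'A=1', "B='", "C=3'" ]
```

The lookahead only inspects what follows the comma, so it cannot tell whether that comma is inside a quoted value.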

Tags: javascript json regex


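[Solution 1]:

The problem with your regex is that it only handles quote-enclosed values such as ONE='{attr: \"value\"}' and does not allow a bare value such as ONE=one.

When you use a capture group with an optional match such as (['"]?), the group still participates and captures an empty string when no quote is present. Combined with the negative lookahead (?!\2), this makes everything fail: a backreference to an empty capture matches at every position, so the lookahead rejects every character and the value part of the match stays empty.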
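The effect can be reproduced in isolation. The snippet below is a small sketch: `minimal` is a reduced pattern written for illustration (it is not part of the question), while `original` is the regex copied from the question.

```javascript
// Reduced illustration: optional quote in group 1, then "any char that is not \1".
const minimal = /^(['"]?)((?:(?!\1).)*)/;

// No quote: group 1 is empty, (?!\1) fails everywhere, so group 2 captures nothing.
console.log(JSON.stringify("one".match(minimal)[2]));   // ""

// With a quote: group 1 is ', so group 2 runs up to the closing quote.
console.log(JSON.stringify("'one'".match(minimal)[2])); // "one"

// The regex from the question shows the same symptom on unquoted values.
const original = /([A-Za-z][A-Za-z0-9]+=(["']?)((?:\\\2|(?:(?!\2)).)*)(\2))(\,\s?)?/g;
const captured = Array.from("ONE=one,TWO=two".matchAll(original), m => m[1]);
console.log(captured); // [ 'ONE=', 'TWO=' ]
```

This matches the failing tests: the variable name and the equal sign are found, but the unquoted value is dropped.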

You only need to combine the quote-enclosure test with |[^,]* so that it works in both cases.

Here is a simplified version of your concept:

/(?=\b[a-z])\w+=(?:(['"])(?:\\\1|(?!\1).)*\1|[^,]*)/gi
  • Explanation
(?=\b[a-z])\w+                  word characters, the first of which must be a letter
=                               equal sign
(?:                             non-capturing group
    (['"])(?:\\\1|(?!\1).)*\1   a quote enclosure (an escaped quote does not end it)
    |[^,]*                      or any run of characters that contains no comma
)

Proof:

const texts = [
    `ONE=one`,
    `ONE="{}"`,
    `ONE='{}'`,
    `ONE='{attr: \"value\"}'`,
    `ONE='{attr1: \"value\", attr2:\"value attr 2\"}'`,
    `ONE=one,TWO=two`,
    `ONE=one, TWO=two`,
    `ONE='{}', TWO=two`,
    `ONE='{}',TWO='{}',THREE='{}'`,
    `ONE='{}', TWO=two, THREE=three`
];

const regex = /(?=\b[a-z])\w+=(?:(['"])(?:\\\1|(?!\1).)*\1|[^,]*)/gi;

texts.forEach(text => {
  console.log(text, '=>', text.match(regex));
})
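As a sketch of how the regex above could be dropped back into the question's parseVars, the version below collects the full match of each assignment. Using matchAll with Array.from and trimming each match are choices made here for illustration, not something the answer prescribes.

```javascript
function parseVars(envString) {
    // Quoted value (with escaped quotes allowed) or a run of non-comma characters.
    const regex = /(?=\b[a-z])\w+=(?:(['"])(?:\\\1|(?!\1).)*\1|[^,]*)/gi;
    // m[0] is the whole "NAME=value" match; trim drops a trailing space before a comma.
    return Array.from(envString.matchAll(regex), m => m[0].trim());
}

console.log(parseVars("ONE=one, TWO=two"));
// [ 'ONE=one', 'TWO=two' ]
console.log(parseVars("ONE='{}', TWO=two, THREE=three"));
// [ "ONE='{}'", 'TWO=two', 'THREE=three' ]
console.log(parseVars("ONE='{attr1: \"value\", attr2:\"value attr 2\"}'"));
// [ `ONE='{attr1: "value", attr2:"value attr 2"}'` ]
```

Unlike String.prototype.match with the g flag, this returns an empty array rather than null when nothing matches.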

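[Comments]:

    [Solution 2]:

    You can also start the match with a char a-z followed by optional word chars. Then match either from an opening " or ' to its closing counterpart, or match everything except whitespace or a comma, without using lookarounds or capture groups.

    Use the /i flag for case-insensitive matching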

    \b[a-z]\w*=(?:"[^"\\]*(?:\\.[^"\\]*)*"|\'[^\'\\]*(?:\\.[^\'\\]*)*\'|[^\s,]+)
    

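    The pattern matches:

    • \b A word boundary to prevent a partial word match
    • [a-z]\w*= Match a char a-z, optional word chars and =
    • (?: Non-capturing group
      • "[^"\\]*(?:\\.[^"\\]*)*" Match from an opening " to the closing ", not stopping at an escaped one
      • | Or
      • \'[^\'\\]*(?:\\.[^\'\\]*)*\' Match from an opening ' to the closing ', not stopping at an escaped one
      • | Or
      • [^\s,]+ Match 1+ times any char except a whitespace char or ,
    • ) Close the non-capturing group

    See the Regex demo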

    const regex = /\b[a-z]\w*=(?:"[^"\\]*(?:\\.[^"\\]*)*"|\'[^\'\\]*(?:\\.[^\'\\]*)*\'|[^\s,]+)/gi;
    [
      `ONE=one`,
      `ONE="{}"`,
      `ONE='{}'`,
      `ONE='{attr: \"value\"}'`,
      `ONE="{attr: \"value\"}"`,
      `ONE='{attr1: \"value\", attr2:\"value attr 2\"}'`,
      `ONE=one,TWO=two`,
      `ONE=one, TWO=two`,
      `ONE='{}', TWO=two`,
      `ONE='{}',TWO='{}',THREE='{}'`,
      `ONE='{}', TWO=two, THREE=three`
    ].forEach(s => console.log(s.match(regex)))
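The two answers agree on every input from the question, but their unquoted branches differ: Solution 1 uses [^,]* and this one uses [^\s,]+. The sketch below compares them on two inputs that are not in the question and are assumed here purely for illustration; `extract`, `lookaroundVersion` and `unrolledVersion` are hypothetical names.

```javascript
const lookaroundVersion = /(?=\b[a-z])\w+=(?:(['"])(?:\\\1|(?!\1).)*\1|[^,]*)/gi;
const unrolledVersion = /\b[a-z]\w*=(?:"[^"\\]*(?:\\.[^"\\]*)*"|'[^'\\]*(?:\\.[^'\\]*)*'|[^\s,]+)/gi;

// Collect the full text of every match.
const extract = (text, regex) => Array.from(text.matchAll(regex), m => m[0]);

// Unquoted value containing a space: [^,]* keeps it, [^\s,]+ stops at the space.
console.log(extract("ONE=hello world,TWO=two", lookaroundVersion)); // [ 'ONE=hello world', 'TWO=two' ]
console.log(extract("ONE=hello world,TWO=two", unrolledVersion));   // [ 'ONE=hello', 'TWO=two' ]

// Empty value: [^,]* accepts zero characters, [^\s,]+ needs at least one.
console.log(extract("ONE=,TWO=two", lookaroundVersion)); // [ 'ONE=', 'TWO=two' ]
console.log(extract("ONE=,TWO=two", unrolledVersion));   // [ 'TWO=two' ]
```

Which behavior is preferable depends on whether unquoted values with spaces or empty values should be accepted.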

    [Comments]: