yacc: conflicts: 1 reduce/reduce

【Question title】: yacc: conflicts: 1 reduce/reduce
【Posted】: 2020-05-09 20:24:56
【Question】:

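To learn Lex/Yacc, I am writing a CSV parser that follows the grammar specified on page 3 of RFC 4180.

I have run into a reduce/reduce conflict and I don't know how to proceed. It appears to be a conflict between rule 1 and rule 3 of my grammar, but I don't know of any other way to describe a CSV file that may or may not have a line break after its last record. Also, the reduce/reduce conflict disappears when I remove rule 10 (the empty-field rule); however, I need to handle empty fields.

What is wrong with my grammar, and how should I fix it?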

Yacc source

%token COMMA
%token DQUOTE
%token CRLF
%token TEXTDATA

%%

file: records CRLF
    | records;

records: records CRLF record
       | record;

record: fields;

fields: fields COMMA field
      | field;

field: DQUOTE escaped DQUOTE
     | TEXTDATA
     | ;

escaped: escaped TEXTDATA
       | escaped COMMA
       | escaped CRLF
       | escaped DQUOTE DQUOTE
       | TEXTDATA
       | COMMA
       | CRLF
       | DQUOTE DQUOTE;

`yacc -v` output

State 14 conflicts: 1 reduce/reduce


Grammar

    0 $accept: file $end

    1 file: records CRLF
    2     | records

    3 records: records CRLF record
    4        | record

    5 record: fields

    6 fields: fields COMMA field
    7       | field

    8 field: DQUOTE escaped DQUOTE
    9      | TEXTDATA
   10      | /* empty */

   11 escaped: escaped TEXTDATA
   12        | escaped COMMA
   13        | escaped CRLF
   14        | escaped DQUOTE DQUOTE
   15        | TEXTDATA
   16        | COMMA
   17        | CRLF
   18        | DQUOTE DQUOTE


Terminals, with rules where they appear

$end (0) 0
error (256)
COMMA (258) 6 12 16
DQUOTE (259) 8 14 18
CRLF (260) 1 3 13 17
TEXTDATA (261) 9 11 15


Nonterminals, with rules where they appear

$accept (7)
    on left: 0
file (8)
    on left: 1 2, on right: 0
records (9)
    on left: 3 4, on right: 1 2 3
record (10)
    on left: 5, on right: 3 4
fields (11)
    on left: 6 7, on right: 5 6
field (12)
    on left: 8 9 10, on right: 6 7
escaped (13)
    on left: 11 12 13 14 15 16 17 18, on right: 8 11 12 13 14


state 0

    0 $accept: . file $end

    DQUOTE    shift, and go to state 1
    TEXTDATA  shift, and go to state 2

    $default  reduce using rule 10 (field)

    file     go to state 3
    records  go to state 4
    record   go to state 5
    fields   go to state 6
    field    go to state 7


state 1

    8 field: DQUOTE . escaped DQUOTE

    COMMA     shift, and go to state 8
    DQUOTE    shift, and go to state 9
    CRLF      shift, and go to state 10
    TEXTDATA  shift, and go to state 11

    escaped  go to state 12


state 2

    9 field: TEXTDATA .

    $default  reduce using rule 9 (field)


state 3

    0 $accept: file . $end

    $end  shift, and go to state 13


state 4

    1 file: records . CRLF
    2     | records .
    3 records: records . CRLF record

    CRLF  shift, and go to state 14

    $default  reduce using rule 2 (file)


state 5

    4 records: record .

    $default  reduce using rule 4 (records)


state 6

    5 record: fields .
    6 fields: fields . COMMA field

    COMMA  shift, and go to state 15

    $default  reduce using rule 5 (record)


state 7

    7 fields: field .

    $default  reduce using rule 7 (fields)


state 8

   16 escaped: COMMA .

    $default  reduce using rule 16 (escaped)


state 9

   18 escaped: DQUOTE . DQUOTE

    DQUOTE  shift, and go to state 16


state 10

   17 escaped: CRLF .

    $default  reduce using rule 17 (escaped)


state 11

   15 escaped: TEXTDATA .

    $default  reduce using rule 15 (escaped)


state 12

    8 field: DQUOTE escaped . DQUOTE
   11 escaped: escaped . TEXTDATA
   12        | escaped . COMMA
   13        | escaped . CRLF
   14        | escaped . DQUOTE DQUOTE

    COMMA     shift, and go to state 17
    DQUOTE    shift, and go to state 18
    CRLF      shift, and go to state 19
    TEXTDATA  shift, and go to state 20


state 13

    0 $accept: file $end .

    $default  accept


state 14

    1 file: records CRLF .
    3 records: records CRLF . record

    DQUOTE    shift, and go to state 1
    TEXTDATA  shift, and go to state 2

    $end      reduce using rule 1 (file)
    $end      [reduce using rule 10 (field)]
    $default  reduce using rule 10 (field)

    record  go to state 21
    fields  go to state 6
    field   go to state 7


state 15

    6 fields: fields COMMA . field

    DQUOTE    shift, and go to state 1
    TEXTDATA  shift, and go to state 2

    $default  reduce using rule 10 (field)

    field  go to state 22


state 16

   18 escaped: DQUOTE DQUOTE .

    $default  reduce using rule 18 (escaped)


state 17

   12 escaped: escaped COMMA .

    $default  reduce using rule 12 (escaped)


state 18

    8 field: DQUOTE escaped DQUOTE .
   14 escaped: escaped DQUOTE . DQUOTE

    DQUOTE  shift, and go to state 23

    $default  reduce using rule 8 (field)


state 19

   13 escaped: escaped CRLF .

    $default  reduce using rule 13 (escaped)


state 20

   11 escaped: escaped TEXTDATA .

    $default  reduce using rule 11 (escaped)


state 21

    3 records: records CRLF record .

    $default  reduce using rule 3 (records)


state 22

    6 fields: fields COMMA field .

    $default  reduce using rule 6 (fields)


state 23

   14 escaped: escaped DQUOTE DQUOTE .

    $default  reduce using rule 14 (escaped)

【Discussion】:

Tags: parsing grammar yacc lex text-parsing

【Answer 1】:

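If the input is, for example, TEXTDATA CRLF, it is unclear whether the parser should derive file -> records CRLF and then derive records to a single record, or derive file -> records and then derive records to two records, the second of which contains only an empty field. This is the conflict reported in state 14: on $end, the parser could reduce by either rule 1 (file) or rule 10 (field).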

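To avoid this ambiguity, you can remove the records CRLF alternative. Files ending in CRLF will still be accepted; they will simply be treated as having an empty field at the end, that is, a final record made up of a single empty field.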
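As a sketch of this first option, the grammar from the question with only the `records CRLF` alternative of `file` removed might look like the following. All other rules are copied from the question unchanged, and the comments are additions. Running `yacc -v` on it should no longer report the conflict, since the state reached after `records CRLF` no longer contains a completed `file` item competing with the empty `field` reduction.

```yacc
%token COMMA
%token DQUOTE
%token CRLF
%token TEXTDATA

%%

/* The "records CRLF" alternative is gone: a trailing CRLF is now
   parsed as a separator followed by one more (empty) record. */
file: records;

records: records CRLF record
       | record;

record: fields;

fields: fields COMMA field
      | field;

field: DQUOTE escaped DQUOTE
     | TEXTDATA
     | /* empty */;

escaped: escaped TEXTDATA
       | escaped COMMA
       | escaped CRLF
       | escaped DQUOTE DQUOTE
       | TEXTDATA
       | COMMA
       | CRLF
       | DQUOTE DQUOTE;
```

If the extra trailing record is unwanted, one possibility is to discard a final record consisting of a single empty field in the semantic actions or in later processing.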

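If that is not what you want, you will need to rewrite fields so that the last record is not allowed to be empty (and then keep the file: records CRLF production).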

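PS: On an unrelated note, it seems to me that you should move some of the parsing work into the lexer, in particular the part that parses the contents of quoted strings. Something like "abc" is best handled by having the lexer turn it into a single token.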
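A minimal sketch of what such a lexer could look like in Lex follows. The token name `QUOTED` is hypothetical (it is not in the question and would need its own `%token` declaration), the `TEXTDATA` pattern is deliberately broader than the character ranges RFC 4180 lists, and the header name assumes the parser was generated with `yacc -d`.

```lex
%{
/* Token codes from the header that `yacc -d` writes by default. */
#include "y.tab.h"
%}

%%

\"([^"]|\"\")*\"   { /* whole quoted field; "" inside it is an escaped quote */ return QUOTED; }
","                { return COMMA; }
\r\n               { return CRLF; }
[^",\r\n]+         { /* assumption: any run of non-special characters */ return TEXTDATA; }
```

With a lexer along these lines, the `escaped` rules would no longer be needed in the grammar, and `field` could reduce to something like `QUOTED | TEXTDATA | /* empty */`. The action for `QUOTED` would still have to strip the surrounding quotes and collapse doubled quotes if the field's value is needed, and the program must supply `yywrap` (or link against the lex library).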
