【问题标题】:Nested Shift/reduce conflict in bison?野牛中的嵌套移位/减少冲突?
【发布时间】:2021-12-24 22:00:24
【问题描述】:

我是新手,我试图了解这里发生了什么,我得到了两个班次减少冲突。我有语法(如果我遗漏了什么,我可以添加所需的规则):

%start translation_unit

%token EOF      0               "end of file"

%token                      LBRA        "["
%token                      RBRA        "]"
%token                      LPAR        "("
%token                      RPAR        ")"
%token                      DOT         "."
%token                      ARROW       "->"
%token                      INCDEC      "++ or --"
%token                      COMMA       ","
%token                      AMP         "&"
%token                      STAR        "*"
%token                      ADDOP       "+ or -"
%token                      EMPH        "!"
%token                      DIVOP       "/ or %"
%token                      CMPO        "<, >, <=, or >="
%token                      CMPE        "== or !="
%token                      DAMP        "&&"
%token                      DVERT       "||"
%token                      ASGN        "="
%token                      CASS        "*=, /=, %=, +=, or -="
%token                      SEMIC       ";"
%token                      LCUR        "{"
%token                      RCUR        "}"
%token                      COLON       ":"

%token                      TYPEDEF     "typedef"
%token                      VOID        "void"
%token                      ETYPE       "_Bool, char, or int"
%token                      STRUCT      "struct"
%token                      ENUM        "enum"
%token                      CONST       "const"
%token                      IF          "if"
%token                      ELSE        "else"
%token                      DO          "do"
%token                      WHILE       "while"
%token                      FOR         "for"
%token                      GOTO        "goto"
%token                      CONTINUE    "continue"
%token                      BREAK       "break"
%token                      RETURN      "return"
%token                      SIZEOF      "sizeof"

%token<string>              IDF         "identifier"
%token<string>              TYPEIDF     "type identifier"
%token<int>                 INTLIT      "integer literal"
%token<string>              STRLIT      "string literal"

/////////////////////////////////

%%

/////////////////////////////////

primary_expression:
    IDF
    | INTLIT
    | STRLIT
    | LPAR expression RPAR
    ;

postfix_expression:
    primary_expression
    | postfix_expression LBRA expression RBRA
    | postfix_expression LPAR argument_expression_list_opt RPAR
    | postfix_expression DOT IDF
    | postfix_expression ARROW IDF
    | postfix_expression INCDEC
    ;

argument_expression_list_opt:
    %empty
    | argument_expression_list
    ;

argument_expression_list:
    assignment_expression
    | argument_expression_list COMMA assignment_expression
    ;

unary_expression:
    postfix_expression
    | INCDEC unary_expression
    | unary_operator cast_expression
    | SIZEOF LPAR type_name RPAR
    ;

unary_operator:
    AMP
    | STAR
    | ADDOP
    | EMPH
    ;

cast_expression:
    unary_expression
    ;

multiplicative_expression:
    cast_expression
    | multiplicative_expression STAR cast_expression
    | multiplicative_expression DIVOP cast_expression
    ;

additive_expression:
    multiplicative_expression
    | additive_expression ADDOP multiplicative_expression
    ;

relational_expression:
    additive_expression
    | relational_expression CMPO additive_expression
    ;

equality_expression:
    relational_expression
    | equality_expression CMPE relational_expression
    ;

logical_AND_expression:
    equality_expression
    | logical_AND_expression DAMP equality_expression
    ;

logical_OR_expression:
    logical_AND_expression
    | logical_OR_expression DVERT logical_AND_expression
    ;

assignment_expression:
    logical_OR_expression
    | unary_expression assignment_operator assignment_expression
    ;

assignment_operator:
    ASGN 
    | CASS
    ;

expression:
    assignment_expression
    ;

constant_expression:
    logical_OR_expression
    ;

declaration:
    no_leading_attribute_declaration
    ;

no_leading_attribute_declaration:
    declaration_specifiers init_declarator_list_opt SEMIC
    ;

init_declarator_list_opt:
    %empty
    | init_declarator_list
    ;

declaration_specifiers:
    declaration_specifier
    | declaration_specifier declaration_specifiers
    ;

declaration_specifier:
    storage_class_specifier
    | type_specifier_qualifier
    ;

init_declarator_list:
    init_declarator
    | init_declarator_list COMMA init_declarator
    ;

init_declarator:
    declarator
    ;

storage_class_specifier:
    TYPEDEF
    ;

type_specifier:
    VOID
    | ETYPE
    | struct_or_union_specifier
    | enum_specifier
    | typedef_name
    ;

struct_or_union_specifier:
    struct_or_union IDF LCUR member_declaration_list RCUR
    | struct_or_union IDF
    ;

struct_or_union:
    STRUCT
    ;

member_declaration_list:
    member_declaration
    | member_declaration_list member_declaration
    ;

member_declaration:
    specifier_qualifier_list member_declarator_list_opt SEMIC
    ;

member_declarator_list_opt:
    %empty
    | member_declarator_list
    ;

specifier_qualifier_list:
    type_specifier_qualifier
    | type_specifier_qualifier specifier_qualifier_list
    ;

type_specifier_qualifier:
    type_specifier
    | type_qualifier
    ;

member_declarator_list:
    member_declarator
    | member_declarator_list COMMA member_declarator
    ;

member_declarator:
    declarator
    ;

enum_specifier:
    ENUM IDF LCUR enumerator_list RCUR
    | ENUM IDF LCUR enumerator_list COMMA RCUR
    | ENUM IDF
    ;

enumerator_list:
    enumerator
    | enumerator_list COMMA enumerator
    ;

enumerator:
    ENUM
    | ENUM ASGN constant_expression
    ;

type_qualifier:
    CONST
    ;

pointer:
    STAR type_qualifier_list_opt
    | STAR type_qualifier_list_opt pointer
    ;

pointer_opt:
    %empty
    | pointer
    ;

declarator:
    pointer_opt direct_declarator
    ;

direct_declarator:
    IDF
    | LPAR declarator RPAR
    | array_declarator
    | function_declarator
    ;

abstract_declarator:
    pointer
    | pointer_opt direct_abstract_declarator
    ;

direct_abstract_declarator:
    LPAR abstract_declarator RPAR
    | array_abstract_declarator
    | function_abstract_declarator
    ;

direct_abstract_declarator_opt:
    %empty
    | direct_abstract_declarator
    ;

array_abstract_declarator:
    direct_abstract_declarator_opt LBRA assignment_expression RBRA
    ;

function_abstract_declarator:
    direct_abstract_declarator_opt LPAR parameter_type_list RPAR
    ;

array_declarator:
    direct_declarator LBRA assignment_expression RBRA
    ;

function_declarator:
    direct_declarator LPAR parameter_type_list RPAR
    ;

type_qualifier_list_opt:
    %empty
    | type_qualifier_list
    ;

type_qualifier_list:
    type_qualifier
    | type_qualifier_list type_qualifier
    ;

parameter_type_list:
    parameter_list
    ;

parameter_list:
    parameter_declaration
    | parameter_list COMMA parameter_declaration
    ;

parameter_declaration:
    declaration_specifiers declarator
    | declaration_specifiers abstract_declarator_opt
    ;
abstract_declarator_opt:
    %empty
    | abstract_declarator
    ;

type_name:
    specifier_qualifier_list abstract_declarator_opt
    ;

typedef_name:
    TYPEIDF
    ;

statement:
    matched
    | unmatched
    | other
    ;

other:
    expression_statement
    | compound_statement
    | jump_statement
    ;

matched:
    selection_statement_c
    | iteration_statement_c
    ;

unmatched:
    selection_statement_o
    | iteration_statement_o
    ;

compound_statement:
    LCUR block_item_list_opt RCUR
    ;

block_item_list_opt:
    %empty
    | block_item_list
    ;

block_item_list:
    block_item
    | block_item_list block_item
    ;

block_item:
    declaration
    | statement
    ;

expression_statement:
    expression_opt SEMIC
    ;

expression_opt:
    %empty
    | expression
    ;

selection_statement_o:
    IF LPAR expression RPAR statement
    | IF LPAR expression RPAR matched ELSE unmatched
    ;

selection_statement_c:
    IF LPAR expression RPAR matched ELSE matched
    ;

iteration_statement_o:
    WHILE LPAR expression RPAR unmatched
    | FOR LPAR expression_opt SEMIC expression_opt SEMIC expression_opt RPAR unmatched
    ;

iteration_statement_c:
    WHILE LPAR expression RPAR matched
    | DO statement WHILE LPAR expression RPAR SEMIC
    | FOR LPAR expression_opt SEMIC expression_opt SEMIC expression_opt RPAR matched
    ;

jump_statement:
    RETURN expression_opt SEMIC
    ;

translation_unit:
    external_declaration
    | translation_unit external_declaration
    ;

external_declaration:
    function_definition
    | declaration
    ;

function_definition:
    declaration_specifiers declarator compound_statement
    ;

/////////////////////////////////

%%

我遇到了这种转变/减少冲突:

State 156

   85 declarator: pointer_opt . direct_declarator
   91 abstract_declarator: pointer_opt . direct_abstract_declarator

    "("           shift, and go to state 187
    "identifier"  shift, and go to state 44

    "("       [reduce using rule 111 (direct_abstract_declarator_opt)]
    ^^conflict with previous "(" ^^
    $default  reduce using rule 111 (direct_abstract_declarator_opt)

...

State 197

   91 abstract_declarator: pointer_opt . direct_abstract_declarator

    "("  shift, and go to state 213
    "("       [reduce using rule 111 (direct_abstract_declarator_opt)]
    ^^conflict^^

    $default  reduce using rule 111 (direct_abstract_declarator_opt)


所以我将其理解为:当我得到“(”时,我可以做两件事。首先,从 direct_declarator 我得到 LPAR 声明符 RPAR,其中 LPAR 是“(”
... 或 ...
我可以减少 direct_abstract_declarator->array_abstract_declarator->direct_abstract_declarator_opt->direct_abstract_declarator->LPAR abstract_declarator RPAR,其中 LPAR 再次为“(”。所以我无法决定走哪条路正确?
我应该如何解决这个冲突?还是我完全看错了?

【问题讨论】:

  • LR 解析器在产生式的结束而不是开始时做出决定。这就是为什么它比自顶向下解析器更强大的原因。当解析器不知道它是否处于生产结束时,就会发生 shift-reduce 冲突,因为其他一些活动的生产可能会继续。
  • 你不包括可以通过野牛传递的语法。您没有提供状态图的完整文本。是什么让您认为任何人都可以为您的问题提供合理的答案?请阅读创建minimal reproducible example 的帮助;如果您的回答是“我的程序太大而无法包含在此处”,那么再读一遍,因为这不是它所要求的。
  • 如果您包含可重现的语法,我可以在其上运行 bison 并生成整个状态表。这里的关键是reproducible。可重现意味着我可以将它喂给野牛而不用编辑它。这意味着你不会仅仅因为你认为它不重要而遗漏一些东西。 Bison 需要声明所有标记(尽管您可以通过使用不需要声明的带引号的标记使事情变得更简单)。很抱歉脾气暴躁,但我最近意识到,虽然我总是要求 minimal reproducible example 并且 SO 帮助清楚地说明了那是什么,没有人遵守 .没人。
  • 我通常会猜测遗漏了什么并将其添加回来以便能够运行野牛,这确实是浪费时间。请关注minimal reproducible example 中的两个关键概念:minimalcomplete。真的不难做到。例如,如果非终端没有导致问题,只需将其设为令牌(除非它可以为空,在这种情况下您需要添加可选规则)。
  • 这不仅会帮助我们,还会帮助您。在尝试隔离问题的过程中,您很可能最终自己解决了问题。所以这是了解事物运作方式的好方法。

标签: compiler-construction bison shift-reduce-conflict shift-reduce


【解决方案1】:

创建 shift-reduce 冲突的最简单方法之一是引入一个表示可选实例的新非终结符:

something_opt
    : something
    | %empty

有时这会起作用,但更常见的是,在某些情况下something 可能是可选的,也可能不是可选的。换句话说,您有两个使用something 的不同产品:

a: something something_else "END_A"
b: something_opt something_else "END_B"

这不是模棱两可的,但它不能用一个标记前瞻来解析,因为在识别出something 之后,解析器必须决定是否将其缩减为something_opt,如果以下文本变成这样,这将是必要的成为bsomething_opt 在语义上基本上没有意义;有something 或没有something。如果没有那个细节,解析器就没有问题;它可以继续解析something_else,然后根据遇到END_AEND_B,它可以决定是减少a还是b

解决方法很简单:不用定义something_opt,而是简单地复制产生式,这样一个产生式带有something,另一个没有。

在你的情况下,有问题的可选非终端是direct_abstract_declarator_opt

direct_abstract_declarator_opt:
    %empty
    | direct_abstract_declarator
 
array_abstract_declarator:
    direct_abstract_declarator_opt LBRA assignment_expression RBRA

function_abstract_declarator:
    direct_abstract_declarator_opt LPAR parameter_type_list RPAR

这些是唯一使用它的地方,所以我们可以通过复制其他两个非终端的产品来摆脱它:

array_abstract_declarator:
    direct_abstract_declarator LBRA assignment_expression RBRA
    | LBRA assignment_expression RBRA

function_abstract_declarator:
    direct_abstract_declarator LPAR parameter_type_list RPAR
    | LPAR parameter_type_list RPAR

如果可能的话,你应该使用这个模型来编写带有可选非终结符的语法。它确实需要复制语义动作,但希望语义动作只是对 AST 节点构建器的简单调用。它会为你省去很多不必要的悲伤。

【讨论】:

  • 非常感谢您抽出宝贵时间并愿意帮助我。感谢您的解释和回答。我很高兴,在我成功纠正并完成我的语法后,我来到这里,看到它和你的解决方案一样。 :) 再次感谢您的宝贵时间和建议。
猜你喜欢
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
  • 2021-11-20
  • 1970-01-01
  • 1970-01-01
  • 2013-07-09
  • 1970-01-01
  • 1970-01-01
相关资源
最近更新 更多