How to print the parse tree in Yacc (Bison)

【Question title】: How to print parser tree in Yacc (BISON)
【Posted】: 2012-03-14 05:05:27
【Question】:

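I built a parser for the C language using Bison and Flex. It runs in the terminal: if the given C input is syntactically wrong, it prints "syntax error"; otherwise it prints nothing.

But I want the parser to print the parse tree for the given C input as its output. How can I do that? Is there a feature in Bison that prints the parse tree?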

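【Comments】:

Do you mean that you want to print the AST for the expression you parsed? If so, you will need to implement that yourself. I am reasonably sure that neither yacc nor Bison will do any of it for you.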

Tags: parsing bison yacc

【Answer 1】:

The TXR language (http://www.nongnu.org/txr) uses Flex and Yacc to parse its input. If you give it the -v option, you can see the parse tree.

For example:

$ ./txr -v -c "@/[a-z]*|foo/"
spec:
(((text (#<sys:regex: 9d99268> or (0+ (set (#\a . #\z))) (compound #\f #\o #\o)))))

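You build the tree in the parser actions and print it yourself with a tree-printing routine. I use a Lisp-like object representation to make life easier. Writing it out is handled by a recursive print function that recognizes all the possible object types and renders them in the appropriate notation. For instance, above you can see character-type objects printed with the hash-backslash notation, and the unprintable, opaque compiled regex printed using the #< ... > notation.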
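The approach described above can be sketched in plain C. This is only a minimal illustration under stated assumptions, not TXR's actual code: the `obj` type, the `sym`, `chr`, `cons` and `opaque` constructors, and `print_obj` are all hypothetical names invented for this example, and freeing memory is left out for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical Lisp-like object; all names here are illustrative, not TXR's. */
typedef enum { T_SYM, T_CHR, T_CONS, T_OPAQUE } obj_type;

typedef struct obj {
  obj_type type;
  union {
    const char *sym;                         /* T_SYM: symbol name */
    char chr;                                /* T_CHR: one character */
    struct { struct obj *car, *cdr; } cons;  /* T_CONS: list cell */
    void *ptr;                               /* T_OPAQUE: unprintable handle */
  } u;
} obj;

/* Allocate an object of the given type; NULL plays the role of nil. */
static obj *new_obj(obj_type type)
{
  obj *o = malloc(sizeof *o);
  if (o == NULL) {
    perror("malloc");
    exit(EXIT_FAILURE);
  }
  o->type = type;
  return o;
}

obj *sym(const char *name)
{
  assert(name != NULL);
  obj *o = new_obj(T_SYM);
  o->u.sym = name;
  return o;
}

obj *chr(char c)
{
  obj *o = new_obj(T_CHR);
  o->u.chr = c;
  return o;
}

obj *cons(obj *car, obj *cdr)
{
  obj *o = new_obj(T_CONS);
  o->u.cons.car = car;
  o->u.cons.cdr = cdr;
  return o;
}

obj *opaque(void *ptr)
{
  obj *o = new_obj(T_OPAQUE);
  o->u.ptr = ptr;
  return o;
}

/* Recursive printer: dispatch on the object type and emit its notation. */
void print_obj(FILE *out, const obj *o)
{
  if (o == NULL) {
    fputs("nil", out);
    return;
  }
  switch (o->type) {
  case T_SYM:
    fputs(o->u.sym, out);
    break;
  case T_CHR:
    fprintf(out, "#\\%c", o->u.chr);
    break;
  case T_OPAQUE:
    fprintf(out, "#<opaque: %p>", o->u.ptr);
    break;
  case T_CONS:
    fputc('(', out);
    for (const obj *p = o; p != NULL; p = p->u.cons.cdr) {
      if (p != o)
        fputc(' ', out);
      if (p->type != T_CONS) {   /* improper tail: print as a dotted pair */
        fputs(". ", out);
        print_obj(out, p);
        break;
      }
      print_obj(out, p->u.cons.car);
    }
    fputc(')', out);
    break;
  }
}
```

Assuming the parser's semantic value type is `obj *`, a Yacc action could build a node with something like `$$ = cons(sym("or"), cons($1, cons($3, NULL)));`, and the top-level rule (or `main`, after `yyparse()` returns) could call `print_obj(stdout, tree)` on the result.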

Here is part of the TXR grammar:

regexpr : regbranch                     { $$ = if3(cdr($1), 
                                                   cons(compound_s, $1),
                                                   car($1)); }
        | regexpr '|' regexpr           { $$ = list(or_s, $1, $3, nao); }
        | regexpr '&' regexpr           { $$ = list(and_s, $1, $3, nao); }
        | '~' regexpr                   { $$ = list(compl_s, $2, nao); }
        | /* empty */ %prec LOW         { $$ = nil; }
        ;

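As you can see, building the AST is largely just a matter of constructing nested lists. This form is very convenient to compile. The top-level function of the NFA-based regex compiler is quite readable: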

/*
 * Input is the items from a regex form,
 * not including the regex symbol.
 * I.e.  (rest '(regex ...)) not '(regex ...).
 */
static nfa_t nfa_compile_regex(val exp)
{
  if (nullp(exp)) {
    nfa_state_t *acc = nfa_state_accept();
    nfa_state_t *s = nfa_state_empty(acc, 0);
    return nfa_make(s, acc);
  } else if (typeof(exp) == chr_s) {
    nfa_state_t *acc = nfa_state_accept();
    nfa_state_t *s = nfa_state_single(acc, c_chr(exp));
    return nfa_make(s, acc);
  } else if (exp == wild_s) {
    nfa_state_t *acc = nfa_state_accept();
    nfa_state_t *s = nfa_state_wild(acc);
    return nfa_make(s, acc);
  } else {
    val sym = first(exp), args = rest(exp);

    if (sym == set_s) {
      return nfa_compile_set(args, nil);
    } else if (sym == cset_s) {
      return nfa_compile_set(args, t);
    } else if (sym == compound_s) {
      return nfa_compile_list(args);
    } else if (sym == zeroplus_s) {
      nfa_t nfa_arg = nfa_compile_regex(first(args));
      nfa_state_t *acc = nfa_state_accept();
      /* New start state has empty transitions going through
         the inner NFA, or skipping it right to the new acceptance state. */
      nfa_state_t *s = nfa_state_empty(nfa_arg.start, acc);
      /* Convert acceptance state of inner NFA to one which has
         an empty transition back to the start state, and
         an empty transition to the new acceptance state. */
      nfa_state_empty_convert(nfa_arg.accept, nfa_arg.start, acc);
      return nfa_make(s, acc);
    } else if (sym == oneplus_s) {
      /* One-plus case differs from zero-plus in that the new start state
         does not have an empty transition to the acceptance state.
         So the inner NFA must be traversed once. */
      nfa_t nfa_arg = nfa_compile_regex(first(args));
      nfa_state_t *acc = nfa_state_accept();
      nfa_state_t *s = nfa_state_empty(nfa_arg.start, 0); /* <-- diff */
      nfa_state_empty_convert(nfa_arg.accept, nfa_arg.start, acc);
      return nfa_make(s, acc);
    } else if (sym == optional_s) {
      /* In this case, we can keep the acceptance state of the inner
         NFA as the acceptance state of the new NFA. We simply add
         a new start state which can short-circuit to it via an empty
         transition.  */
      nfa_t nfa_arg = nfa_compile_regex(first(args));
      nfa_state_t *s = nfa_state_empty(nfa_arg.start, nfa_arg.accept);
      return nfa_make(s, nfa_arg.accept);
    } else if (sym == or_s) {
      /* Simple: make a new start and acceptance state, which form
         the ends of a spindle that goes through two branches. */
      nfa_t nfa_first = nfa_compile_regex(first(args));
      nfa_t nfa_second = nfa_compile_regex(second(args));
      nfa_state_t *acc = nfa_state_accept();
      /* New state s has empty transitions into each inner NFA. */
      nfa_state_t *s = nfa_state_empty(nfa_first.start, nfa_second.start);
      /* Acceptance state of each inner NFA converted to empty
         transition to new combined acceptance state. */
      nfa_state_empty_convert(nfa_first.accept, acc, 0);
      nfa_state_empty_convert(nfa_second.accept, acc, 0);
      return nfa_make(s, acc);
    } else {
      internal_error("bad operator in regex");
    }
  }
}

【Comments】: