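Traverse Context Free Grammar: Answers

[Question title]: Traverse Context Free Grammar
[Posted]: 2014-10-25 14:54:17
[Question description]:

I am having trouble with a CFG used in a Prolog environment: I want it to handle input in post-order (postfix) form. This is the grammar being used: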

list_ast(Ls, AST) :- phrase(expression(AST), Ls).

expression(E)       --> term(T), expression_r(T, E).

expression_r(E0, E) --> [+], term(T), expression_r(E0+T, E).
expression_r(E0, E) --> [-], term(T), expression_r(E0-T, E).
expression_r(E, E)  --> [].

term(T)       --> power(P), term_r(P, T).
term_r(T0, T) --> [*], power(P), term_r(T0*P, T).
term_r(T0, T) --> [/], power(P), term_r(T0/P, T).
term_r(T, T)  --> [].

power(P)          --> factor(F), power_r(F, P).
power_r(P0, P0^P) --> [^], factor(P1), power_r(P1, P).
power_r(P, P)     --> [].

factor(N) --> [N], { number(N) }.
factor(E) --> ['('], expression(E), [')'].

Running this grammar produces the following output:

?- list_ast([2,+,4,*,3], X).
X = 2+4*3 .

How can I change the grammar so that it works in post-order, i.e. so that it accepts a query such as ?- list_ast([2,4,3,*,+], X).

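[Comments]:

This may be easier than you expect. In postfix notation you do not have to deal with operator precedence, so you do not have to deal with parentheses either. You need a stack from which you pop the top elements whenever you encounter an operator. Even if you cannot get it to work, you should at least attempt a solution yourself and add it to the question. See stackoverflow.com/questions/15946926/… for a solution to a similar problem.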
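The stack idea from the comment above can be sketched as a DCG that builds the same kind of AST terms as the grammar in the question. This is only an illustration, not code from the thread: postfix_ast/2, postfix//2 and binary_op/1 are hypothetical names, and the operator set is assumed to be the five binary operators that the question's grammar uses.

```prolog
% Hypothetical sketch: parse postfix tokens into an AST with an explicit stack.
postfix_ast(Tokens, AST) :-
    phrase(postfix([], AST), Tokens).

% postfix(+Stack, -AST)//
% Stack holds the sub-terms built so far, most recent first.
postfix([AST], AST) --> [].              % input consumed, one term left
postfix(Stack, AST) -->                  % operand: push it
    [N], { number(N) },
    postfix([N|Stack], AST).
postfix([Right, Left|Stack], AST) -->    % operator: pop two, push the new term
    [Op], { binary_op(Op), Expr =.. [Op, Left, Right] },
    postfix([Expr|Stack], AST).

% Assumption: only these binary operators are allowed.
binary_op(Op) :- memberchk(Op, [+, -, *, /, ^]).
```

Because the operand pushed last is the right-hand one, the top of the stack becomes the right argument, so 1 2 - yields 1-2 rather than 2-1.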

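Tags: prolog dcg postfix-notation

[Solution 1]:

I should probably let you work this out yourself, but I remember struggling with these things, so I figure this might help others.

Edit: see Wouter's comments about my solution. It does not work for non-commutative operations such as subtraction and division.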

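First, I wanted to translate from postfix to infix, because that seemed more interesting to me. Then I can simply ask Prolog to evaluate the result, and I do not have to build a stack explicitly to do the evaluation. I think this is one of the marvels of Prolog: you can manipulate arithmetic expressions like this without them collapsing into values.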
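As a small illustration of that last point, an arithmetic expression in Prolog is an ordinary term that can be taken apart, and it is only evaluated when passed to is/2. The helper name inspect_and_eval/4 is made up for this sketch.

```prolog
% Hypothetical helper: inspect an arithmetic term, then evaluate it.
inspect_and_eval(Term, Functor, Args, Value) :-
    Term =.. [Functor|Args],   % decompose the term; nothing is evaluated here
    Value is Term.             % evaluation happens only at this point
```

For the term 2+4*3 this gives the functor +, the arguments 2 and 4*3, and the value 14.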

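In fact, because we just want to parse from right to left, I am going to flip the input into a Polish notation list and parse that, using the sequence itself as the stack.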

postfix_to_infix(Postfix, Infix) :-
    reverse(Postfix, Prefix),
    prefix_to_infix(Prefix, Infix).

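Converting from prefix to infix is not that bad. The trick is to thread the list being consumed through the calls, so we need another argument. Note that this kind of threading is exactly what DCGs do, so whenever I notice myself doing a lot of it I think "gee, I could probably do this more easily with a DCG." On the other hand, only one of the two helper predicates does this threading explicitly, so it might not help much. An exercise for the reader, I suppose.

We are using univ (=..) to build the Prolog term on the way out. Evaluation happens later.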

prefix_to_infix(Seq, Expr) :-
    prefix_to_infix(Seq, Expr, []).

% base case: peel off a number
prefix_to_infix([Val|Xs], Val, Xs) :- number(Val).

% inductive case
prefix_to_infix([Op|Rest], Expr, Remainder) :-
    atom(Op),
    % threading Rest -> Rem1 -> Remainder
    prefix_to_infix(Rest, Left, Rem1),
    prefix_to_infix(Rem1, Right, Remainder),
    Expr =.. [Op, Left, Right].
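The threading done by hand above is what DCG notation automates. A minimal sketch of the same parser as a DCG, assuming hypothetical names prefix_to_infix_dcg/2 and prefix_expr//1; it is meant to behave like prefix_to_infix/2, including the operand order.

```prolog
% Hypothetical DCG variant of prefix_to_infix/2.
prefix_to_infix_dcg(Seq, Expr) :-
    phrase(prefix_expr(Expr), Seq).

% base case: a number is an expression by itself
prefix_expr(Val) --> [Val], { number(Val) }.

% inductive case: an operator followed by two sub-expressions;
% the DCG translation threads the remaining input between the calls
prefix_expr(Expr) -->
    [Op], { atom(Op) },
    prefix_expr(Left),
    prefix_expr(Right),
    { Expr =.. [Op, Left, Right] }.
```

phrase/2 requires the whole list to be consumed, which plays the role of the [] passed as the final remainder in the hand-written version.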

Let's see it in action, along with evaluation:

?- postfix_to_infix([2,4,3,7,*,+,*],Term), Res is Term.
Term = (7*3+4)*2,
Res = 50 ;
false.

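Let's permute the list by moving operators and literals around in a meaning-preserving way, to make sure the parse is not doing anything completely stupid.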

?- postfix_to_infix([2,3,7,*,4,+,*],Term), Res is Term.
Term = (4+7*3)*2,
Res = 50 ;
false.

?- postfix_to_infix([3,7,*,4,+,2,*],Term), Res is Term.
Term = 2* (4+7*3),
Res = 50 ;
false.

Now let's make sure it fails properly when we have too many or too few elements.

?- postfix_to_infix([3,7,*,4,+,2,*,3],Term), Res is Term.
false.

?- postfix_to_infix([3,7,*,4,+,2,*,+],Term), Res is Term.
false.

?- postfix_to_infix([7,*], Term), Res is Term.
false.

Looks like it works to me.

Hope this helps!

[Comments]:

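[Solution 2]:

My implementation differs from Daniel's in the following ways:

It works for non-commutative operators such as - and //. I believe that postfix notation is not exactly the reverse of Polish notation. For example, - 1 2 in Polish notation corresponds to 1 2 - in postfix notation, not to 2 1 -.
It works for unary operators. Unfortunately, operators with arity > 2 do not occur in Prolog, but the implementation could handle those as well.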
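To illustrate the operand-order point: in a reversed postfix list, the first sub-expression after an operator is its right operand, so a reverse-and-parse approach can stay correct for - and / if it reads the right operand first. A minimal sketch with hypothetical names (postfix_to_infix_ordered/2, reversed_postfix/3) that appear in neither answer; it handles binary operators only.

```prolog
:- use_module(library(lists)).   % reverse/2

% Hypothetical variant that keeps operand order for non-commutative operators.
postfix_to_infix_ordered(Postfix, Infix) :-
    reverse(Postfix, Reversed),
    reversed_postfix(Reversed, Infix, []).

% base case: peel off a number
reversed_postfix([Val|Xs], Val, Xs) :- number(Val).

% inductive case: after reversal the right operand comes first
reversed_postfix([Op|Rest], Expr, Remainder) :-
    atom(Op),
    reversed_postfix(Rest, Right, Rem1),
    reversed_postfix(Rem1, Left, Remainder),
    Expr =.. [Op, Left, Right].
```

With this ordering, [1,2,-] becomes 1-2, and [2,4,3,*,+] becomes 2+4*3.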

Code

RPN stands for Reverse Polish Notation, also known as postfix notation.

%! rpn(+Notation:list(atomic), -Outcome:number) is nondet.

rpn(Notation, Outcome):-
  rpn(Notation, [], Outcome).

rpn([], [Outcome], Outcome):-
  number(Outcome).
% Push operands onto the stack.
rpn([Operand|Notation], Stack, Outcome):-
  number(Operand), !,
  rpn(Notation, [Operand|Stack], Outcome).
% Evaluate n-ary operators w.r.t. the top n operands on the stack.
rpn([Op|Notation], Stack, Outcome):-
  % Notice that there can be multiple operators with the same name.
  current_op(_, OpType, Op),
  op_type_arity(OpType, OpArity),

  % Select the appropriate operands.
  length(OperandsRev, OpArity),
  append(OperandsRev, NewStack, Stack),

  % Apply the operator to its operands.
  reverse(OperandsRev, Operands),
  Expression =.. [Op|Operands],
  Result is Expression,

  rpn(Notation, [Result|NewStack], Outcome).

op_type_arity(fx,  1).
op_type_arity(fy,  1).
op_type_arity(xf,  1).
op_type_arity(xfx, 2).
op_type_arity(xfy, 2).
op_type_arity(yf,  1).
op_type_arity(yfx, 2).

Example of use

?- rpn([5,1,2,+,4,*,+,3,-], X).
X = 14.

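Afterthoughts

I particularly like Daniel's use of is/2 to evaluate the result, so that his main task becomes the conversion to infix notation. My implementation also uses is/2, but in addition it relies on the current operator declarations (i.e. those made with op/3), which it looks up through current_op/3.

Because Prolog defines several operators with the same name, my approach can give ambiguous results: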

?- rpn([1,2,+,-], X).
X = -1 ;
X = -3 ;
false.

Another example, on which Daniel's algorithm fails:

?- rpn([3,7,*,4,+,2,*,+], X).
X = 29 ;
X = 50 ;
false.

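This ambiguity is probably not allowed in "official" postfix notation (although I like it). It can easily be removed by using only the largest arity that occurs for a given operator name.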
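One way that restriction could look, as a sketch rather than part of the original answer: op_max_arity/2 and rpn_strict/2,3 are hypothetical names, max_list/2 is assumed to come from SWI-Prolog's library(lists), and the op_type_arity/2 facts are repeated so that the block is self-contained.

```prolog
:- use_module(library(lists)).   % append/3, reverse/2, max_list/2 (SWI-Prolog)

op_type_arity(fx,  1).
op_type_arity(fy,  1).
op_type_arity(xf,  1).
op_type_arity(yf,  1).
op_type_arity(xfx, 2).
op_type_arity(xfy, 2).
op_type_arity(yfx, 2).

% Largest arity declared for the operator name Op; fails if Op is no operator.
op_max_arity(Op, Arity) :-
    atom(Op),
    findall(A, ( current_op(_, Type, Op), op_type_arity(Type, A) ), As),
    max_list(As, Arity).

rpn_strict(Notation, Outcome) :-
    rpn_strict(Notation, [], Outcome).

rpn_strict([], [Outcome], Outcome).
rpn_strict([X|Xs], Stack, Outcome) :-
    (   number(X)
    ->  rpn_strict(Xs, [X|Stack], Outcome)        % push operands
    ;   op_max_arity(X, Arity),
        length(ArgsRev, Arity),
        append(ArgsRev, Rest, Stack),             % pop Arity operands
        reverse(ArgsRev, Args),
        Expr =.. [X|Args],
        Result is Expr,
        rpn_strict(Xs, [Result|Rest], Outcome)
    ).
```

With this restriction + and - are always treated as binary, so [1,2,+,-] fails instead of producing two answers; the price is that unary minus is no longer available.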

[Comments]:

Wow, thanks for the catch! I had completely forgotten about current_op/3; that is so cool.