Prolog 中是否可以进行自适应解析？答案

【问题标题】：Is adaptive parsing possible in Prolog?Prolog 中是否可以进行自适应解析？
【发布时间】：2016-08-17 12:03:53
【问题描述】：

我正在尝试在 Prolog 中编写一个adaptive parser：换句话说，一个可以在运行时修改自己的解析规则的解析器。

为了做到这一点，我需要在运行时生成新的谓词，但我不确定这是否可行。是否可以编写一个带有这样一个列表的谓词：

generate_dcg_rule([A," is greater than ",B]).

...然后生成一个像这样的新谓词？

expr(A," is greater than ",B) -->
    symbol(A)," is greater than ",symbol(B).

【问题讨论】：

标签： machine-learning prolog self-reference dcg codegen

【解决方案1】：

是的，这很容易实现。

Prolog 是一种非常动态的语言，您可以在运行时断言任意子句，例如使用 assertz/1。

您可以使用通常用于执行此操作的 Prolog 术语扩展机制将 DCG 规则扩展为普通 Prolog 规则。

例如，使用expand_term/2:

?- expand_term((expr(A," 大于 ",B) --> symbol(A)," 大于 ",symbol(B)), 子句)。子句 = (expr(A, [' ', i, s, ' ', g, r, e|...], B, _G242, _G253):-symbol(A, _G242, _G264), _G264=[' ', i, s, ' ', g|...], 符号(B, _G275, _G253))。

您可以使用assertz/1 断言此类子句：

?- expand_term((Head --> Body), Clause), assertz(Clause)。

请注意，我在此示例中使用 double_quotes 设置为 chars，即使用：

:- set_prolog_flag(double_quotes, chars)。

在您的源文件中。无论如何，这都是个好主意。

另请注意，我假设您已经找到一种方法将您在 generate_dcg_rule/1 示例中提供的列表转换为实际的 DCG。对于这种翻译，我更推荐使用像 list_dcg/2 这样的谓词，它以声明方式描述此类列表和 DCG 规则之间的关系。优点很明显：例如，您可以以交互方式和测试用例等方式测试此类关系。对于您的具体示例，定义此关系的一个子句可能类似于：

list_dcg([A,Ls,B], DCG) :- ls = "大于", DCG = (expr(A, Ls, B) --> 符号(A), Ls, 符号(B))。

我将把它概括为您的其他用例作为练习。总的来说，动态断言此类子句的一种方法是：

?- list_dcg(List, DCG), expand_term(DCG, Clause), assertz(Clause)。

请注意，在这些示例中，我们如何从 Prolog 的同音符号性质中受益：Prolog 规则和 DCG 规则具有作为 Prolog 术语的自然表示，并且我们可以简单地把它们写下来，并像在目标中的任何其他术语一样对它们进行推理。

【讨论】：

set_prolog_flag/2 既是指令又是 内置谓词，为什么要使用 initialization/1 指令，这会延迟设置标记到在包含它的源文件被编译和加载？
非常好，谢谢！我已将其更改为立即使用该指令。

【解决方案2】：

我编写了一个自适应解析器，可以将英语短语转换为数学表达式。您可以使用自己的翻译规则轻松地对其进行扩展，因此可用于创建可扩展的自然语言用户界面。

这是一种可能的输入：

(A squared) times (b squared) equals c to the power of 3 times the product of 5 and 6

这是它的输出：

(A * A) * (b * b) = c ^ 3 * 5 * 6

程序的实现如下所示：

:- initialization(main).
:- set_prolog_flag(double_quotes, chars).  % This is for SWI 7+ to revert to the prior interpretation of quoted strings.

%This is an adaptive parser for SWI-Prolog.

main :-
    %Type any kind of input here to see the output! The input must be compatible with the grammar that is defined below.        
    Input = "(A squared) times (b squared) equals c to the power of 3 times the product of 5 and 6",

    iterated_translate(Input,Output), writeln(Input), writeln(Output), writeln('\n'), writeln('\n').

%The grammar is defined here. The variables must be uppercase letters.
%The input in each translation rule is followed by its output.
theList(TheList) :-
            %You can easily extend this parser by adding more rules to this list.
            TheList = 
            [['A to the power of B',
                 'A ^ B'],
            %The next transformation is the final output of 'A to the power of B'.
            ['A ^ B',
                'A ^ B'],
            ['A * B',
                'A * B'],
            ['the product of A and B',
                'A times B'],
            ['A squared',
                 'the product of A and A'],
            ['A times B',
                'A * B'],
            ['A = B',
                'A = B'],
            ['A equals B', 'A = B']].
%This is the end of the grammar. The rest of the translator is implemented below.

output_expr(Lang,[Input,[A,B]]) -->
        {
        theList(TheList),
        list_to_output__(TheList,TheList1,[A,B]),
        member([Input,Output],
            TheList1)
        },
        input_to_output_(Lang,Input,Output).

atom_is_upper(N) :-
    atom_chars(N, [L]),
    char_type(L, upper).

atom_is_var(Names_to_vars,Atom,Var) :-
    atom(Atom),atom_is_upper(Atom),member([Atom,Var],Names_to_vars).

list_to_output__([],[],_) :- true.
list_to_output__([Start1|Rest1],[Start2|Rest2],Vars) :-
    list_to_output_(Start1,Start2,Vars),list_to_output__(Rest1,Rest2,Vars).

list_to_output_([A1_,B1_],[A2,B2],Vars) :- atomic_list_concat(A1,' ',A1_),atomic_list_concat(B1,' ',B1_),list_to_output(A1,A2,Vars),list_to_output(B1,B2,Vars).

list_to_output([],[],_) :- true.
list_to_output([Start1|Rest1],[Start2|Rest2],[A,B]) :-
    (Start1='A'->Start2=A;Start1='B'-> Start2=B;Start1=Start2),list_to_output(Rest1,Rest2,[A,B]).

list_to_grammar_(Lang,Start,Rest) -->
    {(Start = [A])->(Rest = []->Start1 = expr(Lang,A);Start1 = parentheses_expr(Lang,A));atom_chars(Start,Start1)},Start1.

list_to_grammar(Lang,[Start]) -->
    list_to_grammar_(Lang,Start,[]).

list_to_grammar(Lang,[Start|Rest]) -->
    list_to_grammar_(Lang,Start,Rest),ws_,list_to_grammar(Lang,Rest).

a_number([A,B]) -->
    (a__number(A), ".", a__number(B)).

a_number(A) -->
    a__number(A).

a__number([L|Ls]) --> digit(L), a__number_r(Ls).
a__number_r([L|Ls]) --> digit(L), a__number_r(Ls).
a__number_r([])     --> [].
digit(Let)     --> [Let], { code_type(Let, digit) }.

symbol([L|Ls]) --> letter(L), symbol_r(Ls).
symbol_r([L|Ls]) --> letter(L), symbol_r(Ls).
symbol_r([])     --> [].
letter(Let)     --> [Let], { code_type(Let, alpha) }.

ws --> "";((" ";"\n"),ws).
ws_ --> (" ";"\n"),ws.

input_to_output(Lang,A,B) -->
    {Lang = input} ->
        A;
    {Lang=output} ->
        B.

input_to_output_(Lang,A,B) -->
    {A_=list_to_grammar(Lang,A),B_=list_to_grammar(Lang,B)},input_to_output(Lang,A_,B_).

parentheses_expr(Lang,["(",A,")"]) -->
    ("(",(expr(Lang,A)),")").

parentheses_expr(_,symbol(A)) -->
    symbol(A).

parentheses_expr(_,a_number(A)) -->
    a_number(A).

expr(Lang,A) -->
    parentheses_expr(Lang,A);output_expr(Lang,A).

translate(Input1,Output1) :-
    phrase(output_expr(input,Ls),Input1),
    phrase(output_expr(output,Ls),Output1).

iterated_translate(Input1, Output2) :-
    %Keep translating until the input is the same as the output.
    translate(Input1,Output1),
    (Input1=Output1, Output1 = Output2;iterated_translate(Output1,Output2)).

基于这个例子，我写了另一个带有自然语言用户界面的adaptive grammar system。

【讨论】：

哦，是的，这很好！在input_to_output//3 中，您可以使用input_to_output(input, A, _) --> A. 和input_to_output(output, _, B) --> B. 之类的两个子句将统一拉入DCG 头部优点：更通用（如果第一个参数未实例化也可以使用）！在list_to_output/3 中，您可以简单地使用事实而不是Head :- true.，即简单地写Head.。此外，如果您将参数重新排序以将Vars 放在首位，您似乎可以从maplist/3 中受益。然后你可以说maplist(list_to_output(Vars), Ls0, Ls) 并缩短代码。感谢您分享这个！
@mat 我一直在努力使这个解析器也与左递归语法兼容。这在 ISO Prolog 中是否可行？
是的，这当然是可能的。一种方法是使用一种称为左分解的技术。以这种方式重写现有语法是一个很好的练习。另一种方法是传递一个最多可以消耗的令牌列表，并让每个 DCG 规则提前说明它自己最终将消耗的令牌数量。使用简单的定点算法，您可以将这些数字拉到每个规则的“前面”，并提前检查是否有足够的令牌使规则适用。你肯定走在正确的轨道上，在 Prolog 中完成这一切真是太好了！
@mat 另外，an Earley parser in Prolog 的实现可以用于此目的。