GDL Antlr grammar

【Question title】: GDL Antlr grammar
【Posted】: 2016-05-13 08:25:51
【Question】:

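I need a parser for the Game Description Language (GDL) in Java.

For this I am currently trying to use ANTLR4.

The grammar I have so far, given below, does not seem to be correct; at least, the generated parser fails to recognize the game description that I also provide below.

ANTLR4 grammar: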

grammar GDL;

description :  (gdlRule | sentence)+ ;

gdlRule : '(' SP? '<=' SP? sentence (SP literal)* SP? ')';

sentence : propLit | ( '(' relLit ')' );

literal : ( '(' SP? (orLit | notLit | distinctLit | relLit) SP? ')' ) 
| ( '('  (orLit | notLit | distinctLit | relLit) ')' ) 
| propLit;
notLit : 'not' SP literal | '~' literal;
orLit : 'or' SP literal* ;
distinctLit : 'distinct' SP term SP term;
propLit : constant;
relLit : constant (SP term)+;

term : ( '(' funcTerm ')' ) | varTerm | constTerm;
funcTerm : constant (SP term)*;
varTerm : '?' constant;
constTerm : constant;


constant : ident | number;
/* ident is any string of letters, digits, and underscores */
ident: ID;
number: NR;
NR : [0-9]+;
ID : [a-zA-Z] [a-zA-Z0-9]* ;
SP : ' '+;

COMMENT : ';'[A-Za-z0-9; \r\t]*'\n' -> skip;
WS : [ ;\t\r\n]+ -> skip
;

The game description in GDL:

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Tictictoe
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  (role white)
  (role black)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  (init (cell 1 1 b))
  (init (cell 1 2 b))
  (init (cell 1 3 b))
  (init (cell 2 1 b))
  (init (cell 2 2 b))
  (init (cell 2 3 b))
  (init (cell 3 1 b))
  (init (cell 3 2 b))
  (init (cell 3 3 b))
  (init (step 1))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  (<= (next (cell ?j ?k x))
      (true (cell ?j ?k b))
      (does white (mark ?j ?k))
      (does black (mark ?m ?n))
      (or (distinct ?j ?m) (distinct ?k ?n)))

  (<= (next (cell ?m ?n o))
      (true (cell ?m ?n b))
      (does white (mark ?j ?k))
      (does black (mark ?m ?n))
      (or (distinct ?j ?m) (distinct ?k ?n)))

  (<= (next (cell ?m ?n b))
      (true (cell ?m ?n b))
      (does white (mark ?m ?n))
      (does black (mark ?m ?n)))

  (<= (next (cell ?p ?q b))
      (true (cell ?p ?q b))
      (does white (mark ?j ?k))
      (does black (mark ?m ?n))
      (or (distinct ?j ?p) (distinct ?k ?q))
      (or (distinct ?m ?p) (distinct ?n ?q)))

  (<= (next (cell ?m ?n ?w))
      (true (cell ?m ?n ?w))
      (distinct ?w b))


  (<= (next (step ?y))
      (true (step ?x))
      (succ ?x ?y))


  (succ 1 2)
  (succ 2 3)
  (succ 3 4)
  (succ 4 5)
  (succ 5 6)
  (succ 6 7)


  (<= (row ?m ?x)
      (true (cell ?m 1 ?x))
      (true (cell ?m 2 ?x))
      (true (cell ?m 3 ?x)))

  (<= (column ?n ?x)
      (true (cell 1 ?n ?x))
      (true (cell 2 ?n ?x))
      (true (cell 3 ?n ?x)))

  (<= (diagonal ?x)
      (true (cell 1 1 ?x))
      (true (cell 2 2 ?x))
      (true (cell 3 3 ?x)))

  (<= (diagonal ?x)
      (true (cell 1 3 ?x))
      (true (cell 2 2 ?x))
      (true (cell 3 1 ?x)))

  (<= (line ?x) (row ?m ?x))
  (<= (line ?x) (column ?m ?x))
  (<= (line ?x) (diagonal ?x))


  (<= nolinex
      (not (line x)))
  (<= nolineo
      (not (line o)))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  (<= (legal white (mark ?x ?y))
      (true (cell ?x ?y b)))

  (<= (legal black (mark ?x ?y))
      (true (cell ?x ?y b)))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  (<= (goal white 50)
      (line x)
      (line o))

  (<= (goal white 100)
      (line x)
      nolineo)

  (<= (goal white 0)
      nolinex
      (line o))

  (<= (goal white 50)
      nolinex
      nolineo)

  (<= (goal black 50)
      (line x)
      (line o))

  (<= (goal black 100)
      nolinex
      (line o))

  (<= (goal black 0)
      (line x)
      nolineo)

  (<= (goal black 50)
      nolinex
      nolineo)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  (<= terminal
      (true (step 7)))

  (<= terminal
      (line x))

  (<= terminal
      (line o))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

Error output of the generated parser:

line 24:6 mismatched input '(' expecting {')', SP}
line 27:7 no viable alternative at input '(or'

I don't know what I have to change or how to arrive at a correct grammar.

Any help would be greatly appreciated.

【Question comments】:

Tags: java parsing antlr antlr4 datalog

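【Solution 1】:

(At least) 3 things are incorrect:

You include ; in your WS rule, but it is also the character that starts your COMMENT.
Your COMMENT rule says a comment must end with a line break. However, line breaks are already covered by your WS rule, and requiring one rules out a comment that ends at EOF (without a line break).
There is no need for SP: spaces should be skipped by the lexer and not appear in parser rules.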
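The second point can be checked outside ANTLR. The following is only a rough sketch that uses `java.util.regex` patterns as stand-ins for the two variants of the COMMENT rule; the class and method names are made up for illustration, and the regex engine is not the ANTLR lexer.

```java
import java.util.regex.Pattern;

class CommentAtEofSketch {
    // Stand-in for  COMMENT : ';' [A-Za-z0-9; \r\t]* '\n'  (newline required)
    static final Pattern NEEDS_NEWLINE = Pattern.compile(";[A-Za-z0-9; \\r\\t]*\\n");
    // Stand-in for  COMMENT : ';' [A-Za-z0-9; \r\t]*       (newline left to WS)
    static final Pattern NO_NEWLINE = Pattern.compile(";[A-Za-z0-9; \\r\\t]*");

    /** True if the rule matches some prefix of the input. */
    public static boolean matchesAtStart(Pattern rule, String input) {
        return rule.matcher(input).lookingAt();
    }

    public static void main(String[] args) {
        String lastLine = ";;; end of file"; // final comment, no trailing line break
        System.out.println(matchesAtStart(NEEDS_NEWLINE, lastLine)); // false
        System.out.println(matchesAtStart(NO_NEWLINE, lastLine));    // true
    }
}
```

With the newline removed from the comment pattern, the line break after a comment is simply consumed by WS.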

Try something like this:

grammar GDL;

description :  (gdlRule | sentence)+ ;

gdlRule : '(' '<=' sentence literal* ')';

sentence : propLit | ( '(' relLit ')' );

literal
 : '(' (orLit | notLit | distinctLit | relLit) ')'
 | propLit
 ;

notLit : 'not' literal | '~' literal;
orLit : 'or' literal* ;
distinctLit : 'distinct' term term;
propLit : constant;
relLit : constant (term)+;
term : ( '(' funcTerm ')' ) | varTerm | constTerm;
funcTerm : constant (term)*;
varTerm : '?' constant;
constTerm : constant;
constant : ident | number;
ident: ID;
number: NR;

NR : [0-9]+;
ID : [a-zA-Z] [a-zA-Z0-9]*;
COMMENT : ';'[A-Za-z0-9; \r\t]* -> skip;
WS : [ \t\r\n]+ -> skip;

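【Comments】:

Thanks for your help. This seems to work better than @rici's suggestion: the generated parser produces no error output, and at a quick glance the parse tree for the description looks fine.

【Solution 2】:

The problem is your handling of whitespace.

You have two rules, one of which creates a token: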

SP : ' '+;

and another which just ignores whitespace:

WS : [ ;\t\r\n]+ -> skip

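If a run of whitespace consists only of space characters, both rules match the same text and the rule listed first wins, so you get an SP token. If the run contains a newline or any other character listed in the WS rule, WS matches more text (the lexer prefers the longest match) and the entire run is ignored.

Since your grammar insists on an SP token at certain points, the ignored whitespace results in a syntax error.

I see no reason to complicate your grammar with explicit whitespace. I would get rid of SP, remove all references to it in the grammar, and let WS ignore whitespace.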
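To make the interaction between SP and WS concrete, here is a small sketch in plain Java that imitates the way an ANTLR lexer chooses a rule: the longest match wins, and on a tie the rule declared first wins. It is not ANTLR itself; the class name, the `tokenize` helper and the reduced rule set (ID, SP, WS only) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LongestMatchSketch {
    // Rules in declaration order, simplified from the question's grammar.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("ID", Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"));
        RULES.put("SP", Pattern.compile(" +"));
        RULES.put("WS", Pattern.compile("[ ;\\t\\r\\n]+")); // skipped, as with "-> skip"
    }

    /** Returns the names of the emitted tokens; WS matches are dropped. */
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestRule = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Strictly longer matches replace the current best,
                // so on a tie the earlier rule is kept.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestRule = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestRule == null) {
                throw new IllegalArgumentException("no rule matches at index " + pos);
            }
            if (!bestRule.equals("WS")) {
                tokens.add(bestRule);
            }
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a b"));     // [ID, SP, ID]
        System.out.println(tokenize("a\n  b"));  // [ID, ID]
    }
}
```

In the second call the newline and the indentation are swallowed together by WS, so no SP token is produced between the two identifiers. That matches the situation in the game description, where each rule body continues on an indented new line and the parser then reports that it expected SP.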

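I would also remove the semicolon from WS, to avoid the interaction with COMMENT. [Note 1] And I would simplify COMMENT so that it just ignores everything from the semicolon to the end of the line, rather than giving a list of valid comment characters. (What if you wanted a comma or a * in a comment?)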
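The difference between a whitelist of comment characters and "everything up to the end of the line" can be illustrated with two regular expressions. This is a sketch only: the patterns are `java.util.regex` approximations of the lexer rules, and the class name, the `consumed` helper and the sample comment text are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentRuleSketch {
    // Stand-in for  COMMENT : ';' [A-Za-z0-9; \r\t]*  (list of allowed characters)
    static final Pattern WHITELIST = Pattern.compile(";[A-Za-z0-9; \\r\\t]*");
    // Stand-in for  COMMENT : ';' ~[\r\n]*            (anything up to the line end)
    static final Pattern TO_END_OF_LINE = Pattern.compile(";[^\\r\\n]*");

    /** Returns the prefix of the input that the rule consumes, or "" if it does not match. */
    public static String consumed(Pattern rule, String input) {
        Matcher m = rule.matcher(input);
        return m.lookingAt() ? m.group() : "";
    }

    public static void main(String[] args) {
        String text = ";;; Tic-tac-toe, 3x3 board\n(role white)";
        System.out.println("[" + consumed(WHITELIST, text) + "]");      // [;;; Tic]
        System.out.println("[" + consumed(TO_END_OF_LINE, text) + "]"); // [;;; Tic-tac-toe, 3x3 board]
    }
}
```

With the whitelist, the comment token stops at the first hyphen and the rest of the line would be handed to the parser as program text. The negated set stops only at the line break.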

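Notes

[Note 1] You would see this problem if the file started with a newline and had a line of semicolons on line 2. COMMENT would then not match at the first character, but WS would. WS would go on to match (and ignore) the newline, the line of semicolons, the next newline, the semicolons at the start of the next line and the space after them, leaving Tictictoe to be scanned as an ID, which would cause a parse error.

You would also see it if any other comment were not just a line of semicolons. Those comments are currently scanned as WS, starting with the newline before the comment. That happens to be fine because the comment consists only of semicolons. But any other non-whitespace character would terminate the WS match and would then be unexpectedly parsed as program text.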

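【Comments】:

I removed all references to SP and changed COMMENT and WS as follows: COMMENT : ';' * '\n' -> skip; WS : [ \t\r\n]+ -> skip ; Now the generated parser still does not completely ignore the comments, but the actual code seems to be recognized fine (at least there are no parser errors): line 1:0 token recognition error at: ';;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\r' line 2:0 token recognition error at: ';;; '
@S.Moenig: What I meant about the comment rule was to match any character, not just semicolons. E.g. COMMENT : ';' ~[\r\n]* '\r'? '\n' -> skip ; (adapted from the Antlr4 docs)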