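I explored this with a small PLY program, and I think your problem comes from the difference between how raw and non-raw strings are handled in your data, not from PLY's parsing and lexical matching as such. (As an aside, Python 2 and Python 3 differ subtly in this area of string handling. I have restricted my code to Python 2.)
You only get the error you are seeing if you use a non-raw string, or if you use input instead of raw_input. My sample code and the results below show this: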
Commands:
$ python --version
Python 2.7.5
$ python string.py
# string.py
import sys
if ".." not in sys.path: sys.path.insert(0,"..")
import ply.lex as lex

tokens = (
    'NORMSTRING',
    'VAR'
)

def t_NORMSTRING(t):
    r'"([^"\n]|(\\"))*"$'
    # Print the match; nothing is returned, so PLY discards the token.
    print "String: '%s'" % t.value

def t_VAR(t):
    r'[a-zA-Z_][a-zA-Z_0-9]*'
    # No return value, so identifiers are matched and silently dropped.

t_ignore = ' \t\r\n'

def t_error(t):
    print "Illegal character '%s'" % t.value[0]
    t.lexer.skip(1)

lexer = lex.lex()

# Raw string literal: the backslashes reach the lexer unchanged.
data = r'"I do not know what \"A\" is"'
print "Data: '%s'" % data
lexer.input(data)
while True:
    tok = lexer.token()
    if not tok: break
    print tok
Output:
Data: '"I do not know what \"A\" is"'
String: '"I do not know what \"A\" is"'
# Non-raw string literal: Python turns \" into " before the lexer sees it.
data = '"I do not know what \"A\" is"'
print "Data: '%s'" % data
lexer.input(data)
while True:
    tok = lexer.token()
    if not tok: break
    print tok
Output:
Data: '"I do not know what "A" is"'
Illegal character '"'
Illegal character '"'
String: '" is"'
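The difference between the two runs above comes down to what each literal contains once Python has parsed it. Here is a minimal sketch of that, written so that it also runs under Python 3; the variable names raw and plain are only for illustration and are not part of the lexer code:

```python
# Same characters in the source file, different string objects at run time.
raw = r'"I do not know what \"A\" is"'    # backslashes are kept
plain = '"I do not know what \"A\" is"'   # \" collapses to a bare "

print(repr(raw))
print(repr(plain))

# The raw version is exactly two characters longer: the two backslashes.
print(len(raw) - len(plain))

# Only the raw version still contains a backslash for the lexer to match.
print('\\' in raw)
print('\\' in plain)
```

So in the non-raw case the lexer is handed three bare double quotes in a row of words, and the escaped-quote branch of the pattern never has anything to match.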
# raw_input returns the typed line exactly as entered.
lexer.input(raw_input("Please type your line: "))
while True:
    tok = lexer.token()
    if not tok: break
    print tok
Output:
Please type your line: "I do not know what \"A\" is"
String: '"I do not know what \"A\" is"'
# In Python 2, input evaluates the typed line as a Python expression.
lexer.input(input("Please type your line: "))
while True:
    tok = lexer.token()
    if not tok: break
    print tok
Output:
Please type your line: "I do not know what \"A\" is"
Illegal character '"'
Illegal character '"'
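The reason input behaves this way is that in Python 2 it is equivalent to eval(raw_input(prompt)), so the typed text is parsed as a string literal before PLY ever sees it. The sketch below imitates that without needing a prompt; it uses ast.literal_eval as a safer stand-in for eval, and the names typed, as_raw_input and as_input are my own, chosen for illustration. It runs under Python 2 and Python 3:

```python
import ast

# The characters the user types at the prompt, backslashes included.
typed = r'"I do not know what \"A\" is"'

# raw_input hands those characters over unchanged.
as_raw_input = typed

# Python 2's input evaluates them; literal_eval stands in for that step.
as_input = ast.literal_eval(typed)

print(repr(as_raw_input))
print(repr(as_input))
```

After evaluation the outer quotes and the backslashes are both gone, which leaves two unmatched double quotes around A and is consistent with the two "Illegal character" lines above.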
One final point: you may not need the string anchor $ in your regular expression.
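Whether the anchor can go depends on how the escape branch is written. This is a rough sketch using Python's re module directly rather than PLY, so treat it as an approximation of what the lexer does. The first two patterns are the one from the code above, with and without $; the third, named tolerant here, is a variant of my own and not from the original code:

```python
import re

text = r'"I do not know what \"A\" is"'

# Pattern from the lexer rule, with and without the trailing anchor.
anchored = re.compile(r'"([^"\n]|(\\"))*"$')
unanchored = re.compile(r'"([^"\n]|(\\"))*"')

# Hypothetical variant: keep the backslash out of the character class
# and consume a backslash together with the character that follows it.
tolerant = re.compile(r'"([^"\\\n]|\\.)*"')

for name, pattern in (("anchored", anchored),
                      ("unanchored", unanchored),
                      ("tolerant", tolerant)):
    print(name + ": " + repr(pattern.match(text).group(0)))
```

With the original character class, a backslash is accepted as an ordinary character, so once the anchor is removed the match can end at the first escaped quote. The variant does not depend on the anchor to get the whole string.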