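You're in trouble because expression templates keep internal references to temporaries.
Simply aggregate the sub-parser instances as members: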
template <typename Iter=std::string::iterator>
struct value_path : grammar<Iter, boost::tuple<std::vector<std::string>, std::string>()> {
    value_path() : value_path::base_type(start)
    {
        start = -(module_path_ >> '.') >> value_name_;
    }
  private:
    rule<Iter, boost::tuple<std::vector<std::string>, std::string>()> start;
    // sub-grammars held as members, so they live as long as start does
    module_path<Iter> module_path_;
    value_name<Iter> value_name_;
};
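To make the lifetime issue concrete, here is a minimal sketch using toy types. Upper, Lower, Seq and Grammar are hypothetical names made up for this illustration and are not Boost.Spirit types; the sketch only assumes that an expression node stores references to its operands, which is the behaviour described above.

```cpp
#include <cassert>

// Hypothetical toy parsers for illustration; these are NOT Boost.Spirit types.
struct Upper {
    bool parse(const char*& p) const {
        if (*p >= 'A' && *p <= 'Z') { ++p; return true; }
        return false;
    }
};

struct Lower {
    bool parse(const char*& p) const {
        if (*p >= 'a' && *p <= 'z') { ++p; return true; }
        return false;
    }
};

// Like an expression-template node, Seq stores references to its operands
// instead of copies.
template <class L, class R>
struct Seq {
    Seq(const L& l, const R& r) : left(l), right(r) {}
    bool parse(const char*& p) const {
        const char* save = p;
        if (left.parse(p) && right.parse(p)) return true;
        p = save; // backtrack on failure
        return false;
    }
    const L& left;
    const R& right;
};

// Safe: the operands are data members, so they live as long as `start`.
struct Grammar {
    Grammar() : start(upper_, lower_) {}
    Grammar(const Grammar&) = delete; // a copy would still refer to the original's members
    bool parse(const char*& p) const { return start.parse(p); }
  private:
    Upper upper_;
    Lower lower_;
    Seq<Upper, Lower> start;
};

// Unsafe (left as a comment on purpose):
//   Seq<Upper, Lower> bad(Upper{}, Lower{});
// Both temporaries are destroyed at the end of that statement, so `bad`
// would be left holding dangling references.
```

With a Grammar g, parsing "Ab" succeeds and consumes both characters, while "ab" fails and leaves the position untouched. The point is only the ownership pattern: whatever the expression refers to has to outlive the expression.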
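Notes
I feel it might be a design smell to use separate sub-grammars for such small items. Grammar decomposition is frequently a good idea to keep build times manageable and code size somewhat lower, but, judging from the description here, you might be overdoing things.
The "plastering" of parser expressions behind a qi::rule (effectively type erasure) comes with a possibly significant runtime overhead. If you then instantiate those rules for more than one iterator type, you may compound this with unnecessary growth of the binary.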
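As a rough analogy for what "type erasure" means here, the sketch below contrasts a statically typed parser with one hidden behind std::function. This is not how qi::rule is implemented; Digit, count_static and count_erased are hypothetical names, and the sketch makes no claim about the actual size of the overhead.

```cpp
#include <cassert>
#include <functional>

// Hypothetical toy parser, not a Spirit type.
struct Digit {
    bool operator()(const char*& p) const {
        if (*p >= '0' && *p <= '9') { ++p; return true; }
        return false;
    }
};

// Statically typed: the concrete parser type is visible to the compiler,
// so the call can be inlined.
template <class Parser>
int count_static(const Parser& parser, const char* p) {
    int n = 0;
    while (parser(p)) ++n;
    return n;
}

// Type-erased: the parser hides behind a fixed signature, so every call
// goes through an indirection the optimizer usually cannot see through.
using ErasedParser = std::function<bool(const char*&)>;

int count_erased(const ErasedParser& parser, const char* p) {
    int n = 0;
    while (parser(p)) ++n;
    return n;
}
```

Both functions count the same three digits in "123a"; the difference is only in how the call is dispatched, and in the fact that the erased form is compiled once per signature rather than once per parser expression.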
UPDATE Regarding the idiomatic way to compose grammars in Spirit, here's my take:
Live On Coliru
using namespace ascii;
using qi::raw;
lowercase_ident  = raw[ (lower | '_') >> *(alnum | '_' | '\'') ];
module_path_item = raw[ upper >> *(alnum | '_' | '\'') ];
module_path_     = module_path_item % '.';
auto special_char = boost::proto::deep_copy(char_("-+!$%&*./:<=>?@^|~"));
operator_name = qi::raw [
      ('!' >> *special_char)                          /* branch 1 */
    | (char_("~?") >> +special_char)                  /* branch 2 */
    | (!char_(".:") >> special_char >> *special_char) /* branch 3 */
    | "mod"                                           /* branch 4 */
    | "lor" | "lsl" | "lsr" | "asr" | "or"            /* branch 5-9 */
    | "-."                                            /* branch 10 never matches because of branch 3 */
    | "!=" | "||" | "&&" | ":="                       /* branch 11-13 never match because of branch 1,3; ":=" still can, since branch 3 rejects a leading ':' */
 // | (special_char - char_("!$%./:?@^|~"))           /* "*+=<>&-" cannot match because of branch 3 */
    ]
    ;
value_name_ =
      lowercase_ident
    | '(' >> operator_name >> ')'
    ;
start = -(module_path_ >> '.') >> value_name_;
where the rules are fields declared as:
qi::rule<Iter, ast::value_path(), Skipper> start;
qi::rule<Iter, ast::module_path(), Skipper> module_path_;
// lexeme: (no skipper)
qi::rule<Iter, std::string()> value_name_, module_path_item, lowercase_ident, operator_name;
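Notes:
- I've added a skipper, because your value_path grammar didn't use one, so any skipper you passed to qi::phrase_parse was being ignored
- The lexemes simply drop the skipper from the rule declaration type, so you don't even need to specify qi::lexeme[]
- In the lexemes, I kept your intention of copying the parsed text verbatim using qi::raw. This lets us write the grammar more succinctly (using '!' instead of char_('!'), and "mod" instead of qi::string("mod")). Note that bare literals are implicitly transformed into "non-capturing" qi::lit(...) nodes in the context of a Qi parser expression, but since we use raw[] anyway, the fact that lit doesn't expose an attribute is not a problem.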
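The split between skipper-aware rules and lexemes in the notes above can be modelled by hand. The following is a sketch in plain C++, not Spirit code: skip_space, module_path_item and module_path are hypothetical helpers written for this illustration, under the assumption that a rule with a skipper skips blanks before each token while a lexeme never skips.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical hand-written model; none of these names are Spirit API.
inline bool is_space(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }
inline bool is_upper(char c) { return std::isupper(static_cast<unsigned char>(c)) != 0; }
inline bool is_tail(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_' || c == '\'';
}

inline void skip_space(const char*& p) { while (is_space(*p)) ++p; }

// Models a lexeme rule (declared without a skipper): upper >> *(alnum | '_' | '\'').
// Nothing is skipped inside, and the matched text is copied verbatim, like raw[].
inline bool module_path_item(const char*& p, std::string& out) {
    if (!is_upper(*p)) return false;
    const char* first = p++;
    while (is_tail(*p)) ++p;
    out.assign(first, p);
    return true;
}

// Models a rule declared with a skipper: module_path_item % '.'.
// Whitespace is skipped before every item and before every '.'.
inline int module_path(const char*& p) {
    int count = 0;
    const char* committed = p;   // end of the last complete item
    std::string item;
    for (;;) {
        skip_space(p);
        if (!module_path_item(p, item)) break;
        ++count;
        committed = p;
        skip_space(p);
        if (*p != '.') break;
        ++p;
    }
    p = committed;               // backtrack over a trailing '.' and any skipped blanks
    return count;
}
```

In this model, "Some . Module .ident" yields two path items at the phrase level, whereas the lexeme applied to "So me" stops at the blank and returns only "So".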
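I think this results in a perfectly cromulent grammar definition that should satisfy your criteria for "high-level". There is some wtf-y-ness in the grammar itself (likely regardless of how it is expressed, in any parser generator language):
- I've simplified the operator_name rule by removing the nesting of alternative branches; a flat list of alternatives has the same effect
- I've refactored the "magic" lists of special characters into special_char
- In alternative branch 3, for example, I've documented the exceptions with a negative assertion: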
(!char_(".:") >> special_char >> *special_char) /* branch 3 */
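The !char_(".:") assertion says: when the input at this point does not match '.' or ':', continue matching (any sequence of special characters). The assertion itself consumes no input. In fact, you could equivalently write this as: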
((special_char - '.' - ':') >> *special_char) /* branch 3 */
or even, as I ended up writing it:
(!char_(".:") >> +special_char) /* branch 3 */
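- The simplification of the branches actually raises the level of abstraction! It now becomes clear that some of the branches will never match, because earlier branches already match the same input by definition:
| "-." /* branch 10 never matches because of branch 3 */
| "!=" | "||" | "&&" | ":=" /* branch 11-13 never match because of branch 1,3; ":=" still can, since branch 3 rejects a leading ':' */
// | (special_char - char_("!$%./:?@^|~")) /* "*+=<>&-" cannot match because of branch 3 */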
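I hope you can see why I qualify this part of the grammar as "a little bit wtf-y" :) I'll assume for now that you got confused, or that something went wrong, when you reduced it to a single rule (your "fool's errand").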
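The two observations about operator_name (the equivalent spellings of branch 3, and the branches shadowed by earlier ones) can be checked with a small hand-written model. This is a sketch in plain C++ that only mimics ordered choice and the negative assertion; it is not Spirit code, and Match, special_run and the branch3_* functions are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hand-written model of the operator_name alternatives (not Spirit code).
inline bool is_special(char c) {
    return c != '\0' && std::strchr("-+!$%&*./:<=>?@^|~", c) != nullptr;
}

// Length of the longest prefix made of special characters (models *special_char).
inline std::size_t special_run(const char* s) {
    std::size_t n = 0;
    while (is_special(s[n])) ++n;
    return n;
}

// Three spellings of branch 3. Each returns the match length, 0 meaning "no match".
inline std::size_t branch3_verbose(const char* s) {    // !char_(".:") >> special_char >> *special_char
    if (*s == '.' || *s == ':') return 0;               // negative assertion, consumes nothing
    if (!is_special(*s)) return 0;
    return 1 + special_run(s + 1);
}
inline std::size_t branch3_difference(const char* s) { // (special_char - '.' - ':') >> *special_char
    if (!is_special(*s) || *s == '.' || *s == ':') return 0;
    return 1 + special_run(s + 1);
}
inline std::size_t branch3_plus(const char* s) {       // !char_(".:") >> +special_char
    if (*s == '.' || *s == ':') return 0;
    return special_run(s);                              // + needs at least one, so 0 is failure
}

struct Match { int branch; std::size_t length; };      // branch 0 means "no match"

// Ordered choice: the first alternative that matches wins.
inline Match operator_name(const char* s) {
    if (*s == '!') return {1, 1 + special_run(s + 1)};          // branch 1
    if ((*s == '~' || *s == '?') && is_special(s[1]))           // branch 2
        return {2, 1 + special_run(s + 1)};
    if (std::size_t n = branch3_plus(s)) return {3, n};         // branch 3
    static const char* const literals[] = {                     // branches 4-14, in grammar order
        "mod", "lor", "lsl", "lsr", "asr", "or", "-.", "!=", "||", "&&", ":="};
    int branch = 4;
    for (const char* lit : literals) {
        const std::size_t len = std::strlen(lit);
        if (std::strncmp(s, lit, len) == 0) return {branch, len};
        ++branch;
    }
    return {0, 0};
}
```

In this model the three branch3_* functions agree on every input, "-." is taken by branch 3, "!=" by branch 1, and "||" and "&&" by branch 3, so their literals are never reached. Note that ":=" is not consumed by branch 3, because the assertion rejects a leading ':', so in this model it does fall through to its own literal (branch 14).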
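Some further improvements to note:
- I've added a proper AST struct instead of the boost::tuple<> to make the code more legible
- I've added the BOOST_SPIRIT_DEBUG* macros so you can debug your grammar at a high level (the rule level)
- I've ditched the blanket using namespace. That is generally a bad idea, and with Spirit it is frequently a very bad idea (it can lead to unresolvable ambiguities, or to errors that are very hard to spot). As you can see, avoiding it doesn't necessarily lead to very verbose code.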
Full Listing
#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/adapted.hpp>
#include <iostream>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
namespace ast {
    using module_path = std::vector<std::string>;
    struct value_path {
        module_path module;
        std::string value_expr;
    };
}
BOOST_FUSION_ADAPT_STRUCT(ast::value_path, (ast::module_path, module)(std::string,value_expr))
template <typename Iter, typename Skipper = ascii::space_type>
struct value_path : qi::grammar<Iter, ast::value_path(), Skipper> {
    value_path() : value_path::base_type(start)
    {
        using namespace ascii;
        using qi::raw;
        lowercase_ident  = raw[ (lower | '_') >> *(alnum | '_' | '\'') ];
        module_path_item = raw[ upper >> *(alnum | '_' | '\'') ];
        module_path_     = module_path_item % '.';
        auto special_char = boost::proto::deep_copy(char_("-+!$%&*./:<=>?@^|~"));
        operator_name = qi::raw [
              ('!' >> *special_char)                 /* branch 1 */
            | (char_("~?") >> +special_char)         /* branch 2 */
            | (!char_(".:") >> +special_char)        /* branch 3 */
            | "mod"                                  /* branch 4 */
            | "lor" | "lsl" | "lsr" | "asr" | "or"   /* branch 5-9 */
            | "-."                                   /* branch 10 never matches because of branch 3 */
            | "!=" | "||" | "&&" | ":="              /* branch 11-13 never match because of branch 1,3; ":=" still can, since branch 3 rejects a leading ':' */
         // | (special_char - char_("!$%./:?@^|~"))  /* "*+=<>&-" cannot match because of branch 3 */
            ]
            ;
        value_name_ =
              lowercase_ident
            | '(' >> operator_name >> ')'
            ;
        start = -(module_path_ >> '.') >> value_name_;
        BOOST_SPIRIT_DEBUG_NODES((start)(module_path_)(value_name_)(module_path_item)(lowercase_ident)(operator_name))
    }
  private:
    qi::rule<Iter, ast::value_path(), Skipper> start;
    qi::rule<Iter, ast::module_path(), Skipper> module_path_;
    // lexeme: (no skipper)
    qi::rule<Iter, std::string()> value_name_, module_path_item, lowercase_ident, operator_name;
};
int main()
{
    for (std::string const input : {
            "Some.Module.Package.ident",
            "ident",
            "A.B.C_.mod",    // as lowercase_ident
            "A.B.C_.(mod)",  // as operator_name (branch 4)
            "A.B.C_.(!=)",   // as operator_name (branch 1)
            "(!)"            // as operator_name (branch 1)
        })
    {
        std::cout << "--------------------------------------------------------------\n";
        std::cout << "Parsing '" << input << "'\n";
        using It = std::string::const_iterator;
        It f(input.begin()), l(input.end());
        value_path<It> g;
        ast::value_path data;
        bool ok = qi::phrase_parse(f, l, g, ascii::space, data);
        if (ok) {
            std::cout << "Parse succeeded\n";
        } else {
            std::cout << "Parse failed\n";
        }
        if (f!=l)
            std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
    }
}
Debug Output
--------------------------------------------------------------
Parsing 'Some.Module.Package.ident'
<start>
<try>Some.Module.Package.</try>
<module_path_>
<try>Some.Module.Package.</try>
<module_path_item>
<try>Some.Module.Package.</try>
<success>.Module.Package.iden</success>
<attributes>[[S, o, m, e]]</attributes>
</module_path_item>
<module_path_item>
<try>Module.Package.ident</try>
<success>.Package.ident</success>
<attributes>[[M, o, d, u, l, e]]</attributes>
</module_path_item>
<module_path_item>
<try>Package.ident</try>
<success>.ident</success>
<attributes>[[P, a, c, k, a, g, e]]</attributes>
</module_path_item>
<module_path_item>
<try>ident</try>
<fail/>
</module_path_item>
<success>.ident</success>
<attributes>[[[S, o, m, e], [M, o, d, u, l, e], [P, a, c, k, a, g, e]]]</attributes>
</module_path_>
<value_name_>
<try>ident</try>
<lowercase_ident>
<try>ident</try>
<success></success>
<attributes>[[i, d, e, n, t]]</attributes>
</lowercase_ident>
<success></success>
<attributes>[[i, d, e, n, t]]</attributes>
</value_name_>
<success></success>
<attributes>[[[[S, o, m, e], [M, o, d, u, l, e], [P, a, c, k, a, g, e]], [i, d, e, n, t]]]</attributes>
</start>
Parse succeeded
--------------------------------------------------------------
Parsing 'ident'
<start>
<try>ident</try>
<module_path_>
<try>ident</try>
<module_path_item>
<try>ident</try>
<fail/>
</module_path_item>
<fail/>
</module_path_>
<value_name_>
<try>ident</try>
<lowercase_ident>
<try>ident</try>
<success></success>
<attributes>[[i, d, e, n, t]]</attributes>
</lowercase_ident>
<success></success>
<attributes>[[i, d, e, n, t]]</attributes>
</value_name_>
<success></success>
<attributes>[[[], [i, d, e, n, t]]]</attributes>
</start>
Parse succeeded
--------------------------------------------------------------
Parsing 'A.B.C_.mod'
<start>
<try>A.B.C_.mod</try>
<module_path_>
<try>A.B.C_.mod</try>
<module_path_item>
<try>A.B.C_.mod</try>
<success>.B.C_.mod</success>
<attributes>[[A]]</attributes>
</module_path_item>
<module_path_item>
<try>B.C_.mod</try>
<success>.C_.mod</success>
<attributes>[[B]]</attributes>
</module_path_item>
<module_path_item>
<try>C_.mod</try>
<success>.mod</success>
<attributes>[[C, _]]</attributes>
</module_path_item>
<module_path_item>
<try>mod</try>
<fail/>
</module_path_item>
<success>.mod</success>
<attributes>[[[A], [B], [C, _]]]</attributes>
</module_path_>
<value_name_>
<try>mod</try>
<lowercase_ident>
<try>mod</try>
<success></success>
<attributes>[[m, o, d]]</attributes>
</lowercase_ident>
<success></success>
<attributes>[[m, o, d]]</attributes>
</value_name_>
<success></success>
<attributes>[[[[A], [B], [C, _]], [m, o, d]]]</attributes>
</start>
Parse succeeded
--------------------------------------------------------------
Parsing 'A.B.C_.(mod)'
<start>
<try>A.B.C_.(mod)</try>
<module_path_>
<try>A.B.C_.(mod)</try>
<module_path_item>
<try>A.B.C_.(mod)</try>
<success>.B.C_.(mod)</success>
<attributes>[[A]]</attributes>
</module_path_item>
<module_path_item>
<try>B.C_.(mod)</try>
<success>.C_.(mod)</success>
<attributes>[[B]]</attributes>
</module_path_item>
<module_path_item>
<try>C_.(mod)</try>
<success>.(mod)</success>
<attributes>[[C, _]]</attributes>
</module_path_item>
<module_path_item>
<try>(mod)</try>
<fail/>
</module_path_item>
<success>.(mod)</success>
<attributes>[[[A], [B], [C, _]]]</attributes>
</module_path_>
<value_name_>
<try>(mod)</try>
<lowercase_ident>
<try>(mod)</try>
<fail/>
</lowercase_ident>
<operator_name>
<try>mod)</try>
<success>)</success>
<attributes>[[m, o, d]]</attributes>
</operator_name>
<success></success>
<attributes>[[m, o, d]]</attributes>
</value_name_>
<success></success>
<attributes>[[[[A], [B], [C, _]], [m, o, d]]]</attributes>
</start>
Parse succeeded
--------------------------------------------------------------
Parsing 'A.B.C_.(!=)'
<start>
<try>A.B.C_.(!=)</try>
<module_path_>
<try>A.B.C_.(!=)</try>
<module_path_item>
<try>A.B.C_.(!=)</try>
<success>.B.C_.(!=)</success>
<attributes>[[A]]</attributes>
</module_path_item>
<module_path_item>
<try>B.C_.(!=)</try>
<success>.C_.(!=)</success>
<attributes>[[B]]</attributes>
</module_path_item>
<module_path_item>
<try>C_.(!=)</try>
<success>.(!=)</success>
<attributes>[[C, _]]</attributes>
</module_path_item>
<module_path_item>
<try>(!=)</try>
<fail/>
</module_path_item>
<success>.(!=)</success>
<attributes>[[[A], [B], [C, _]]]</attributes>
</module_path_>
<value_name_>
<try>(!=)</try>
<lowercase_ident>
<try>(!=)</try>
<fail/>
</lowercase_ident>
<operator_name>
<try>!=)</try>
<success>)</success>
<attributes>[[!, =]]</attributes>
</operator_name>
<success></success>
<attributes>[[!, =]]</attributes>
</value_name_>
<success></success>
<attributes>[[[[A], [B], [C, _]], [!, =]]]</attributes>
</start>
Parse succeeded
--------------------------------------------------------------
Parsing '(!)'
<start>
<try>(!)</try>
<module_path_>
<try>(!)</try>
<module_path_item>
<try>(!)</try>
<fail/>
</module_path_item>
<fail/>
</module_path_>
<value_name_>
<try>(!)</try>
<lowercase_ident>
<try>(!)</try>
<fail/>
</lowercase_ident>
<operator_name>
<try>!)</try>
<success>)</success>
<attributes>[[!]]</attributes>
</operator_name>
<success></success>
<attributes>[[!]]</attributes>
</value_name_>
<success></success>
<attributes>[[[], [!]]]</attributes>
</start>
Parse succeeded