Whitespace skipper when using Boost.Spirit Qi and Lex
Let's consider the following code:
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/spirit/include/qi.hpp>

#include <algorithm>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

namespace lex = boost::spirit::lex;
namespace qi = boost::spirit::qi;

template<typename Lexer>
class expression_lexer
    : public lex::lexer<Lexer>
{
public:
    typedef lex::token_def<> operator_token_type;
    typedef lex::token_def<> value_token_type;
    typedef lex::token_def<> variable_token_type;
    typedef lex::token_def<lex::omit> parenthesis_token_type;
    typedef std::pair<parenthesis_token_type, parenthesis_token_type> parenthesis_token_pair_type;
    typedef lex::token_def<lex::omit> whitespace_token_type;

    expression_lexer()
        : operator_add('+'),
          operator_sub('-'),
          operator_mul("[x*]"),
          operator_div("[:/]"),
          value("\\d+(\\.\\d+)?"),
          variable("%(\\w+)"),
          parenthesis({
              std::make_pair(parenthesis_token_type('('), parenthesis_token_type(')')),
              std::make_pair(parenthesis_token_type('['), parenthesis_token_type(']'))
          }),
          whitespace("[ \\t]+")
    {
        this->self
            = operator_add
            | operator_sub
            | operator_mul
            | operator_div
            | value
            | variable
            ;

        std::for_each(parenthesis.cbegin(), parenthesis.cend(),
            [&](parenthesis_token_pair_type const& token_pair)
            {
                this->self += token_pair.first | token_pair.second;
            }
        );

        this->self("WS") = whitespace;
    }

    operator_token_type operator_add;
    operator_token_type operator_sub;
    operator_token_type operator_mul;
    operator_token_type operator_div;

    value_token_type value;
    variable_token_type variable;

    std::vector<parenthesis_token_pair_type> parenthesis;

    whitespace_token_type whitespace;
};

template<typename Iterator, typename Skipper>
class expression_grammar
    : public qi::grammar<Iterator, Skipper>
{
public:
    template<typename Tokens>
    explicit expression_grammar(Tokens const& tokens)
        : expression_grammar::base_type(start)
    {
        start %= expression >> qi::eoi;

        expression %= sum_operand >> -(sum_operator >> expression);
        sum_operator %= tokens.operator_add | tokens.operator_sub;

        sum_operand %= fac_operand >> -(fac_operator >> sum_operand);
        fac_operator %= tokens.operator_mul | tokens.operator_div;

        if(!tokens.parenthesis.empty())
            fac_operand %= parenthesised | terminal;
        else
            fac_operand %= terminal;

        terminal %= tokens.value | tokens.variable;

        if(!tokens.parenthesis.empty())
        {
            parenthesised %= tokens.parenthesis.front().first >> expression >> tokens.parenthesis.front().second;
            std::for_each(tokens.parenthesis.cbegin() + 1, tokens.parenthesis.cend(),
                [&](typename Tokens::parenthesis_token_pair_type const& token_pair)
                {
                    parenthesised %= parenthesised.copy() | (token_pair.first >> expression >> token_pair.second);
                }
            );
        }
    }

private:
    qi::rule<Iterator, Skipper> start;
    qi::rule<Iterator, Skipper> expression;
    qi::rule<Iterator, Skipper> sum_operand;
    qi::rule<Iterator, Skipper> sum_operator;
    qi::rule<Iterator, Skipper> fac_operand;
    qi::rule<Iterator, Skipper> fac_operator;
    qi::rule<Iterator, Skipper> terminal;
    qi::rule<Iterator, Skipper> parenthesised;
};

int main()
{
    typedef lex::lexertl::token<std::string::const_iterator> token_type;
    typedef expression_lexer<lex::lexertl::lexer<token_type>> expression_lexer_type;
    typedef expression_lexer_type::iterator_type expression_lexer_iterator_type;
    typedef qi::in_state_skipper<expression_lexer_type::lexer_def> skipper_type;
    typedef expression_grammar<expression_lexer_iterator_type, skipper_type> expression_grammar_type;

    expression_lexer_type lexer;
    expression_grammar_type grammar(lexer);

    while(std::cin)
    {
        std::string line;
        std::getline(std::cin, line);

        std::string::const_iterator first = line.begin();
        std::string::const_iterator const last = line.end();

        bool const result = lex::tokenize_and_phrase_parse(first, last, lexer, grammar, qi::in_state("WS")[lexer.self]);
        if(!result)
            std::cout << "Parsing failed! Remainder: >" << std::string(first, last) << "<" << std::endl;
        else
        {
            if(first != last)
                std::cout << "Parsing succeeded! Remainder: >" << std::string(first, last) << "<" << std::endl;
            else
                std::cout << "Parsing succeeded!" << std::endl;
        }
    }
}
It is a simple parser for arithmetic expressions with values and variables. It is built using expression_lexer to extract tokens and then expression_grammar to parse those tokens.

Using a lexer for such a small case might seem like overkill, and it probably is. But that is the cost of a simplified example. Also note that using a lexer makes it easy to define tokens with regular expressions, which in turn makes it easy to define them from external code (user-provided configuration in particular). With the example provided it would be no issue at all to read the token definitions from an external config file and, for example, allow the user to change variables from %name to $name.
The code seems to work fine (checked with Boost 1.61 on Visual Studio 2013). Except that I have noticed that if I provide a string like 5++5, it properly fails but reports the remainder as just 5 rather than +5, which means the offending + was "unrecoverably" consumed. Apparently a token that was produced but did not match the grammar is in no way returned to the original input. But that is not what I'm asking about; it's just a side note I realized when checking the code.

Now, the problem is with whitespace skipping. I very much dislike how it is done. I did it this way because it seems to be the approach used by many examples, including answers to questions here on Stack Overflow.
The worst part seems to be that (nowhere documented?) qi::in_state_skipper. Also, it seems I have to add the whitespace token like that (with a name) rather than like all the other ones, since using lexer.whitespace instead of "WS" doesn't seem to work.

And finally, having to "clutter" the grammar with the Skipper argument doesn't seem nice. Shouldn't I be free of it? After all, I want to build the grammar on tokens rather than on direct input, and I want whitespace to be excluded from the token stream - it is not needed there anymore!

What other options do I have for skipping whitespace? What are the advantages of doing it the way it is done now?
Source: https://stackoverflow.com/questions/39468278