static bool _tokenize_identifier(GSDLTokenizer *self, GSDLToken *result, gunichar c, GError **err) {
	int length = 7;
	char *output = result->val = g_malloc(length);
	GUnicodeType type;

	int i = g_unichar_to_utf8(c, output);

	while (_peek(self, &c, err) && (c == '-' || c == '.' || g_unichar_isalpha(c) || g_unichar_isdigit(c) ||
			(type = g_unichar_type(c)) == G_UNICODE_CURRENCY_SYMBOL ||
			type == G_UNICODE_CONNECT_PUNCTUATION ||
			type == G_UNICODE_LETTER_NUMBER ||
			type == G_UNICODE_SPACING_MARK ||
			type == G_UNICODE_NON_SPACING_MARK)) {
		GROW_IF_NEEDED(output = result->val, i + 5, length);

		_consume(self);
		i += g_unichar_to_utf8(c, output + i);
	}
	FAIL_IF_ERR();
	output[i] = '\0';

	// Keywords are reclassified; any other spelling keeps the token type set by the caller.
	if (
			strcmp(output, "true") == 0 ||
			strcmp(output, "on") == 0 ||
			strcmp(output, "false") == 0 ||
			strcmp(output, "off") == 0) {
		result->type = T_BOOLEAN;
	} else if (strcmp(output, "null") == 0) {
		result->type = T_NULL;
	}

	return true;
}
inline void new_double(double d) { _check_pre(); m_helpers_stack.back()->new_double(d); _consume(); _check_post(); }
inline void new_null() { _check_pre(); m_helpers_stack.back()->new_null(); _consume(); _check_post(); }
inline void new_bool(bool b) { _check_pre(); m_helpers_stack.back()->new_bool(b); _consume(); _check_post(); }
static bool _tokenize_string(GSDLTokenizer *self, GSDLToken *result, GError **err) {
	int length = 7;
	gunichar c;
	char *output = result->val = g_malloc(length);
	int i = 0;

	while (_peek(self, &c, err) && c != '"' && c != EOF) {
		GROW_IF_NEEDED(output = result->val, i, length);

		_consume(self);

		if (c == '\\') {
			_read(self, &c, err);

			switch (c) {
				case 'n': output[i++] = '\n'; break;
				case 'r': output[i++] = '\r'; break;
				case 't': output[i++] = '\t'; break;
				case '"': output[i++] = '"'; break;
				// An escaped single quote yields a single quote, not a double quote.
				case '\'': output[i++] = '\''; break;
				case '\\': output[i++] = '\\'; break;
				case '\r':
					// Backslash before CRLF: skip the LF, then treat it as an escaped newline.
					_read(self, &c, err);
					/* fall through */
				case '\n':
					output[i++] = '\n';
					// Leading whitespace on the continuation line is dropped.
					while (_peek(self, &c, err) && (c == ' ' || c == '\t')) _consume(self);
					break;
				default:
					i += g_unichar_to_utf8(c, output + i);
			}
		} else {
			i += g_unichar_to_utf8(c, output + i);
		}
	}
	FAIL_IF_ERR();
	output[i] = '\0';

	return true;
}
/*
 * Split buffer into NULL-separated words in argv.
 * Returns number of words.
 */
int dm_split_words(char *buffer, unsigned max,
		   unsigned ignore_comments __attribute__((unused)),
		   char **argv)
{
	unsigned arg;

	for (arg = 0; arg < max; arg++) {
		buffer = _consume(buffer, isspace);
		if (!*buffer)
			break;

		argv[arg] = buffer;
		buffer = _consume(buffer, _isword);

		if (*buffer) {
			*buffer = '\0';
			buffer++;
		}
	}

	return arg;
}
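The splitting loop in dm_split_words() can be tried in isolation. The block below is a minimal sketch, not the library's code: `skip_while()` and `is_word_char()` are hypothetical stand-ins for the `_consume()` and `_isword()` helpers, whose definitions are not part of this listing, and `split_words()` drops the unused `ignore_comments` parameter.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for _isword(): any non-NUL, non-space byte. */
static int is_word_char(int c)
{
	return c != '\0' && !isspace(c);
}

/* Hypothetical stand-in for _consume(): skip bytes accepted by fn. */
static char *skip_while(char *buffer, int (*fn)(int))
{
	while (*buffer && fn((unsigned char) *buffer))
		buffer++;
	return buffer;
}

/* Same loop as dm_split_words(): words are terminated in place with NULs. */
static unsigned split_words(char *buffer, unsigned max, char **argv)
{
	unsigned arg;

	for (arg = 0; arg < max; arg++) {
		buffer = skip_while(buffer, isspace);
		if (!*buffer)
			break;

		argv[arg] = buffer;
		buffer = skip_while(buffer, is_word_char);

		if (*buffer) {
			*buffer = '\0';
			buffer++;
		}
	}

	return arg;
}

/* Small self-check; the input must be writable because it is modified. */
void split_words_demo(void)
{
	char line[] = "  linear 0  1024\tfoo ";
	char *argv[4];
	unsigned n = split_words(line, 4, argv);

	assert(n == 4);
	assert(strcmp(argv[0], "linear") == 0);
	assert(strcmp(argv[3], "foo") == 0);
}
```

Because the words are cut in place, the caller has to pass a writable buffer (an array, not a string literal), and every `argv` entry points into that buffer rather than at a copy.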
static bool _tokenize_binary(GSDLTokenizer *self, GSDLToken *result, GError **err) {
	int length = 7;
	gunichar c;
	char *output = result->val = g_malloc(length);
	int i = 0;

	while (_peek(self, &c, err) && c != ']' && c != EOF) {
		_consume(self);

		// Keep only base64 characters; everything else (whitespace, newlines) is skipped.
		// <ctype.h> functions need an unsigned char value, so do not cast through plain char.
		if (c < 256 && (isalpha((unsigned char) c) || isdigit((unsigned char) c) || strchr("+/=", (char) c))) {
			GROW_IF_NEEDED(output = result->val, i, length);
			output[i++] = (char) c;
		}
	}
	FAIL_IF_ERR();
	output[i] = '\0';

	return true;
}
static bool _tokenize_backquote_string(GSDLTokenizer *self, GSDLToken *result, GError **err) {
	int length = 7;
	gunichar c;
	char *output = result->val = g_malloc(length);
	int i = 0;

	while (_peek(self, &c, err) && c != '`' && c != EOF) {
		GROW_IF_NEEDED(output = result->val, i, length);

		_consume(self);

		// A CR is replaced by the character that follows it (normally the LF of a CRLF pair).
		if (c == '\r') _read(self, &c, err);

		i += g_unichar_to_utf8(c, output + i);
	}
	FAIL_IF_ERR();
	output[i] = '\0';

	return true;
}
inline void new_string(std::string const& p) {
	namespace w = gsim::json::writer;

	std::string val(p);
	gsim::json::parser::decode(val);

	_check_pre();
	if (state_stack.back().s != w::IN_OBJECT_AWAIT_ID) {
		// Plain string value.
		m_helpers_stack.back()->new_string(val);
		_consume();
	} else {
		// The string is an object member name: descend into the matching child helper.
		helper_ptr parent = m_helpers_stack.back();
		m_helpers_stack.push_back(helper_ptr(parent->new_child(val)));
	}
	_check_post();
}
/**
 * gsdl_tokenizer_next:
 * @self: A valid %GSDLTokenizer.
 * @result: (out callee-allocates): A %GSDLToken to initialize and fill in.
 * @err: (out) (allow-none): Location to store any error, may be %NULL.
 *
 * Fetches the next token from the input. Depending on the source of input, may set an error in one
 * of the %GSDL_SYNTAX_ERROR, %G_IO_CHANNEL_ERROR, or %G_CONVERT_ERROR domains.
 *
 * Returns: Whether a token could be successfully read.
 */
bool gsdl_tokenizer_next(GSDLTokenizer *self, GSDLToken **result, GError **err) {
	gunichar c, nc;
	int line;
	int col;

	retry:
	line = self->line;
	col = self->col;
	if (!_read(self, &c, err)) return false;

	if (G_UNLIKELY(c == EOF)) {
		*result = _maketoken(T_EOF, line, col);
		return true;

	} else if (c == '\r') {
		// CR and CRLF are both reported as a single '\n' token.
		if (_peek(self, &c, err) && c == '\n') _consume(self);

		*result = _maketoken('\n', line, col);
		FAIL_IF_ERR();

		return true;

	} else if ((c == '/' && _peek(self, &nc, err) && nc == '/') || (c == '-' && _peek(self, &nc, err) && nc == '-') || c == '#') {
		// Line comment ("//", "--" or "#"): skip up to, but not including, the newline.
		if (c != '#') _consume(self);
		while (_peek(self, &c, err) && !(c == '\n' || c == EOF)) _consume(self);

		goto retry;

	} else if (c == '/' && _peek(self, &nc, err) && nc == '*') {
		// Block comment.
		while (_read(self, &c, err)) {
			if (c == EOF) {
				_set_error(err,
					self,
					GSDL_SYNTAX_ERROR_UNEXPECTED_CHAR,
					"Unterminated comment"
				);

				return false;
			} else if (c == '*' && _peek(self, &c, err) && c == '/') {
				_consume(self);
				break;
			}
		}

		goto retry;

	} else if (c < 256 && strchr("-+:;./{}=\n", (char) c)) {
		*result = _maketoken(c, line, col);
		return true;

	} else if (c < 256 && isdigit((unsigned char) c)) {
		*result = _maketoken(T_NUMBER, line, col);
		return _tokenize_number(self, *result, c, err);

	} else if (g_unichar_isalpha(c) || g_unichar_type(c) == G_UNICODE_CONNECT_PUNCTUATION || g_unichar_type(c) == G_UNICODE_CURRENCY_SYMBOL) {
		*result = _maketoken(T_IDENTIFIER, line, col);
		return _tokenize_identifier(self, *result, c, err);

	} else if (c == '[') {
		*result = _maketoken(T_BINARY, line, col);
		if (!_tokenize_binary(self, *result, err)) return false;

		REQUIRE(_read(self, &c, err));
		if (c == ']') {
			return true;
		} else {
			_set_error(err,
				self,
				GSDL_SYNTAX_ERROR_MISSING_DELIMITER,
				"Missing ']'"
			);
			return false;
		}

	} else if (c == '"') {
		*result = _maketoken(T_STRING, line, col);
		if (!_tokenize_string(self, *result, err)) return false;

		REQUIRE(_read(self, &c, err));
		if (c == '"') {
			return true;
		} else {
			_set_error(err,
				self,
				GSDL_SYNTAX_ERROR_MISSING_DELIMITER,
				"Missing '\"'"
			);
			return false;
		}

	} else if (c == '`') {
		*result = _maketoken(T_STRING, line, col);
		if (!_tokenize_backquote_string(self, *result, err)) return false;

		REQUIRE(_read(self, &c, err));
		if (c == '`') {
			return true;
		} else {
			_set_error(err,
				self,
				GSDL_SYNTAX_ERROR_MISSING_DELIMITER,
				"Missing '`'"
			);
			return false;
		}

	} else if (c == '\'') {
		*result = _maketoken(T_CHAR, line, col);
		// g_unichar_to_utf8() may write up to 6 bytes; one more keeps the value NUL-terminated.
		(*result)->val = g_malloc0(7);

		_read(self, &c, err);

		if (c == '\\') {
			_read(self, &c, err);

			switch (c) {
				case 'n': c = '\n'; break;
				case 'r': c = '\r'; break;
				case 't': c = '\t'; break;
				case '"': c = '"'; break;
				case '\'': c = '\''; break;
				case '\\': c = '\\'; break;
			}
		}

		g_unichar_to_utf8(c, (*result)->val);

		REQUIRE(_read(self, &c, err));
		if (c == '\'') {
			return true;
		} else {
			_set_error(err,
				self,
				GSDL_SYNTAX_ERROR_MISSING_DELIMITER,
				"Missing \"'\""
			);
			return false;
		}

	} else if (c == '\\' && _peek(self, &nc, err) && (nc == '\r' || nc == '\n')) {
		// Line continuation: swallow the newline. c still holds the backslash here,
		// so the CRLF check has to look at the peeked character nc.
		_consume(self);

		if (nc == '\r') _read(self, &c, err);

		goto retry;

	} else if (c == ' ' || c == '\t') {
		// Do nothing
		goto retry;

	} else {
		_set_error(err,
			self,
			GSDL_SYNTAX_ERROR_UNEXPECTED_CHAR,
			g_strdup_printf("Invalid character '%s'(%d)", g_ucs4_to_utf8(&c, 1, NULL, NULL, NULL), c)
		);
		return false;
	}
}
//> Sub-tokenizers
static bool _tokenize_number(GSDLTokenizer *self, GSDLToken *result, gunichar c, GError **err) {
	int length = 7;
	char *output = result->val = g_malloc(length);

	output[0] = (char) c;
	int i = 1;

	// Leading digits.
	while (_peek(self, &c, err) && c < 256 && isdigit((unsigned char) c)) {
		GROW_IF_NEEDED(output = result->val, i + 1, length);

		_consume(self);
		output[i++] = (char) c;
	}
	FAIL_IF_ERR();

	// Remember where the suffix starts as an offset: GROW_IF_NEEDED may move the
	// buffer below, which would leave a pointer taken here dangling.
	int suffix_start = i;

	while (_peek(self, &c, err) && c < 256 && (isalpha((unsigned char) c) || isdigit((unsigned char) c))) {
		GROW_IF_NEEDED(output = result->val, i + 1, length);

		_consume(self);
		output[i++] = (char) c;
	}
	FAIL_IF_ERR();
	output[i] = '\0';

	char *suffix = output + suffix_start;

	if (*suffix == '\0') {
		// Just a T_NUMBER
		if (c == ':') {
			_consume(self);
			result->type = T_TIME_PART;
		} else if (c == '/') {
			_consume(self);
			result->type = T_DATE_PART;
		}
	} else if (strcasecmp("bd", suffix) == 0) {
		result->type = T_DECIMAL_END;
	} else if (strcasecmp("d", suffix) == 0) {
		if (c == ':') {
			_consume(self);
			result->type = T_DAYS;
		} else {
			result->type = T_DOUBLE_END;
		}
	} else if (strcasecmp("f", suffix) == 0) {
		result->type = T_FLOAT_END;
	} else if (strcasecmp("l", suffix) == 0) {
		result->type = T_LONGINTEGER;
	} else {
		_set_error(err,
			self,
			GSDL_SYNTAX_ERROR_UNEXPECTED_CHAR,
			g_strdup_printf("Unexpected number suffix: \"%s\"", suffix));

		return false;
	}

	// Strip the suffix so the token value holds only the digits.
	*suffix = '\0';
	return true;
}
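_tokenize_number() has to know where the suffix begins while its output buffer can still grow. The block below is a minimal, GLib-free sketch of keeping that position as an offset so it survives reallocation. `strbuf`, `strbuf_push()` and `split_number()` are hypothetical names for illustration only; the listing's GROW_IF_NEEDED macro is not shown, so its exact behavior is an assumption here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; stands in for result->val plus GROW_IF_NEEDED. */
typedef struct {
	char *data;
	size_t len;
	size_t cap;
} strbuf;

/* Append one byte, doubling the allocation when full. Returns 0 on failure. */
static int strbuf_push(strbuf *b, char ch)
{
	if (b->len + 1 >= b->cap) {	/* keep room for the trailing NUL */
		size_t cap = b->cap ? b->cap * 2 : 8;
		char *p = realloc(b->data, cap);

		if (!p)
			return 0;
		b->data = p;
		b->cap = cap;
	}
	b->data[b->len++] = ch;
	b->data[b->len] = '\0';
	return 1;
}

/* Copy leading digits, then trailing alphanumerics. The suffix position is
 * reported as an offset, which stays valid even if realloc() moves the data. */
static int split_number(const char *in, strbuf *out, size_t *suffix_off)
{
	while (isdigit((unsigned char) *in))
		if (!strbuf_push(out, *in++))
			return 0;

	*suffix_off = out->len;

	while (isalnum((unsigned char) *in))
		if (!strbuf_push(out, *in++))
			return 0;

	return 1;
}

/* Small self-check: the input is long enough to force at least one regrow. */
void split_number_demo(void)
{
	strbuf b = { NULL, 0, 0 };
	size_t off;

	assert(split_number("1234567890bd", &b, &off));
	assert(off == 10);
	assert(strcmp(b.data + off, "bd") == 0);
	free(b.data);
}
```

Resolving `b.data + off` only after all appends are done is the point of the sketch: a `char *` taken before the second loop could refer to memory that `realloc()` has already released.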
inline void array_end() { state_stack.pop_back(); _consume(); _check_post(); }
inline void object_end() { state_stack.pop_back(); _consume(); _check_post(); }