tinyre ver 0.9.2

A tiny regex engine.
Plan to be compatible with "Secret Labs' Regular Expression Engine"(SRE for python).

warning: the project already works fine, but slow

Features:

utf-8 support
Cheers for unicode!
no octal number
\1 means group 1, \1-100 means group n, \01 match \1, \07 match \7, \08 match ['\0', '8'], \377 match 0o377, but \400 isn't match with 0o400 and [chr(0o40), '\0']!
What the hell ... I choose go die! Go away octal number!
custom maximum number of backtracking
An evil regex: 'a?'*n+'a'*n against 'a'*n
For example: 'a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaaaa' matches 'aaaaaaaaaaaaaaaaaaaaaaaaa'
It will takes a long time because of too many times of backtracking. Perl/Python/PCRE requires over 10^15 years to match a 29-character string.
You can set a limit to backtracking times to avoid this situation, and the match will be falied.
more than 100 groups ...
but who cares?

Supported:

"." Matches any character except a newline.
"^" Matches the start of the string.
"$" Matches the end of the string or just before the newline at the end of the string.
"*" Matches 0 or more (greedy) repetitions of the preceding RE. Greedy means that it will match as many repetitions as possible.
"+" Matches 1 or more (greedy) repetitions of the preceding RE.
"?" Matches 0 or 1 (greedy) of the preceding RE.
*?,+?,?? Non-greedy versions of the previous three special characters.
{m} Matches m copies of the previous RE.
{m,n} Matches from m to n repetitions of the preceding RE.
{m,n}? Non-greedy version of the above.
"\" Either escapes special characters or signals a special sequence.
"\1-N" Matches the text matched earlier by the group index.
[] Indicates a set of characters.
[^] A "^" as the first character indicates a complementing set.
"|" A|B, creates an RE that will match either A or B.
(...) Matches the RE inside the parentheses. The contents can be retrieved or matched later in the string.
(?ims) Set the I, M or S flag for the RE (see below).
(?:...) Non-grouping version of regular parentheses.
(?P...) The substring matched by the group is accessible by name.
(?P=name) Matches the text matched earlier by the group named name.
(?#...) A comment; ignored.
(?=...) Matches if ... matches next, but doesn't consume the string.
(?!...) Matches if ... doesn't match next.
(?<=...) Matches if preceded by ... (must be fixed length).
(?<!...) Matches if not preceded by ... (must be fixed length).
(?(id/name)yes|no) Matches yes pattern if the group with id/name matched, the (optional) no pattern otherwise.
\d \D \w \W \s \S
Flag: DOTALL
Flag: IGNORECASE
Flag: MULTILINE

Some of the functions in this module takes flags as optional parameters:

I IGNORECASE Perform case-insensitive matching.
M MULTILINE "^" matches the beginning of lines (after a newline) as well as the string. "$" matches the end of lines (before a newline) as well as the end of the string.
S DOTALL "." matches any character at all, including the newline.

Use

C/C++

#include "tinyre.h"

tre_Pattern* pattern;
tre_Match* match;

pattern = tre_compile("^(bb)*a", 0);
match = tre_match(pattern, "bbbbabc", 0);

// Group  0: bbbba
// Group  1: bb

Python

Edit CMakefile.txt, change build_target to py3lib, disable debug

project (tinyre)
#set(CMAKE_BUILD_TYPE Debug)
#set(build_target demo)
set(build_target py3lib)

mkdir build
cd build && cmake .. && make
cp ./_tinyre.so ../lib_py3

cd ../lib_py3
python3

import tre
tre.match("^(bb)*a", "bbbbabc")

Doc

基础设计
 TODO列表
 更新记录

License：zlib

Name		Name	Last commit message	Last commit date
Latest commit History 116 Commits
lib_py3		lib_py3
src		src
.gitignore		.gitignore
.travis.yml		.travis.yml
CMakeLists.txt		CMakeLists.txt
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

lib_py3

lib_py3

src

src

.gitignore

.gitignore

.travis.yml

.travis.yml

CMakeLists.txt

CMakeLists.txt

LICENSE

LICENSE

README.md

README.md

Repository files navigation

tinyre ver 0.9.2

Use

Doc

About

Releases

Packages

Languages

License

fy0/tinyre

Folders and files

Latest commit

History

Repository files navigation

tinyre ver 0.9.2

Use

Doc

About

Resources

License

Stars

Watchers

Forks

Languages