quietfanatic/ltm
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
ltm: Little Tree Matcher A versatile tree-based regex engine. By quietfanatic, under the terms of the Artistic License. Under Construction. :) This document is more of a self-reference than a guide for (non-existent) users. Several things will be (and already are) out of date. Okay, here's what's going on. To start matching, you need an MSpec tree. Use the Create_<nodetype>() functions to make one. Use the destroy_mspec() function to recursively free any MSpec tree. Each MSpec object has a particular type, such as a character, or group, or alternation. When matched to a string, the MSpec node will produce a Match node, usually of the same type as the MSpec node but not always. Most notably, a node that fails to match will usually produce a NoMatch node. Nodes are generally named mixed case, but to produce the integer id of a node type, use the name in all caps. For example, MCharClass describes a character class node, but MCHARCLASS is the id of a MCharClass. Here are the node types, with any required arguments to their constructor: NoMatch The only match type not beginning with M. A Match of this type indicates failure to match, and an MSpec of this type will never succeed in matching. MNull This is a zero-width match. It always succeeds in matching. MBegin Zero-width. Matches at the beginning of the string. MEnd Zero-width. Matches at the end of the string. MAny Matches any one character. The only time it'll ever fail is at the end of the string. MChar (MChar_t c) Matches one character if it's equal to c. MCharClass (int nranges, MSpecCharRange ranges...) Matches one character in a class. The class is specified by multiple MSpecCharRange structs, each of which has a MChar_t .from and MChar_t .to. A range with the same .from and .to will match only that character. MGroup (int nelements, MSpec elements...) Matches its elements in a sequence. one after the other. If one fails, it'll backtrack through the ones before it to see if there are any other ways they can match. MOpt (MSpec possible) An optional match. Matches possible if possible, otherwise produces MNull rather than failing. MAlt (int nalts, MSpec alts...) Goes through alts in order and matches the first one that matches. Backtracking will make it try the next one, and so forth. MRepMax (MSpec child, int min, int max) Matches child at least min times, and at most max times. Prefers to match more rather than less, but will match less if backtracked into. MRepMin (MSpec child, int min, int max) Same as MRepMax except it matches as little as possible, matching more only if backtracked into. This and MRepMax will be consolidated, with a flag to differentiate them. MScope (MSpec child) Creates a lexical scope for captures. Any MCapture and MCaptureName nodes that are matched inside this will be directly accessible from here (by number or by name, respectively). The MSpec object will know how many captures it contains, and captures that are not matched will be NULL. This node will likely be placed automatically by whatever regex language you're running. If the Scope is not the outermost node or is not in a Capture node, it may be difficult to access. MCap (MSpec child) Captures the child match for quick access. It'll be numbered by its position in the tree of the MScope that contains it. Since a capture can be matched multiple times (due to being in a repetition), the MScope will see an array of captures. A capture not in a scope will generate an error. MNameCap (MSpec child, char* name) Captures the child match, but assigns it a name instead of a number. Multiple captures with the same name will be arrayed together. MMultiCap This is a special node for storing the results of captures matched multiple times. You can't create it yourself. MRef Pointer to another MSpec object. This won't be descended into during destruction. This allows for recursion and referencing other rules without problems. For versatility this is another node type rather than a flag. It won't produce its own Match node, it'll just collapse instead. ...UNIMPLEMENTED... MObject (MObjectSpec* object) This is an extensible type for your own matching needs. Uses a struct filled with function pointers implementing matching behavior. Details TBD.
About
Little Tree Matcher
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published