WO2013159156 - METHOD FOR STORING AND APPLYING RELATED SETS OF PATTERN/MESSAGE RULES

Note: Text based on automatic Optical Character Recognition processes. Please use the PDF version for legal matters

THE CLAIMS:

1. A method for generating annotations for a block of text T using a ruleset S, the method comprising the steps of:

(a) storing a plurality of rulesets containing a plurality of rules created by a plurality of entities, each rule comprising a text pattern and a message;

(b) representing a plurality of rulesets in a data structure D that allows any ruleset R to be applied to a block of text to generate annotations such that the operation has a time complexity less than O(RT); and

(c) using D to apply a particular ruleset S to T to generate annotations.

2. The method of claim 1 wherein the data structure D includes at least one boolean vector; where step (c) of claim 1 includes matching T with at least the rules in S and at least one other rule and using the boolean vector to filter the matches.

3. The method of claim 2 wherein the boolean vectors are represented in a compressed form by compressing them independently.

4. The method of claim 2 wherein the boolean vectors are represented in a compressed form by identifying redundancies within the entire set of boolean vectors.

5. The method of claim 4 wherein each boolean vector is represented using a tree structure, where the nodes of the tree are stored in a content-addressed data structure.

6. The method of claim 2 wherein priority vectors are used instead of boolean vectors.

7. The method of claim 6 wherein the priority vectors are represented in a compressed form by compressing them independently.

8. The method of claim 6 wherein the priority vectors are represented in a compressed form by identifying redundancies within the entire set of priority vectors.

9. The method of claim 8 wherein each priority vector is represented using a tree structure, where the nodes of the tree are stored in a content-addressed data structure that stores all the priority vectors.

10. The method of claim 1, wherein the data structure D consists of a data structure for each ruleset that allows the patterns of the ruleset to be applied to a block of text T wherein step (c) includes using the data structure corresponding to ruleset S to generate annotations.

11. The method of claim 11 wherein each ruleset's data structure is a tree structure whose nodes represent strings and whose arcs are labelled with strings, wherein each ruleset's tree structure can point to subtrees in other rulesets' trees to reduce duplication.

12. The method of claim 10 wherein each ruleset data structure is a hash table containing every pattern in the ruleset, wherein each hash table is stored within a digital search tree whose nodes are stored in a content-addressed store.

13. The method of claim 10 wherein each ruleset data structure is a hash table containing each pattern and its ancestor nodes, wherein each hash table is stored within a digital search tree whose nodes are stored in a content-addressed store.
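By way of illustration only, the arrangement of claims 1 and 2 may be sketched as follows. The sketch is not taken from the specification: the names `Rule`, `RuleStore`, `add_ruleset` and `annotate` are hypothetical, each pattern is assumed to be a single lowercase word, and a Python integer used as a bitmask stands in for the boolean vector. The text T is matched once against a shared index of all stored rules, and the vector for ruleset S then filters the matches, so the scan does not iterate over the rules of S for each position in T.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    pattern: str   # assumption: a single lowercase word
    message: str


class RuleStore:
    """Hypothetical stand-in for the data structure D of claim 1."""

    def __init__(self):
        self.rules = []     # every rule from every ruleset, by global rule id
        self.index = {}     # pattern -> list of global rule ids
        self.vectors = {}   # ruleset name -> boolean vector held as an int bitmask

    def add_ruleset(self, name, rules):
        mask = 0
        for rule in rules:
            rule_id = len(self.rules)
            self.rules.append(rule)
            self.index.setdefault(rule.pattern, []).append(rule_id)
            mask |= 1 << rule_id          # set this rule's bit in the vector
        self.vectors[name] = mask

    def annotate(self, name, text):
        """Match text against the shared index, then filter by the ruleset's vector."""
        mask = self.vectors[name]
        annotations = []
        for position, word in enumerate(text.lower().split()):
            for rule_id in self.index.get(word, ()):
                if (mask >> rule_id) & 1:  # keep only matches belonging to this ruleset
                    annotations.append((position, word, self.rules[rule_id].message))
        return annotations


store = RuleStore()
store.add_ruleset("alice", [Rule("utilize", "Prefer 'use'."),
                            Rule("very", "Consider removing.")])
store.add_ruleset("bob", [Rule("utilize", "Too formal."),
                          Rule("irregardless", "Use 'regardless'.")])
result = store.annotate("bob", "We utilize very odd words irregardless")
```

Here `result` contains only the two annotations from the ruleset named "bob"; the match on "very" and the other entity's rule for "utilize" are found by the shared index but discarded by the vector. A practical matcher for multi-word patterns would replace the word lookup with a multi-pattern string matcher, which this sketch does not attempt.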
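By way of illustration only, one possible reading of claims 4 and 5 is sketched below. The sketch assumes that each boolean vector has a length that is a power of two, that it is stored as a binary tree, and that SHA-256 of a node's contents serves as its address; none of these choices is stated in the claims, and the names `NodeStore`, `put`, `build` and `get` are hypothetical. Because a node's key is derived from its contents, identical subtrees occurring in different vectors are stored once, which is one way of exploiting redundancy across the entire set of vectors.

```python
import hashlib


class NodeStore:
    """Content-addressed store: each node is keyed by a hash of its own contents."""

    def __init__(self):
        self.nodes = {}   # hex digest -> node contents

    def put(self, content):
        key = hashlib.sha256(repr(content).encode()).hexdigest()
        self.nodes[key] = content   # identical contents map to the same key
        return key

    def build(self, bits):
        """Store a boolean vector (length a power of two) as a tree; return the root key."""
        if len(bits) == 1:
            return self.put(("leaf", bool(bits[0])))
        half = len(bits) // 2
        return self.put(("node", self.build(bits[:half]), self.build(bits[half:])))

    def get(self, root, index, length):
        """Read bit `index` of the `length`-bit vector rooted at `root`."""
        key = root
        while length > 1:
            _, left, right = self.nodes[key]
            length //= 2
            if index < length:
                key = left
            else:
                key, index = right, index - length
        return self.nodes[key][1]


vectors = NodeStore()
a = vectors.build([1, 0, 0, 0, 1, 1, 0, 0])
b = vectors.build([1, 0, 0, 0, 0, 0, 0, 0])
```

The two vectors share their first half, so the subtree for `[1, 0, 0, 0]` is stored once and referenced from both roots. The priority vectors of claims 8 and 9 could be held in the same way by storing a priority value rather than a boolean in each leaf, again as an assumption rather than a statement of the claimed method.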