Character and/or character-string retrieving method and storage medium for use for this method
Abstract
A character and/or character-string retrieving method retrieves a plurality of patterns at a time by using a single deterministic finite automaton prepared from a plurality of different patterns. Also provided are a method for optimizing the number of states for this retrieving method, and a storage medium holding the programs and data necessary for executing the retrieving method and the state-number optimizing method. A plurality of regular expressions r1, r2, . . . , rn to be simultaneously retrieved by pattern matching are prepared and then augmented to form an augmented regular expression ((r1)#1)|((r2)#2)| . . . |((rn)#n). A deterministic finite automaton is constructed so that it treats states including the positions corresponding to #1, #2, . . . , #n as accepting states, thereby simultaneously retrieving a plurality of regular expression patterns while distinguishing the matches from one another.
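The construction summarized above can be sketched in a few lines. The following is a minimal illustration, not the patent's own implementation: it assumes the well-known position-based (followpos) construction of a deterministic finite automaton from an augmented regular expression, and it takes each pattern as a hand-built syntax tree rather than parsing regex text. The names `lit`, `build_dfa`, and `search` and the tuple encoding of the tree are hypothetical, introduced only for this example.

```python
from collections import defaultdict

def lit(s):
    """Hypothetical helper: syntax tree for a literal string (concatenated characters)."""
    node = ('lit', s[0])
    for ch in s[1:]:
        node = ('cat', node, ('lit', ch))
    return node

def build_dfa(patterns):
    """Build one DFA for ((r1)#1)|((r2)#2)|...|((rn)#n).

    Each pattern is a nested tuple (an assumed encoding):
    ('lit', ch), ('cat', a, b), ('alt', a, b) or ('star', a).
    """
    symbols = {}                # position -> character, or ('end', k) for end-marker #k
    follow = defaultdict(set)   # followpos table

    def visit(node):
        """Number the leaves and return (nullable, firstpos, lastpos)."""
        kind = node[0]
        if kind in ('lit', 'end'):
            p = len(symbols)
            symbols[p] = node[1] if kind == 'lit' else ('end', node[1])
            return False, {p}, {p}
        if kind == 'star':
            _, first, last = visit(node[1])
            for p in last:
                follow[p] |= first
            return True, first, last
        n1, f1, l1 = visit(node[1])
        n2, f2, l2 = visit(node[2])
        if kind == 'alt':
            return n1 or n2, f1 | f2, l1 | l2
        for p in l1:            # concatenation
            follow[p] |= f2
        return n1 and n2, (f1 | f2) if n1 else f1, (l1 | l2) if n2 else l2

    # Augment every pattern with its own end-marker and join the results with '|'.
    root = None
    for k, r in enumerate(patterns, start=1):
        branch = ('cat', r, ('end', k))
        root = branch if root is None else ('alt', root, branch)
    _, first, _ = visit(root)

    start = frozenset(first)
    trans, accepts = {}, {}
    todo, seen = [start], {start}
    while todo:
        state = todo.pop()
        # A state accepts pattern k when it contains the position of end-marker #k.
        accepts[state] = sorted(symbols[p][1] for p in state
                                if isinstance(symbols[p], tuple))
        moves = defaultdict(set)
        for p in state:
            if not isinstance(symbols[p], tuple):
                moves[symbols[p]] |= follow[p]
        for ch, target in moves.items():
            target = frozenset(target)
            trans[state, ch] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return start, trans, accepts

def search(text, dfa):
    """Report (start, end, pattern_number) for every match, trying each start offset."""
    start, trans, accepts = dfa
    hits = []
    for i in range(len(text)):
        state = start
        for j in range(i, len(text)):
            state = trans.get((state, text[j]))
            if state is None:
                break
            hits.extend((i, j + 1, k) for k in accepts[state])
    return hits

# Two example patterns: r1 = "ab" and r2 = "ab*".
dfa = build_dfa([lit("ab"), ('cat', ('lit', 'a'), ('star', ('lit', 'b')))])
print(search("xabb", dfa))
```

Because a single state can hold the positions of several end-markers, one pass over the text reports which of the patterns matched at each point; in the example, the substring "ab" is reported for both pattern 1 and pattern 2. Restarting the scan at every offset is a simplification chosen for brevity.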
15 Claims
1. A character and/or character-string retrieving method for simultaneously retrieving a plurality of specific patterns of characters and/or character-strings from objects to be searched, comprising:
- preparing a syntax ((r1)#1)|((r2)#2)| . . . |((rn)#n) on the basis of an augmented regular expression (r1)#1, (r2)#2, . . . , (rn)#n obtained by concatenating end-markers #1, #2, . . . , #n to respective regular expressions r1, r2, . . . , rn containing a plurality of characters and/or character-strings (2, . . . , n); and
- constructing a deterministic finite automaton for simultaneously retrieving a plurality of specific patterns of characters and/or character-strings from objects to be searched by distinguishing each of the characters and/or character-strings contained in the plurality of regular expressions r1, r2, . . . , rn by means of each of the end-markers #1, #2, . . . , #n attached thereto for representing accepting states 1, 2, . . . , n of the regular expressions respectively.
Dependent claims: 2, 4, 5, 6, 7, 8, 9
3. A character and/or character-string retrieving method for simultaneously retrieving a plurality of specific patterns of characters and/or character-strings from objects to be searched, comprising:
- preparing a syntax ((r1)#)|((r2)#)| . . . |((rn)#) on the basis of an augmented regular expression (r1)#, (r2)#, . . . , (rn)# by concatenating an end-marker # to respective regular expressions r1, r2, . . . , rn containing a plurality of characters and/or character-strings (2, . . . , n); and
- constructing a deterministic finite automaton for simultaneously retrieving a plurality of specific patterns of characters and/or character-strings from objects to be searched by distinguishing each of the characters and/or the character-strings contained in the plurality of regular expressions r1, r2, . . . , rn by means of each of the end-markers # attached thereto for representing accepting states 1, 2, . . . , n of the regular expressions respectively.
Dependent claims: 10
11. An article of manufacture taking the form of a computer-readable medium for simultaneously retrieving a plurality of specific patterns of characters and/or character-strings from objects to be searched, the article of manufacture comprising:
- a syntax source code segment for preparing a syntax ((r1)#1)|((r2)#2)| . . . |((rn)#n) on the basis of an augmented regular expression (r1)#1, (r2)#2, . . . , (rn)#n obtained by concatenating end-markers #1, #2, . . . , #n to respective regular expressions r1, r2, . . . , rn containing a plurality of characters and/or character-strings (2, . . . , n); and
- an automaton source code segment for constructing a deterministic finite automaton for simultaneously retrieving a plurality of specific patterns of characters and/or character-strings from objects to be searched by distinguishing each of the characters and/or the character-strings contained in the plurality of regular expressions r1, r2, . . . , rn by means of each of the end-markers #1, #2, . . . , #n attached thereto for representing accepting states 1, 2, . . . , n of the regular expressions respectively.
Dependent claims: 13, 14, 15
12. An article of manufacture taking the form of a computer-readable medium for simultaneously retrieving a plurality of specific patterns of characters and/or character-strings from objects to be searched, the article of manufacture comprising:
- a syntax source code segment for preparing a syntax ((r1)#)|((r2)#)| . . . |((rn)#) on the basis of an augmented regular expression (r1)#, (r2)#, . . . , (rn)# by concatenating an end-marker # to respective regular expressions r1, r2, . . . , rn containing a plurality of characters and/or character-strings (2, . . . , n); and
- an automaton source code segment for constructing a deterministic finite automaton for simultaneously retrieving a plurality of specific patterns of characters and/or character-strings from objects to be searched by distinguishing each of the characters and/or the character-strings contained in the plurality of regular expressions r1, r2, . . . , rn by means of each of the end-markers # attached thereto for representing accepting states 1, 2, . . . , n of the regular expressions respectively.
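As a hand-worked illustration of the single end-marker form used in claims 3 and 12, consider two hypothetical patterns r1 = ab and r2 = ab*. The example assumes the standard position-based (followpos) construction and assumes that the two occurrences of # are told apart by their positions in the syntax; the claims themselves do not spell out this mechanism.

```latex
% Augmented syntax with numbered positions (both end-markers are the same symbol #):
\[
  \bigl(\,\underset{1}{a}\;\underset{2}{b}\;\underset{3}{\#}\,\bigr)
  \;\big|\;
  \bigl(\,\underset{4}{a}\;\underset{5}{b}^{*}\;\underset{6}{\#}\,\bigr)
\]
% followpos table:
\[
  \mathrm{followpos}(1)=\{2\},\quad
  \mathrm{followpos}(2)=\{3\},\quad
  \mathrm{followpos}(4)=\{5,6\},\quad
  \mathrm{followpos}(5)=\{5,6\},\quad
  \mathrm{followpos}(3)=\mathrm{followpos}(6)=\emptyset
\]
% States (sets of positions) and transitions:
\[
  A=\{1,4\}\xrightarrow{\,a\,}B=\{2,5,6\}\xrightarrow{\,b\,}C=\{3,5,6\}
  \xrightarrow{\,b\,}D=\{5,6\}\xrightarrow{\,b\,}D
\]
% Accepting states: position 3 marks the end of r1, position 6 marks the end of r2.
\[
  B \ni 6 \Rightarrow r_2,\qquad
  C \ni 3,6 \Rightarrow r_1 \text{ and } r_2,\qquad
  D \ni 6 \Rightarrow r_2
\]
```

Under these assumptions, state C accepts both patterns at once after reading "ab", while B and D accept only r2, so the matches remain distinguishable even though every end-marker is written as the same symbol.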
Specification