Pattern search method, pattern search apparatus and computer program therefor, and storage medium thereof
First Claim
1. A pattern search method for searching a target character string for a desired pattern comprising:
- a range search step of obtaining intermediate patterns by adding characters in order, one by one, from the last character of said pattern to the first, and determining a range for a suffix array, which corresponds to said target character string, wherein the first character of each of said intermediate patterns is present; and
a character string extraction step of designating elements of said character string that correspond to elements included in said range of said suffix array obtained for said pattern by said search, and extracting character string segments consisting of the same number of elements as said elements of said pattern and having said elements of said character string as their first characters.
0 Assignments
0 Petitions
Accused Products
Abstract
A fast search is performed of a large text database, while suppressing an increase in the data size of the data structure used for the process. A pattern search method for searching a target character string for a desired pattern includes: a range search step and a character string extraction step. At the range search step, intermediate patterns are obtained by adding characters in order, one by one, from the last character of the pattern to the first, and a range is determined for a suffix array, which corresponds to the target character string, wherein the first character of each of the intermediate patterns is present. Then, at the character string extraction step, elements of the character string are designated that correspond to elements included in the range of the suffix array, and character string segments are extracted consisting of the same number of elements as the elements of the pattern and having the elements of the character string as their first characters.
69 Citations
15 Claims
-
1. A pattern search method for searching a target character string for a desired pattern comprising:
-
a range search step of obtaining intermediate patterns by adding characters in order, one by one, from the last character of said pattern to the first, and determining a range for a suffix array, which corresponds to said target character string, wherein the first character of each of said intermediate patterns is present; and
a character string extraction step of designating elements of said character string that correspond to elements included in said range of said suffix array obtained for said pattern by said search, and extracting character string segments consisting of the same number of elements as said elements of said pattern and having said elements of said character string as their first characters. - View Dependent Claims (2)
-
-
3. A pattern search method for finding a desired pattern in a search target array comprising the steps of:
-
finding the location, in said array, of the last element of said desired pattern; and
obtaining intermediate patterns, when said pattern consists of a plurality of elements, by adding elements one by one, beginning with an element located immediately before said last element, and determining the locations of said intermediate patterns in said array.
-
-
4. A pattern search method for searching a target character string for a desired pattern comprising:
-
a range search step of obtaining intermediate patterns by adding characters in order, one by one, from the first character of said pattern to the last, and determining a range for a prefix array, which corresponds to said target character string, wherein the last character of each of said intermediate patterns is present; and
a character string extraction step of designating elements of said character string that correspond to elements included in said range of said prefix array obtained for said pattern by said search, and extracting character string segments consisting of the same number of elements as said elements of said pattern and having said elements of said character string as their last characters.
-
-
5. A pattern search method for searching a target genetic array for a desired pattern comprising:
-
a range search step of obtaining intermediate patterns by adding elements in order, one by one, from the last element of said pattern to the first, and determining a range for a suffix array, which corresponds to said target genetic array, wherein the first element of each of said intermediate patterns is present; and
an array extraction step of designating elements of said genetic array that correspond to elements included in said range of said suffix array obtained for said pattern by said search, and extracting array segments consisting of the same number of elements as said elements of said pattern and having said elements of said genetic array as their first characters.
-
-
6. A pattern search apparatus for searching a search target character string for a desired pattern comprising:
-
a pre-processor, for constructing a data structure to search for a pattern base in a suffix array for said target character string; and
a search unit, for employing said data structure constructed by said pre-processor to search for said desired pattern, wherein, to construct said data structure, said pre-processor designates a preposition character positioned immediately before each character of said character string that corresponds to elements of said suffix array, and obtains, for each type of character forming said target character string, the number of preposition characters in the elements preceding a predetermined element in said suffix array. - View Dependent Claims (7, 8, 9, 10)
-
-
11. A computer program, which permits a computer to conduct a search of a target array for a desired pattern, comprising:
-
a process for obtaining intermediate patterns by adding elements in order, one by one, from the last element of said pattern to the first, and for determining the range of a suffix array, which corresponds to said target array, wherein the first element of each of said intermediate patterns is present; and
a process for designating elements of said array that correspond to elements included in said range of said suffix array obtained for said pattern by said search, and for extracting an array segment that consists of the same number of elements as said elements of said pattern and that has the same elements of said array as have the first elements. - View Dependent Claims (13)
-
-
12. A computer program, which permits a computer to search a target array for a desired pattern, comprising:
-
a process for designating a preposition element positioned immediately before said elements of said array that correspond to said elements of said suffix array that is consonant with said target array;
a process for obtaining, for each type of element of said array, the number of preposition elements for elements that are located before a predetermined element in said suffix array;
a process for, based on said number of preposition elements of said elements located before said predetermined element in said suffix array, obtaining intermediate patterns by adding elements in order, one by one, from the last element of said pattern to the first, and determining a range for said suffix array, which corresponds to said target array, wherein the first element of each of said intermediate patterns is present; and
a process for designating said elements of said array that correspond to elements included in said range of said suffix array that is obtained for said desired pattern by the search, and for extracting an array segment that consists of the same number of elements as said elements of said pattern and that has the same elements of said array as have the first elements.
-
-
14. A storage medium on which input means of a computer stores a computer-readable program that permits said computer to perform the processing for searching a target array for a desired pattern, and that includes, as said processing for searching a desired pattern:
-
a process for obtaining intermediate patterns by adding elements in order, one by one, from the last element of said pattern to the first, and for determining the range of a suffix array, which corresponds to said target array, wherein the first element of each of said intermediate patterns is present; and
a process for designating elements of said array that correspond to elements included in said range of said suffix array obtained for said pattern by said search, and for extracting an array segment that consists of the same number of elements as said elements of said pattern and that has the same elements of said array as have the first elements.
-
-
15. A storage medium on which input means of a computer stores a computer-readable program that permits said computer to perform the processing for searching a target array for a desired pattern, and that includes, as said processing for searching a desired pattern:
-
a process for designating a preposition element positioned immediately before said elements of said array that correspond to said elements of said suffix array that is consonant with said target array;
a process for obtaining, for each type of element of said array, the number of preposition elements for elements that are located before a predetermined element in said suffix array;
a process for, based on said number of preposition elements of said elements located before said predetermined element in said suffix array, obtaining intermediate patterns by adding elements in order, one by one, from the last element of said pattern to the first, and determining a range for said suffix array, which corresponds to said target array, wherein the first element of each of said intermediate patterns is present; and
a process for designating said elements of said array that correspond to elements included in said range of said suffix array that is obtained for said desired pattern by the search, and for extracting an array segment that consists of the same number of elements as said elements of said pattern and that has the same elements of said array as have the first elements.
-
Specification