High-speed data searching apparatus and method capable of operation in retrospective and dissemination modes
First Claim
1. High speed data searching apparatus capable of operation in retrospective and dissemination modes, the apparatus comprising:
- a plurality of functionally identical search cells, wherein each search cell includes a plurality of registers;
means for connecting corresponding registers in each cell together to form a number of serially connected pipelines of registers through which data can be passed;
means for inputting into the apparatus a first data stream from a database to be searched; and
means for inputting into the apparatus a second data stream defining a pattern to be searched for in the database;
wherein each search cell includes means for comparing a database character with a pattern character and means for controlling further operation of the cell in response to the result of each such comparison;
and wherein some of the registers in each cell are designated to hold a database character and associated data, and others of the registers are designated to hold a pattern character and associated data;
and wherein either one of the data streams is first stored in the designated registers of the search cells, and then the other of the data streams is streamed through the search cells to perform the search, whereby the search may be performed in either a retrospective mode in which pattern characters are first stored in and database characters are then streamed through the apparatus, or in a dissemination mode in which database characters are first stored in and pattern characters are then streamed through the apparatus;
and wherein the first data stream includes a sequence of database characters and parallel sequences of associated data, one of the parallel sequences being a sequence of character marks, and wherein a character mark of a particular state indicates that the corresponding database character is to be compared with a pattern character in any cell in which both characters are present.
4 Assignments
0 Petitions
Accused Products
Abstract
A highly versatile data search engine in which multiple search cells are connected together in a pipeline through which data can be streamed. Each cell has multiple registers, and corresponding registers in each cell are connected together to form the pipeline. Characters in two data streams are compared in the pipeline, a first data stream that includes a sequence of database characters and parallel sequences of associated data, and a second data stream that includes a sequence of pattern characters and parallel sequences of associated data. The parallel data sequences associated with the pattern data include coded signals that control cell operation. One of the parallel data sequences associated with the database is a sequence of character marks, which are used to indicate search starting points in the database character sequence, and which are propagated along the database character sequence as a result of successive character matches between the pattern and database characters. Use of the character marks facilitates searches in both forward and reverse directions, and permits selected database characters to be ignored. The search engine can be operated in either a retrospective mode, in which the database is streamed through cells containing a stationary pattern, or in dissemination mode, in which patterns are streamed through cells containing a stationary database.
55 Citations
48 Claims
-
1. High speed data searching apparatus capable of operation in retrospective and dissemination modes, the apparatus comprising:
-
a plurality of functionally identical search cells, wherein each search cell includes a plurality of registers; means for connecting corresponding registers in each cell together to form a number of serially connected pipelines of registers through which data can be passed; means for inputting into the apparatus a first data stream from a database to be searched; and means for inputting into the apparatus a second data stream defining a pattern to be searched for in the database; wherein each search cell includes means for comparing a database character with a pattern character and means for controlling further operation of the cell in response to the result of each such comparison; and wherein some of the registers in each cell are designated to hold a database character and associated data, and others of the registers are designated to hold a pattern character and associated data; and wherein either one of the data streams is first stored in the designated registers of the search cells, and then the other of the data streams is streamed through the search cells to perform the search, whereby the search may be performed in either a retrospective mode in which pattern characters are first stored in and database characters are then streamed through the apparatus, or in a dissemination mode in which database characters are first stored in and pattern characters are then streamed through the apparatus; and wherein the first data stream includes a sequence of database characters and parallel sequences of associated data, one of the parallel sequences being a sequence of character marks, and wherein a character mark of a particular state indicates that the corresponding database character is to be compared with a pattern character in any cell in which both characters are present. - View Dependent Claims (2, 3, 4)
-
-
5. High speed data searching apparatus capable of operation in retrospective and dissemination modes, the apparatus comprising:
-
a plurality of functionally identical search cells, wherein each search cell includes a plurality of registers; means for connecting corresponding registers in each cell together to form a number of serially connected pipelines of registers through which data can be passed; means for inputting into the apparatus a first data stream from a database to be searched; means for inputting into the apparatus a second data stream defining a pattern to be searched for in the database; wherein each search cell includes means for comparing a database character with a pattern character and means for controlling further operation of the cell in response to the result of each such comparison; and wherein some of the registers in each cell are designated to hold a database character and associated data, and others of the registers are designated to hold a pattern character and associated data; and wherein either one of the data streams is first stored in the designated registers of the search cells, and then the other of the data streams is streamed through thee search cells to perform the search, whereby the search may be performed in either a retrospective mode in which pattern characters are first stored in and database characters are then streamed through the apparatus, or in a dissemination mode in which database characters are first stored in and pattern characters are then streamed through the apparatus; and wherein the first data stream includes a sequence of database characters and parallel sequences of associated data, one of the parallel sequences being a sequence of character marks; the second data stream includes a sequence of pattern characters and parallel sequences of coded data that determine the manner in which comparisons and other functions are performed in each element of the pattern; and a character mark of a particular state indicates that the corresponding database character is to be compared with a pattern character in any cell in which both characters are present. - View Dependent Claims (6, 7, 8, 9, 10, 11, 12, 13, 14)
-
-
15. A generalized method for searching databases in either retrospective or dissemination mode, the method comprising the steps of:
-
inputting a first data stream into a pipeline of search cells, each having a plurality of registers, with corresponding registers of each cell being serially connected to form the pipeline, wherein the first data stream contains a sequence of database characters and parallel sequences of other data associated with the database, one of the parallel sequences being a sequence of character marks; inputting a second data stream into the pipeline of search cells, although not necessarily after input of the first data stream, wherein the second data stream contains a sequence of pattern characters and parallel sequences of other data, including coded data to control operation of the cells, and wherein one of the first and second data streams is shifted into the pipeline prior to a search and the other of the data streams is streamed through the pipeline in performance of the search; setting all of the character marks to a nonzero value, to mark all of the characters in the database character sequence and to indicate initially that every character is a candidate for the beginning of a search; comparing each marked database character with a first pattern character; unmarking all characters that do not match the first pattern character; shifting any remaining marks toward the end of the sequence of database characters; comparing each marked database character with subsequent pattern characters in turn; and after each comparison, unmarking any nonmatching database characters and shifting any remaining marks toward the end of the sequence of database characters; whereby any string of characters in the database that matches the pattern will be marked at the character position next following the last character in the matching string. - View Dependent Claims (16, 17, 18, 19, 20, 21, 22, 23, 24)
-
-
25. High speed data searching apparatus, comprising:
-
a plurality of functionally identical search cells, wherein each search cell includes a plurality of registers; means for connecting corresponding registers in each cell together to form a number of serially connected pipelines of registers through which data can be passed; means for inputting into the apparatus a first data stream from a database to be searched; and means for inputting into the apparatus a second data stream defining a pattern to be searched for in the database; wherein each search cell includes means for comparing a database character with a pattern character and means for controlling further operation of the cell in response to the result of each such comparison; and wherein some of the registers in each cell are designated to hold a database character and associated data, and others of the registers are designated to hold a pattern character and associated data; and wherein either one of the data streams is first stored in the designated registers of the search cells, and then the other of the data streams is streamed through the search cells to perform the search; and wherein the first data stream includes a sequence of database characters and parallel sequences of associated data, one of the parallel sequences being a sequence of character marks, and wherein a character mark of a particular state indicates that the corresponding database character is to be compared with a pattern character in any cell in which both characters are present. - View Dependent Claims (26, 27, 28)
-
-
29. High speed data searching apparatus, comprising:
-
a plurality of functionally identical search cells, wherein each search cell includes a plurality of registers; means for connecting corresponding registers in each cell together to form a number of serially connected pipelines of registers through which data can be passed; means for inputting into the apparatus a first data stream from a database to be searched; and means for inputting into the apparatus a second data stream defining a pattern to be searched for in the database; wherein each search cell includes means for comparing a database character with a pattern character and means for controlling further operation of the cell in response to the result of each such comparison; and wherein some of the registers in each cell are designated to hold a database character and associated data, and others of the registers are designated to hold a pattern character and associated data; and wherein either one of the data streams is first stored in the designated registers of the search cells, and then the other of the data streams is streamed through the search cells to perform the search; and
whereinthe first data stream includes a sequence of database characters and parallel sequences of associated data, one of the parallel sequences being a sequence of character marks; the second data stream includes a sequence of pattern characters and parallel sequences of coded data that determine the manner in which comparisons and other functions are performed in each element of the pattern; and a character mark of a particular state indicates that the corresponding database character is to be compared with a pattern character in any cell in which both characters are present. - View Dependent Claims (30, 31, 32, 33, 34, 35, 36, 37, 38)
-
-
39. A generalized method for searching databases, the method comprising the steps of:
-
inputting a first data stream into a pipeline of search cells, each having a plurality of registers, with corresponding registers of each cell being serially connected to form the pipeline, wherein the first data stream contains a sequence of database characters and parallel sequences of other data associated with the database, one of the parallel sequences being a sequence of character marks; inputting a second data stream into the pipeline of search cells, although not necessarily after input of the first data stream, wherein the second data stream contains a sequence of pattern characters and parallel sequences of other data, including coded data to control operation of the cells, and wherein one of the first and second data streams is shifted into the pipeline prior to a search and the other of the data streams is streamed through the pipeline in performance of the search; setting all of the character marks to a nonzero value, to mark all of the characters in the database character sequence and to indicate initially that every character is a candidate for the beginning of a search; comparing each marked database character with a first pattern character; unmarking all characters that do not match the first pattern character; shifting any remaining marks toward the end of the sequence of database characters; comparing each marked database character with subsequent pattern characters in turn; and after each comparison, unmarking any nonmatching database characters and shifting any remaining marks toward the end of the sequence of database characters; whereby any string of characters in the database that matches the pattern will be marked at the character position next following the last character in the matching string. - View Dependent Claims (40, 41, 42, 43, 44, 45, 46, 47, 48)
-
Specification