Information analysis and method
Abstract
A hash dictionary stores identifiers of word-class sequences by unit of the word-class and appearance number. A node for each word in the word-lattice includes the word-class, an interprocessing list, a list of next nodes, and a list of unprocessed antecedent nodes. The interprocessing list of each node directly linked from the start node stores the identifiers of word-class sequences for the word-class and the appearance number “1” in the hash dictionary. A list of processing nodes stores the nodes directly linked from the start node. A propagation section extracts one node from the list of processing nodes, extracts each next node of the one node from the list of next nodes, retrieves the identifiers of word-class sequences from the hash dictionary by the appearance number and the word-class of each next node, calculates for each next node a product of the retrieved identifiers and the identifiers in the interprocessing list of the one node, stores the product in the interprocessing list of that next node, deletes the one node from the list of processing nodes, and adds each next node to the list of processing nodes. This propagation process is repeated until the list of processing nodes is empty.
69 Citations
20 Claims
1. An information analysis apparatus, which connects after a speech recognizer for recognizing a user's speech input sentence and for generating a word-lattice linking a plurality of words from a start node to an end node as recognition candidates of the speech input sentence, for accepting the word-lattice as input and for generating a set of acceptable word sequences by referring to a word-class dictionary that matches each word to corresponding word-class and to a grammar that matches each word-class sequence to corresponding sentence type, comprising:
a hash dictionary means for storing a plurality of identifiers of word-class sequences each of which represents a sentence by unit of the word-class and word appearance number, the word-class being positioned at the word appearance number in the word-class sequence;
an initialization means for forming a node for each word in the word-lattice, the node including the word-class, an interprocessing list, lists of next nodes and unprocessed antecedent nodes in the word-lattice, the interprocessing list of the node directly linked from the start node represents the identifiers of word-class sequences for corresponding word-class and the word appearance number “1” in said hash dictionary means, and for forming a list of processing nodes representing the nodes directly linked from the start node;
a propagation means for extracting one node from the list of processing nodes if the list of unprocessed antecedent nodes of the one node is empty, for extracting each next node of the one node from the list of next nodes if the list of next nodes is not empty, for retrieving the identifiers of word-class sequences from said hash dictionary means by the word appearance number as link order and the word-class of the each next node, for respectively calculating a product of retrieved identifiers of the each next node and the identifiers in the interprocessing list of the one node, for storing the product as propagated identifiers in the interprocessing list of the each next node, for deleting the one node from the list of unprocessed antecedent nodes of the each next node and from the list of processing nodes, and for adding the each next node in the list of processing nodes;
repeat means for repeating the process of said propagation means until the list of processing nodes is empty; and
word sequence extraction means for extracting the propagated identifiers of the end node if the list of processing nodes is empty, and for extracting word sequences corresponding to the word-class sequences of the propagated identifiers from the word-lattice.
2. The information analysis apparatus according to claim 1,
wherein said hash dictionary means derives from the grammar that stores a representative of the word-class sequence by unit of acceptable sentence-type.
3. The information analysis apparatus according to claim 2,
wherein the representative of the word-class sequence is expanded to a plurality of word-class sequences by unit of acceptable sentence-type as patterns of word-class sequences.
4. The information analysis apparatus according to claim 3,
wherein the identifier of the pattern of the word-class sequence is stored by unit of the word-class and the word appearance number in said hash dictionary means if the word-class is positioned at the word appearance number in the pattern of word-class sequence.
5. The information analysis apparatus according to claim 1,
wherein the node further includes a word identifier as the recognition candidate and a list of antecedent nodes, and wherein the list of antecedent nodes represents the word identifier which directly links to a particular node in the word-lattice, and the list of next nodes represents the word identifiers which directly links from the particular node in the word-lattice.
6. The information analysis apparatus according to claim 5,
wherein said initialization means copies the list of antecedent nodes to the list of unprocessed antecedent nodes for each node at initialization mode.
7. The information analysis apparatus according to claim 6,
wherein the interprocessing list of each node except for the node directly linked from the start node is empty at the initialization mode.
8. The information analysis apparatus according to claim 1,
wherein the link order is a number of words from the start node to the next node in the word-lattice and is used as the word appearance number to retrieve the identifiers of word-class sequences from said hash dictionary means.
9. The information analysis apparatus according to claim 1,
wherein said propagation means calculates the product as common identifier included in both the retrieved identifiers of the next node and the identifiers in the interprocessing list of the one node.
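Claims 2 to 4 derive the hash dictionary from grammar representatives, and claims 8 and 9 describe the lookup by link order and the product of identifiers. The following Python sketch shows one way this could look. The slot notation for a representative (a list of alternatives per slot, with `None` for an omissible slot), the sequential identifiers, and the function names `expand` and `register` are assumptions made for illustration and do not come from the claims.

```python
from collections import defaultdict
from itertools import product

def expand(representative):
    """Expand a representative into concrete word-class sequences (claim 3).

    Each slot lists its alternatives; None marks an omissible slot
    (hypothetical notation, not taken from the claims).
    """
    return [tuple(c for c in choice if c is not None)
            for choice in product(*representative)]

def register(patterns):
    """Store each pattern identifier under (word-class, appearance number) (claim 4)."""
    table = defaultdict(set)
    for identifier, sequence in enumerate(patterns, start=1):
        for appearance_number, word_class in enumerate(sequence, start=1):
            table[(word_class, appearance_number)].add(identifier)
    return table

representative = [["PRON"], ["VERB"], ["DET", None], ["NOUN"]]
patterns = expand(representative)
table = register(patterns)

# Claim 8: the link order of the next node is used as the appearance number.
link_order = 3
# Claim 9: the product keeps only the identifiers common to both sets.
propagated = table[("NOUN", link_order)] & {1, 2}
print(patterns, propagated)
```

Here the representative expands to two patterns, with and without the determiner, and a NOUN at link order 3 keeps only the identifier of the shorter pattern.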
10. An information analysis method, as a post-processing of a speech recognizer for recognizing a user's speech input sentence and for generating a word-lattice linking a plurality of words from a start node to an end node as recognition candidates of the speech input sentence, for accepting the word-lattice as input and for generating a set of acceptable word sequences by referring to a word-class dictionary that matches each word to corresponding word-class and to a grammar that matches each word-class sequence to corresponding sentence type, comprising the steps of:
storing a plurality of identifiers of word-class sequences each of which represents a sentence by unit of the word-class and word appearance number in a hash dictionary, the word-class being positioned at the word appearance number in the word-class sequence;
forming a node for each word in the word-lattice, the node including the word-class, an interprocessing list, lists of next nodes and unprocessed antecedent nodes in the word-lattice, the interprocessing list of the node directly linked from the start node represents the identifiers of word-class sequences for corresponding word-class and the word appearance number “1” in said hash dictionary;
forming a list of processing nodes representing the nodes directly linked from the start node;
extracting one node from the list of processing nodes if the list of unprocessed antecedent nodes of the one node is empty;
extracting each next node of the one node from the list of next nodes if the list of next nodes is not empty;
retrieving the identifiers of word-class sequences from said hash dictionary by the word appearance number as link order and the word-class of the each next node;
respectively calculating a product of retrieved identifiers of the each next node and the identifiers in the interprocessing list of the one node;
storing the product as propagated identifiers in the interprocessing list of the each next node;
deleting the one node from the list of unprocessed antecedent nodes of the each next node and from the list of processing nodes;
adding the each next node in the list of processing nodes;
repeating the one node-extracting step, each next node-extracting step, retrieving step, calculating step, storing step, deleting step and adding step until the list of processing nodes is empty;
extracting the propagated identifiers of the end node if the list of processing nodes is empty; and
extracting word sequences corresponding to the word-class sequences of the propagated identifiers from the word-lattice.
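The final step of the method, extracting the word sequences that correspond to an accepted word-class sequence, can be sketched as a depth-first walk over the lattice. This is an illustrative assumption about how the extraction could be done, not the claimed procedure itself; the lattice, the pattern, and names such as `extract_word_sequences` and `last_nodes` (the nodes linked directly to the end node) are hypothetical.

```python
def extract_word_sequences(pattern, words, word_class, next_nodes,
                           first_nodes, last_nodes):
    """Return every word sequence in the lattice whose classes equal pattern."""
    results = []

    def walk(node, position, prefix):
        # Abandon the path as soon as the class at this position differs.
        if word_class[node] != pattern[position]:
            return
        prefix = prefix + [words[node]]
        if position == len(pattern) - 1:
            # A full match must also end on a node linked to the end node.
            if node in last_nodes:
                results.append(prefix)
            return
        for nxt in next_nodes.get(node, ()):
            walk(nxt, position + 1, prefix)

    for node in first_nodes:
        walk(node, 0, [])
    return results

# Hypothetical lattice with node identifiers a..e.
words = {"a": "I", "b": "eye", "c": "see", "d": "sea", "e": "cats"}
word_class = {"a": "PRON", "b": "NOUN", "c": "VERB", "d": "NOUN", "e": "NOUN"}
next_nodes = {"a": ["c", "d"], "b": ["c", "d"], "c": ["e"], "d": ["e"]}

sentences = extract_word_sequences(("PRON", "VERB", "NOUN"), words, word_class,
                                   next_nodes, ["a", "b"], {"e"})
print(sentences)
```

Because the propagated identifiers have already narrowed the candidate patterns, a walk like this only needs to run for the few patterns that reached the end node.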
11. A computer-readable memory, as a post-processing of a speech recognizer for recognizing a user's speech input sentence and for generating a word-lattice linking a plurality of words from a start node to an end node as recognition candidates of the speech input sentence, for accepting the word-lattice as input and for generating a set of acceptable word sequences by referring to a word-class dictionary that matches each word to corresponding word-class and to a grammar that matches each word-class sequence to corresponding sentence type, comprising:
instruction means for causing a computer to store a plurality of identifiers of word-class sequences each of which represents a sentence by unit of the word-class and word appearance number in a hash dictionary, the word-class being positioned at the word appearance number in the word-class sequence;
instruction means for causing a computer to form a node for each word in the word-lattice, the node including the word-class, an interprocessing list, lists of next nodes and unprocessed antecedent nodes in the word-lattice, the interprocessing list of the node directly linked from the start node represents the identifiers of word-class sequences for corresponding word-class and the word appearance number “1” in said hash dictionary;
instruction means for causing a computer to form a list of processing nodes representing the nodes directly linked from the start node;
instruction means for causing a computer to extract one node from the list of processing nodes if the list of unprocessed antecedent nodes of the one node is empty;
instruction means for causing a computer to extract each next node of the one node from the list of next nodes if the list of next nodes is not empty;
instruction means for causing a computer to retrieve the identifiers of word-class sequences from said hash dictionary by the word appearance number as link order and the word-class of the each next node;
instruction means for causing a computer to respectively calculate a product of retrieved identifiers of the each next node and the identifiers in the interprocessing list of the one node;
instruction means for causing a computer to store the product as propagated identifiers in the interprocessing list of the each next node;
instruction means for causing a computer to delete the one node from the list of unprocessed antecedent nodes of the each next node and from the list of processing nodes;
instruction means for causing a computer to add the each next node in the list of processing nodes;
instruction means for causing a computer to repeat the one node-extracting step, each next node-extracting step, retrieving step, calculating step, storing step, deleting step and adding step until the list of processing nodes is empty;
instruction means for causing a computer to extract the propagated identifiers of the end node if the list of processing nodes is empty; and
instruction means for causing a computer to extract word sequences corresponding to the word-class sequences of the propagated identifiers from the word-lattice.
12. An information analysis apparatus, which connects after a speech recognizer for recognizing a user's speech input sentence and for generating a word-lattice linking a plurality of words from a start node to an end node as recognition candidates of the speech input sentence, for accepting the word-lattice as input and for generating a set of acceptable word sequences by referring to a word-class dictionary that matches each word to corresponding word-class and to a grammar that matches each word-class sequence to corresponding sentence type, comprising:
a hierarchical hash dictionary means for storing a plurality of identifiers of partial word-class sequences each of which represents a phrase in a sentence by unit of the word-class and word appearance number, the word-class being positioned at the word appearance number in the partial word-class sequence, and for storing a plurality of identifiers of word-class sequences each of which represents the sentence by unit of the word-class, the phrase and appearance number, the word-class or the phrase being positioned at the appearance number in the sentence;
an initialization means for forming a node for each word in the word-lattice, the node including the word-class, an interprocessing list, lists of next nodes and unprocessed antecedent nodes in the word-lattice, the interprocessing list of the node directly linked from the start node represents the identifiers of partial word-class sequences for corresponding word-class and the word appearance number “1” in said hierarchical hash dictionary means, and for forming a list of processing nodes representing the nodes directly linked from the start node;
first propagation means for extracting one node from the list of processing nodes if the list of the unprocessed antecedent nodes of the one node is empty, for extracting each next node of the one node from the list of next nodes if the list of next nodes is not empty, for retrieving the identifiers of partial word-class sequences from said hierarchical hash dictionary means by the word appearance number as link order and the word-class of the each next node, for respectively calculating a product of retrieved identifiers of the each next node and the identifiers in the interprocessing list of the one node, for storing the product in the interprocessing list of the each next node, for deleting the one node from the list of unprocessed antecedent nodes of the each next node and from the list of processing nodes, and for adding the each next node in the list of processing nodes;
first repeat means for repeating the process of said first propagation means until the list of processing nodes is empty;
partial word sequence extraction means for extracting a partial word sequence corresponding to the partial word-class sequence in the word-lattice if the list of processing nodes is empty and the partial word-class sequence is identified in the word-lattice;
node addition means for adding a phrase node as the partial word sequence in the word-lattice, the phrase node including an interprocessing list, lists of next nodes and unprocessed antecedent nodes in the word-lattice, and for initializing the interprocessing list of each node and the list of processing nodes, the interprocessing list of the node and the phrase node directly linked from the start node represents the identifiers of word-class sequences for corresponding word-class and the appearance number “1” in said hierarchical hash dictionary means, the list of processing nodes represents the node and the phrase node directly linked from the start node;
second repeat means for repeating process of said first propagation means, first repeat means, partial word sequence extraction means and node addition means for other phrase if a plurality of phrases are defined in said hierarchical hash dictionary means and the other phrase does not include a phrase not added in the word-lattice;
second propagation means for extracting one node from the list of processing nodes if the list of the unprocessed antecedent nodes of the one node is empty, for extracting each next node of the one node from the list of next nodes if the list of next nodes is not empty, for retrieving the identifiers of word-class sequences from said hierarchical hash dictionary means by the appearance number as link order and the word-class of the each next node, for respectively calculating a product of retrieved identifiers of the each next node and the identifiers in the interprocessing list of the one node, for storing the product as propagated identifiers in the interprocessing list of the each next node, for deleting the one node from the list of unprocessed antecedent nodes of the each next node and from the list of processing nodes, and for adding the each next node in the list of processing nodes;
third repeat means for repeating the process of said second propagation means until the list of processing nodes is empty; and
word sequence extraction means for extracting the propagated identifiers of the end node if the list of processing nodes is empty, and for extracting word sequences corresponding to the word-class sequences of the propagated identifiers from the word-lattice.
13. The information analysis apparatus according to claim 12,
wherein said hierarchical hash dictionary means derives from the grammar which stores a representative of the word-class sequence by unit of acceptable sentence-type.
14. The information analysis apparatus according to claim 13,
wherein the partial word-class sequence is commonly extracted as the phrase from a plurality of representatives of the word-class sequences, a pattern of the partial word-class sequence is registered with an identifier, and the partial word-class sequence in the plurality of representatives is replaced by the phrase.
15. The information analysis apparatus according to claim 14,
wherein the identifier of the partial word-class sequence is stored by unit of the word-class and the word appearance number as the phrase pattern in said hierarchical hash dictionary means if the word-class is positioned at the word appearance number in the phrase pattern, and the identifier of the word-class sequence is stored by unit of the word-class, the phrase and the appearance number as the sentence pattern in said hierarchical hash dictionary means if the word-class or the phrase is positioned at the appearance number in the sentence pattern.
16. The information analysis apparatus according to claim 14,
further comprising a connectable word-class list in which neighboring two word-classes in the partial word-class sequence are registered as connectable word-classes in the phrase, and neighboring word-class and phrase or neighboring two word-classes in the word-class sequence are registered as connectable word-classes in the sentence.
17. The information analysis apparatus according to claim 16,
wherein the plurality of words as the recognition candidates are linked from the start point to the end point by referring to an appearance position of each word in the speech input sentence and a connectable condition between two word-classes in said connectable word-class list.
18. The information analysis apparatus according to claim 17,
wherein the phrase node as the partial word sequence is additionally linked in the word-lattice by referring to an appearance position of the partial word sequence and a connectable condition between the phrase and neighboring word-class in said connectable word-class list.
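Claims 14 to 16 describe replacing a shared partial word-class sequence by a phrase, indexing phrase patterns and sentence patterns separately, and registering connectable neighbors. A rough Python sketch of those three ideas follows. The identifiers (`S1`, `S2`, `P1`), the phrase name `NP`, and the helper names are hypothetical, and the layout of the two tables is an assumption rather than the structure recited in the claims.

```python
from collections import defaultdict

def replace_phrase(sequence, phrase_name, phrase_pattern):
    """Replace each occurrence of phrase_pattern in sequence by phrase_name."""
    out, i, n = [], 0, len(phrase_pattern)
    while i < len(sequence):
        if tuple(sequence[i:i + n]) == tuple(phrase_pattern):
            out.append(phrase_name)
            i += n
        else:
            out.append(sequence[i])
            i += 1
    return tuple(out)

def index_by_position(patterns):
    """Map (unit, appearance number) to the identifiers of matching patterns."""
    table = defaultdict(set)
    for identifier, sequence in patterns.items():
        for position, unit in enumerate(sequence, start=1):
            table[(unit, position)].add(identifier)
    return table

def connectable_pairs(patterns):
    """Collect every pair of neighboring units found in the patterns."""
    return {(a, b) for seq in patterns.values() for a, b in zip(seq, seq[1:])}

# Hypothetical sentence patterns sharing the partial sequence DET NOUN.
sentences = {"S1": ("PRON", "VERB", "DET", "NOUN"), "S2": ("DET", "NOUN", "VERB")}
phrases = {"P1": ("DET", "NOUN")}

reduced = {ident: replace_phrase(seq, "NP", phrases["P1"])
           for ident, seq in sentences.items()}

phrase_table = index_by_position(phrases)      # lower level: phrase patterns
sentence_table = index_by_position(reduced)    # upper level: sentence patterns
phrase_links = connectable_pairs(phrases)
sentence_links = connectable_pairs(reduced)
print(reduced, sorted(sentence_links))
```

One plausible benefit of the two-level layout is that a phrase shared by many sentence patterns is stored once in the lower table instead of being repeated in every sentence pattern.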
19. An information analysis method, as a post-processing of a speech recognizer for recognizing a user's speech input sentence and for generating a word-lattice linking a plurality of words from a start node to an end node as recognition candidates of the speech input sentence, for accepting the word-lattice as input and for generating a set of acceptable word sequences by referring to a word-class dictionary that matches each word to corresponding word-class and to a grammar that matches each word-class sequence to corresponding sentence type, comprising the steps of:
storing a plurality of identifiers of partial word-class sequences each of which represents a phrase in a sentence by unit of the word-class and word appearance number in a hierarchical hash dictionary, the word-class being positioned at the word appearance number in the partial word-class sequence;
storing a plurality of identifiers of word-class sequences each of which represents the sentence by unit of the word-class, the phrase and appearance number in the hierarchical hash dictionary, the word-class or the phrase being positioned at the appearance number in the sentence;
forming a node for each word in the word-lattice, the node including the word-class, an interprocessing list, lists of next nodes and unprocessed antecedent nodes in the word-lattice, the interprocessing list of the node directly linked from the start node represents the identifiers of partial word-class sequences for corresponding word-class and the word appearance number “1” in said hierarchical hash dictionary;
forming a list of processing nodes representing the nodes directly linked from the start node;
extracting one node from the list of processing nodes if the list of the unprocessed antecedent nodes of the one node is empty;
extracting each next node of the one node from the list of next nodes if the list of next nodes is not empty;
retrieving the identifiers of partial word-class sequences from said hierarchical hash dictionary by the word appearance number as link order and the word-class of the each next node;
respectively calculating a product of retrieved identifiers of the each next node and the identifiers in the interprocessing list of the one node;
storing the product in the interprocessing list of the each next node;
deleting the one node from the list of unprocessed antecedent nodes of the each next node and from the list of processing nodes;
adding the each next node in the list of processing nodes;
repeating the one node-extracting step, each next node-extracting step, retrieving step, calculating step, storing step, deleting step and adding step until the list of processing nodes is empty;
extracting a partial word sequence corresponding to the partial word-class sequence in the word-lattice if the list of processing nodes is empty and the partial word-class sequence is identified in the word-lattice;
adding a phrase node as the partial word sequence in the word-lattice, the phrase node including an interprocessing list, lists of next nodes and unprocessed antecedent nodes in the word-lattice;
initializing the interprocessing list of each node and the list of processing nodes, the interprocessing list of the node and the phrase node directly linked from the start node represents the identifiers of word-class sequences for corresponding word-class and the appearance number “1” in the hierarchical hash dictionary, the list of processing nodes represents the node and the phrase node directly linked from the start node;
repeating the one node-extracting step, each next node-extracting step, retrieving step, calculating step, storing step, deleting step, each next node-adding step, repeating step, partial word sequence-extracting step, phrase node-adding step and initializing step for other phrase if a plurality of phrases are defined in said hierarchical hash dictionary and the other phrase does not include a phrase not added in the word-lattice;
extracting one node from the list of processing nodes if the list of unprocessed antecedent nodes of the one node is empty;
extracting each next node of the one node from the list of next nodes if the list of next nodes is not empty;
retrieving the identifiers of word-class sequences from said hierarchical hash dictionary by the appearance number as link order and the word-class of the each next node;
respectively calculating a product of retrieved identifiers of the each next node and the identifiers in the interprocessing list of the one node;
storing the product as propagated identifiers in the interprocessing list of the each next node;
deleting the one node from the list of unprocessed antecedent nodes of the each next node and from the list of processing nodes;
adding the each next node in the list of processing nodes;
repeating the one node-extracting step, each next node-extracting step, retrieving step, calculating step, storing step, deleting step and adding step until the list of processing nodes is empty;
extracting the propagated identifiers of the end node if the list of processing nodes is empty; and
extracting word sequences corresponding to the word-class sequences of the propagated identifiers from the word-lattice.
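The phrase node-adding step of the hierarchical method can be pictured with the following Python sketch. It is a simplification under stated assumptions: it looks for the phrase pattern starting from any node (the claim text only speaks of propagation from the nodes linked from the start node), it treats a node without successors as linked to the end node, and every name in it (`find_spans`, `add_phrase_nodes`, `unit_paths`, `NP#0`) is hypothetical.

```python
def find_spans(pattern, unit, next_nodes):
    """Return all node chains whose units equal pattern, starting at any node."""
    spans = []

    def walk(node, position, chain):
        if unit[node] != pattern[position]:
            return
        chain = chain + [node]
        if position == len(pattern) - 1:
            spans.append(chain)
            return
        for nxt in next_nodes.get(node, []):
            walk(nxt, position + 1, chain)

    for node in list(unit):
        walk(node, 0, [])
    return spans

def add_phrase_nodes(name, pattern, unit, next_nodes, first_nodes):
    """Add one phrase node per matching chain, in parallel with its words."""
    added = []
    for k, chain in enumerate(find_spans(pattern, unit, next_nodes)):
        phrase, head, tail = f"{name}#{k}", chain[0], chain[-1]
        unit[phrase] = name
        # The phrase node leads wherever the last word of the chain leads.
        next_nodes[phrase] = list(next_nodes.get(tail, []))
        # Every antecedent of the first word also links to the phrase node.
        for successors in next_nodes.values():
            if head in successors:
                successors.append(phrase)
        if head in first_nodes:
            first_nodes.append(phrase)
        added.append((phrase, chain))
    return added

def unit_paths(first_nodes, unit, next_nodes):
    """List the unit sequence of every path from a first node to a final node."""
    paths = []

    def walk(node, prefix):
        prefix = prefix + (unit[node],)
        successors = next_nodes.get(node, [])
        if not successors:
            paths.append(prefix)
        for nxt in successors:
            walk(nxt, prefix)

    for node in first_nodes:
        walk(node, ())
    return paths

# Hypothetical lattice for "I see the cats".
unit = {"a": "PRON", "c": "VERB", "g": "DET", "e": "NOUN"}
next_nodes = {"a": ["c"], "c": ["g"], "g": ["e"]}
first_nodes = ["a"]

added = add_phrase_nodes("NP", ("DET", "NOUN"), unit, next_nodes, first_nodes)
paths = unit_paths(first_nodes, unit, next_nodes)
print(added, paths)
```

After the phrase node is added, the lattice offers both the word-level path and a shorter path through the phrase node, and the second propagation pass can match the latter against sentence patterns that use the phrase as a unit.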
20. A computer-readable memory, as a post-processing of a speech recognizer for recognizing a user's speech input sentence and for generating a word-lattice linking a plurality of words from a start node to an end node as recognition candidates of the speech input sentence, for accepting the word-lattice as input and for generating a set of acceptable word sequences by referring to a word-class dictionary that matches each word to corresponding word-class and to a grammar that matches each word-class sequence to corresponding sentence type, comprising:
instruction means for causing a computer to store a plurality of identifiers of partial word-class sequences each of which represents a phrase in a sentence by unit of the word-class and word appearance number in a hierarchical hash dictionary, the word-class being positioned at the word appearance number in the partial word-class sequence;
instruction means for causing a computer to store a plurality of identifiers of word-class sequences each of which represents the sentence by unit of the word-class, the phrase and appearance number in the hierarchical hash dictionary, the word-class or the phrase being positioned at the appearance number in the sentence;
instruction means for causing a computer to form a node for each word in the word-lattice, the node including the word-class, an interprocessing list, lists of next nodes and unprocessed antecedent nodes in the word-lattice, the interprocessing list of the node directly linked from the start node represents the identifiers of partial word-class sequences for corresponding word-class and the word appearance number “1” in said hierarchical hash dictionary;
instruction means for causing a computer to form a list of processing nodes representing the nodes directly linked from the start node;
instruction means for causing a computer to extract one node from the list of processing nodes if the list of the unprocessed antecedent nodes of the one node is empty;
instruction means for causing a computer to extract each next node of the one node from the list of next nodes if the list of next nodes is not empty;
instruction means for causing a computer to retrieve the identifiers of partial word-class sequences from said hierarchical hash dictionary by the word appearance number as link order and the word-class of the each next node;
instruction means for causing a computer to respectively calculate a product of retrieved identifiers of the each next node and the identifiers in the interprocessing list of the one node;
instruction means for causing a computer to store the product in the interprocessing list of the each next node;
instruction means for causing a computer to delete the one node from the list of unprocessed antecedent nodes of the each next node and from the list of processing nodes;
instruction means for causing a computer to add the each next node in the list of processing nodes;
instruction means for causing a computer to repeat the one node-extracting step, each next node-extracting step, retrieving step, calculating step, storing step, deleting step and adding step until the list of processing nodes is empty;
instruction means for causing a computer to extract a partial word sequence corresponding to the partial word-class sequence in the word-lattice if the list of processing nodes is empty and the partial word-class sequence is identified in the word-lattice;
instruction means for causing a computer to add a phrase node as the partial word sequence in the word-lattice, the phrase node including an interprocessing list, lists of next nodes and unprocessed antecedent nodes in the word-lattice;
instruction means for causing a computer to initialize the interprocessing list of each node and the list of processing nodes, the interprocessing list of the node and the phrase node directly linked from the start node represents the identifiers of word-class sequences for corresponding word-class and the appearance number “1” in the hierarchical hash dictionary, the list of processing nodes represents the node and the phrase node directly linked from the start node;
instruction means for causing a computer to repeat the one node-extracting step, each next node-extracting step, retrieving step, calculating step, storing step, deleting step, each next node-adding step, repeating step, partial word sequence-extracting step, phrase node-adding step and initializing step for other phrase if a plurality of phrases are defined in said hierarchical hash dictionary and the other phrase does not include a phrase not added in the word-lattice;
instruction means for causing a computer to extract one node from the list of processing nodes if the list of unprocessed antecedent nodes of the one node is empty;
instruction means for causing a computer to extract each next node of the one node from the list of next nodes if the list of next nodes is not empty;
instruction means for causing a computer to retrieve the identifiers of word-class sequences from said hierarchical hash dictionary by the appearance number as link order and the word-class of the each next node;
instruction means for causing a computer to respectively calculate a product of retrieved identifiers of the each next node and the identifiers in the interprocessing list of the one node;
instruction means for causing a computer to store the product as propagated identifiers in the interprocessing list of the each next node;
instruction means for causing a computer to delete the one node from the list of unprocessed antecedent nodes of the each next node and from the list of processing nodes;
instruction means for causing a computer to add the each next node in the list of processing nodes;
instruction means for causing a computer to repeat the one node-extracting step, each next node-extracting step, retrieving step, calculating step, storing step, deleting step and adding step until the list of processing nodes is empty;
instruction means for causing a computer to extract the propagated identifiers of the end node if the list of processing nodes is empty; and
instruction means for causing a computer to extract word sequences corresponding to the word-class sequences of the propagated identifiers from the word-lattice.
Specification