System and method of extracting clauses for spoken language understanding
Abstract
A clausifier and method of extracting clauses for spoken language understanding are disclosed. The method relates to generating a set of clauses from speech utterance text and comprises inserting at least one boundary tag in speech utterance text related to sentence boundaries, inserting at least one edit tag indicating a portion of the speech utterance text to remove, and inserting at least one conjunction tag within the speech utterance text. The result is a set of clauses that may be identified within the speech utterance text according to the inserted at least one boundary tag, at least one edit tag and at least one conjunction tag. The disclosed clausifier comprises a sentence boundary classifier, an edit detector classifier, and a conjunction detector classifier. The clausifier may comprise a single classifier or a plurality of classifiers to perform the steps of identifying sentence boundaries, editing text, and identifying conjunctions within the text.
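The tagging-then-splitting flow described in the abstract can be illustrated with a minimal sketch. The tag encoding below (`<s>`, `<eN>`, `<cN>`), the function name `extract_clauses`, and the choice to drop the conjunction words from the output are all assumptions made for illustration; the source does not specify a concrete tag syntax, and this is not the patented implementation.

```python
import re

# Hypothetical tag encoding (an assumption, not taken from the source):
#   <s>   boundary tag: a clause ends after the preceding word
#   <eN>  edit tag: the N words to the left are removed
#   <cN>  conjunction tag: the N words to the left form a coordinating conjunction
TAG = re.compile(r"<(s|e|c)(\d*)>")


def extract_clauses(tagged_text):
    """Return the clauses implied by the tags in whitespace-tokenized text."""
    clauses, current = [], []

    def close_clause():
        # Emit the words collected so far as one clause, if there are any.
        if current:
            clauses.append(" ".join(current))
            current.clear()

    for token in tagged_text.split():
        match = TAG.fullmatch(token)
        if match is None:
            current.append(token)  # ordinary word
            continue
        kind, span = match.group(1), int(match.group(2) or 0)
        if kind in ("e", "c") and span > 0:
            del current[-span:]  # drop the edited words or the conjunction words
        if kind in ("s", "c"):
            close_clause()  # boundaries and conjunctions both end a clause
    close_clause()
    return clauses


tagged = "yes I I <e1> want to fly to boston <s> and then <c2> I need a car <s>"
print(extract_clauses(tagged))
# ['yes I want to fly to boston', 'I need a car']
```

Here the three kinds of tags are assumed to have been inserted already by the classifiers; the sketch only shows how clauses could be read off the tagged text.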
17 Claims
1. A method comprising:
inserting, via a discriminative classification approach, boundary tags into speech utterance text, the boundary tags identifying boundaries selected from a group comprising phrase boundaries, sentence boundaries, and paragraph boundaries, wherein the discriminative classification approach utilizes syntactic features before and after each word being tagged, to yield boundary marked speech utterance text and unedited text;
identifying, via a processor, a coordinating conjunction within the unedited text based on a conjunction tag, wherein the conjunction tag comprises conjunction span information indicating how many words to the left of the conjunction tag a corresponding conjunction includes; and
identifying clauses in the speech utterance text based on the boundary marked speech utterance text and the coordinating conjunction.
Dependent claims: 2, 3, 4, 5, 6
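Claim 1 recites a discriminative classification approach that uses syntactic features before and after each word being tagged. A minimal sketch of such a context window follows, assuming part-of-speech tags stand in for the syntactic features; the function name `window_features`, the feature keys, the `<pad>` placeholder, and the default window width are hypothetical and do not come from the source.

```python
def window_features(words, pos_tags, index, width=2):
    """Collect word and POS features in a window around words[index].

    Positions that fall outside the utterance are filled with "<pad>".
    The resulting dict could be passed to any discriminative classifier
    that decides whether a tag follows the word at `index`.
    """
    features = {}
    for offset in range(-width, width + 1):
        position = index + offset
        if 0 <= position < len(words):
            word, pos = words[position].lower(), pos_tags[position]
        else:
            word, pos = "<pad>", "<pad>"
        # Keys encode the signed offset from the word being tagged.
        features[f"word[{offset:+d}]"] = word
        features[f"pos[{offset:+d}]"] = pos
    return features


words = ["I", "want", "to"]
pos_tags = ["PRP", "VBP", "TO"]
print(window_features(words, pos_tags, index=0, width=1))
# {'word[-1]': '<pad>', 'pos[-1]': '<pad>', 'word[+0]': 'i', 'pos[+0]': 'PRP',
#  'word[+1]': 'want', 'pos[+1]': 'VBP'}
```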
7. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
inserting, via a discriminative classification approach, boundary tags into speech utterance text, the boundary tags identifying boundaries selected from a group comprising phrase boundaries, sentence boundaries, and paragraph boundaries, wherein the discriminative classification approach utilizes syntactic features before and after each word being tagged, to yield boundary marked speech utterance text and unedited text;
identifying a coordinating conjunction within the unedited text based on a conjunction tag, wherein the conjunction tag comprises conjunction span information indicating how many words to the left of the conjunction tag a corresponding conjunction includes; and
identifying clauses in the speech utterance text based on the boundary marked speech utterance text and the coordinating conjunction.
Dependent claims: 8, 9, 10, 11, 12
13. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
inserting, via a discriminative classification approach, boundary tags into speech utterance text, the boundary tags identifying boundaries selected from a group comprising phrase boundaries, sentence boundaries, and paragraph boundaries, wherein the discriminative classification approach utilizes syntactic features before and after each word being tagged, to yield boundary marked speech utterance text and unedited text;
identifying a coordinating conjunction within the unedited text based on a conjunction tag, wherein the conjunction tag comprises conjunction span information indicating how many words to the left of the conjunction tag a corresponding conjunction includes; and
identifying clauses in the speech utterance text based on the boundary marked speech utterance text and the coordinating conjunction.
Dependent claims: 14, 15, 16, 17
Specification