Grammar fragment acquisition using syntactic and semantic clustering
First Claim
1. A method comprising:
selecting candidate multi-word phrases from a set of words, wherein a maximum number of candidate multi-word phrases is based on a highest of a number of preceding contexts;
for each candidate multi-word phrase in the candidate multi-word phrases, generating a measurement associated with a succeeding context of a succeeding phrase and a preceding context of a preceding phrase using a similarity in the candidate multi-word phrases;
clustering, via a processor, the candidate multi-word phrases into a grammar fragment based on the measurement, wherein the grammar fragment represents similar phrases that are both syntactically and semantically coherent; and
recognizing input speech using the grammar fragment.
Abstract
A method and apparatus are provided for automatically acquiring grammar fragments for recognizing and understanding fluently spoken language. Grammar fragments representing a set of syntactically and semantically similar phrases may be generated using three probability distributions: of succeeding words, of preceding words, and of associated call-types. The similarity between phrases may be measured by applying Kullback-Leibler distance to these three probability distributions. Phrases that are close in all three distances may be clustered into a grammar fragment.
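The clustering idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy corpus, the function names, the additive smoothing constant `eps`, the symmetrized form of the Kullback-Leibler distance, the greedy single-pass clustering, and the threshold value are all assumptions made for illustration.

```python
import math
from collections import Counter


def distribution(counts, vocab, eps=1e-3):
    """Additively smoothed probabilities over a fixed vocabulary (eps is an assumed constant)."""
    total = sum(counts.values()) + eps * len(vocab)
    return {v: (counts.get(v, 0) + eps) / total for v in vocab}


def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for two dicts with identical keys."""
    return sum(p[k] * math.log(p[k] / q[k]) for k in p)


def sym_kl(p, q):
    """Symmetrized KL distance (an assumed choice; the source only says 'Kullback-Leibler distance')."""
    return kl(p, q) + kl(q, p)


def phrase_contexts(corpus, phrases):
    """Count preceding words, succeeding words, and call-types for each phrase."""
    ctx = {ph: {"prev": Counter(), "next": Counter(), "type": Counter()} for ph in phrases}
    for tokens, call_type in corpus:
        padded = ["<s>"] + tokens + ["</s>"]
        for ph in phrases:
            n = len(ph)
            for i in range(1, len(padded) - n):
                if tuple(padded[i:i + n]) == ph:
                    ctx[ph]["prev"][padded[i - 1]] += 1
                    ctx[ph]["next"][padded[i + n]] += 1
                    ctx[ph]["type"][call_type] += 1
    return ctx


def distances(ctx, a, b):
    """Return the three distances (preceding, succeeding, call-type) between phrases a and b."""
    out = {}
    for kind in ("prev", "next", "type"):
        vocab = set(ctx[a][kind]) | set(ctx[b][kind])
        out[kind] = sym_kl(distribution(ctx[a][kind], vocab),
                           distribution(ctx[b][kind], vocab))
    return out


def cluster(ctx, phrases, threshold):
    """Greedy clustering: a phrase joins a cluster only if all three distances to its seed are small."""
    clusters = []
    for ph in phrases:
        for c in clusters:
            if all(d < threshold for d in distances(ctx, c[0], ph).values()):
                c.append(ph)
                break
        else:
            clusters.append([ph])
    return clusters


# Hypothetical toy corpus of (utterance tokens, call-type) pairs.
corpus = [
    ("i want to make a collect call".split(), "collect"),
    ("i would like to make a collect call".split(), "collect"),
    ("i want to check my bill".split(), "billing"),
    ("i would like to check my bill".split(), "billing"),
]
phrases = [("want", "to"), ("would", "like", "to"), ("collect", "call")]

ctx = phrase_contexts(corpus, phrases)
fragments = cluster(ctx, phrases, threshold=1.0)
print(fragments)
```

In this toy data, "want to" and "would like to" share the same preceding words, succeeding words, and call-types, so they fall into one fragment, while "collect call" differs in its contexts and stays separate.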
20 Claims
1. A method comprising:
selecting candidate multi-word phrases from a set of words, wherein a maximum number of candidate multi-word phrases is based on a highest of a number of preceding contexts;
for each candidate multi-word phrase in the candidate multi-word phrases, generating a measurement associated with a succeeding context of a succeeding phrase and a preceding context of a preceding phrase using a similarity in the candidate multi-word phrases;
clustering, via a processor, the candidate multi-word phrases into a grammar fragment based on the measurement, wherein the grammar fragment represents similar phrases that are both syntactically and semantically coherent; and
recognizing input speech using the grammar fragment.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
selecting candidate multi-word phrases from a set of words, wherein a maximum number of candidate multi-word phrases is based on a highest of a number of preceding contexts;
for each candidate multi-word phrase in the candidate multi-word phrases, generating a measurement associated with a succeeding context of a succeeding phrase and a preceding context of a preceding phrase using a similarity in the candidate multi-word phrases;
clustering, via a processor, the candidate multi-word phrases into a grammar fragment based on the measurement, wherein the grammar fragment represents similar phrases that are both syntactically and semantically coherent; and
recognizing input speech using the grammar fragment.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
selecting candidate multi-word phrases from a set of words, wherein a maximum number of candidate multi-word phrases is based on a highest of a number of preceding contexts;
for each candidate multi-word phrase in the candidate multi-word phrases, generating a measurement associated with a succeeding context of a succeeding phrase and a preceding context of a preceding phrase using a similarity in the candidate multi-word phrases;
clustering, via a processor, the candidate multi-word phrases into a grammar fragment based on the measurement, wherein the grammar fragment represents similar phrases that are both syntactically and semantically coherent; and
recognizing input speech using the grammar fragment.
Dependent claims: 20
Specification