Generating speech recognition grammars from a large corpus of data
First Claim
1. A method of generating an expandable speech recognition grammar for use with a speech recognition engine comprising:
- parsing a corpus of data using a processor to generate an annotated corpus of data identifying grammatical structures and grammatical parts of speech within the corpus of data, wherein the corpus of data comprises a plurality of well formed sentences, and wherein the parsing comprises providing for each identified grammatical structure and grammatical part of speech a tag labeling each identified grammatical structure and grammatical part of speech accordingly;
- using the processor to compare the identified grammatical structures and the identified grammatical parts of speech within the annotated corpus of data with grammar generation rules to designate particular ones of the identified grammatical structures and the identified grammatical parts of speech to include within a speech recognition grammar to be generated, wherein the grammar generation rules further designate, independently of a context of the corpus of data and any words already included in the expandable speech grammar, particular grammatical parts of speech and grammatical structures to be included within the expandable speech recognition grammar to be generated; and
- using the processor to include within the expandable speech recognition grammar one or more words associated with the grammatical structures within the annotated corpus of data which have been identified in said parsing step and designated in said comparing step by the grammar generation rules, exclusive of words already included in the expandable speech recognition grammar, wherein the grammar is generated without use of counter-examples associated with the corpus of data.
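The three steps of claim 1 (parse and tag, compare against grammar generation rules, include only words not already present) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the toy lexicon `TOY_LEXICON` stands in for a real parser, the tag names (`NOUN`, `VERB`, and so on) are arbitrary, and `GENERATION_RULES` is a hypothetical rule set that designates parts of speech only. The claim also covers larger grammatical structures, which this sketch does not model.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicon standing in for a real parser/tagger (assumption).
TOY_LEXICON: Dict[str, str] = {
    "the": "DET", "a": "DET",
    "pilot": "NOUN", "runway": "NOUN", "tower": "NOUN",
    "cleared": "VERB", "contacted": "VERB",
    "quickly": "ADV",
}

# Hypothetical grammar generation rules: the tags designated for inclusion.
# They depend only on the tag, not on corpus context or on the grammar's
# current contents.
GENERATION_RULES: Set[str] = {"NOUN", "VERB"}

TaggedSentence = List[Tuple[str, str]]


def parse_corpus(sentences: List[str]) -> List[TaggedSentence]:
    """Step 1: annotate each sentence with one tag per word."""
    annotated = []
    for sentence in sentences:
        tokens = sentence.lower().rstrip(".").split()
        annotated.append([(tok, TOY_LEXICON.get(tok, "UNK")) for tok in tokens])
    return annotated


def expand_grammar(
    grammar: Dict[str, Set[str]],
    annotated: List[TaggedSentence],
    rules: Set[str],
) -> List[Tuple[str, str]]:
    """Steps 2 and 3: keep rule-designated tags, add only unseen words."""
    added = []
    for sentence in annotated:
        for word, tag in sentence:
            if tag not in rules:          # step 2: not designated by the rules
                continue
            words = grammar.setdefault(tag, set())
            if word not in words:         # step 3: exclusive of existing words
                words.add(word)
                added.append((tag, word))
    return added


corpus = [
    "The pilot contacted the tower.",
    "The tower cleared the runway quickly.",
]
grammar: Dict[str, Set[str]] = {"NOUN": {"pilot"}}  # pre-existing grammar entry
added = expand_grammar(grammar, parse_corpus(corpus), GENERATION_RULES)
```

In this sketch the grammar grows only from positive examples in the corpus; no counter-examples are consulted, and a word such as "pilot" that is already present is skipped rather than added again.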
Abstract
A method of generating a speech recognition grammar for use with a speech recognition system can include parsing the corpus of data to identify grammatical structures within the corpus of data. The identified grammatical structures can be compared with grammar generation rules to determine particular ones of the identified grammatical structures to include within the speech recognition grammar. The grammar generation rules can designate which grammatical structures are to be included within the speech recognition grammar. The grammatical structures which have been identified in the parsing step and which also have been designated by the grammar generation rules can be included in the speech recognition grammar.
4 Claims
Specification