Natural language processing of disfluent sentences
Abstract
An advanced model that includes new processes is provided for use as a component of an effective disfluency identifier. The disfluency identifier tags edited words in transcribed speech. A speech recognition unit, a part-of-speech tagger, a disfluency identifier, and a parser together form a natural language system that helps machines properly interpret spoken utterances.
19 Claims
1. A method for processing spoken language comprising:
converting spoken words into a text word sequence;
tagging words in the text word sequence with part-of-speech (POS) tags;
tagging edited words in the text word sequence using a disfluence identifier that operates with a feature set created with techniques comprising:
matching only the highest level POS tags in a multi-level hierarchy of such tags; and
parsing the text word sequence into machine instructions with the aid of POS-tag and edited-word-tag information.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
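To illustrate the step of matching only the highest level POS tags in a multi-level hierarchy, here is a minimal Python sketch. It assumes Penn Treebank style fine-grained tags grouped under coarse classes. The grouping in `POS_HIERARCHY` and the helper names `top_level` and `tags_match` are hypothetical and are not taken from the patent, which does not spell out its hierarchy in the claims.

```python
# Hypothetical two-level hierarchy: fine-grained Penn Treebank style tags
# grouped under coarse classes. This grouping is an illustrative assumption,
# not the hierarchy defined in the patent.
POS_HIERARCHY = {
    "N": {"NN", "NNS", "NNP", "NNPS"},
    "V": {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"},
    "ADJ": {"JJ", "JJR", "JJS"},
    "ADV": {"RB", "RBR", "RBS"},
}

# Invert the hierarchy so each fine tag maps to its top-level class.
TOP_LEVEL = {
    fine: coarse
    for coarse, fine_tags in POS_HIERARCHY.items()
    for fine in fine_tags
}


def top_level(tag: str) -> str:
    """Return the highest-level class of a tag; unlisted tags stand for themselves."""
    return TOP_LEVEL.get(tag, tag)


def tags_match(tag_a: str, tag_b: str) -> bool:
    """Compare two tags at the top level of the hierarchy only."""
    return top_level(tag_a) == top_level(tag_b)
```

Under this assumed grouping, `tags_match("NN", "NNS")` is true because both tags fall under the same coarse class, while `tags_match("NN", "VB")` is false. Comparing at the coarse level lets a repair such as a singular noun replaced by a plural noun still count as a match.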
9. A system for processing spoken language comprising:
a speech recognition unit for converting spoken words into a text word sequence;
a part-of-speech (POS) tagger for tagging words in the text word sequence with part-of-speech tags;
a disfluence identifier for tagging edited words in the text word sequence;
wherein the disfluence identifier operates with a feature set created with techniques comprising:
matching only the highest level POS tags in a multi-level hierarchy of such tags; and
a parser for parsing the text word sequence into machine instructions with the aid of POS-tag and edited-word-tag information.
Dependent claims: 10, 11, 12, 13, 14, 15
16. A method for creating a disfluence identifier model comprising:
analyzing the distribution of speech repairs in transcribed speech;
choosing conditioning variables commensurate with the distribution of speech repairs;
using a rough copy identifier with the conditioning variables to generate a feature set; and
weighting the feature set according to an iterative algorithm run on training data.
Dependent claims: 17, 18, 19
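The last two steps of claim 16 (generating features with a rough copy identifier, then weighting them iteratively on training data) can be sketched roughly as follows. This is a toy illustration under stated assumptions. The rough copy identifier is reduced to exact matching of immediately repeated word sequences, the only conditioning variable is the word's POS tag, and a simple perceptron update stands in for the iterative algorithm, which the claim does not name. All function names are hypothetical.

```python
from collections import defaultdict


def rough_copy_sources(words, max_len=3):
    """Indices of words that are immediately repeated.

    Exact-match stand-in for a rough copy identifier (an assumption).
    """
    marked = set()
    for n in range(1, max_len + 1):
        for i in range(len(words) - 2 * n + 1):
            if words[i:i + n] == words[i + n:i + 2 * n]:
                marked.update(range(i, i + n))
    return marked


def extract_features(words, tags):
    """One feature list per word: its POS tag and a rough-copy flag."""
    sources = rough_copy_sources(words)
    return [
        [f"tag={tag}", f"in_copy_source={i in sources}"]
        for i, tag in enumerate(tags)
    ]


def train_perceptron(examples, epochs=5):
    """Iteratively weight features; labels are True for edited words.

    The perceptron is a swapped-in technique, not the patent's algorithm.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        for words, tags, labels in examples:
            for feats, gold in zip(extract_features(words, tags), labels):
                predicted = sum(weights[f] for f in feats) > 0
                if predicted != gold:
                    delta = 1.0 if gold else -1.0
                    for f in feats:
                        weights[f] += delta
    return weights


def tag_edited(words, tags, weights):
    """Return True for each word the weighted features mark as edited."""
    return [
        sum(weights.get(f, 0.0) for f in feats) > 0
        for feats in extract_features(words, tags)
    ]
```

A small usage example with a single made-up training utterance, in which the first "I want" is the edited region:

```python
examples = [(
    ["I", "want", "I", "want", "to", "go"],
    ["PRP", "VBP", "PRP", "VBP", "TO", "VB"],
    [True, True, False, False, False, False],
)]
weights = train_perceptron(examples)
print(tag_edited(["we", "we", "go"], ["PRP", "PRP", "VBP"], weights))
```

A real model would be trained on a corpus of transcribed speech with annotated repairs and would use a much richer feature set than this two-feature toy.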
Specification