Method, device and system for part-of-speech disambiguation
Abstract
A method (300), device (408), and system (400) provide part-of-speech disambiguation for words based on hybrid neural-network and stochastic processing. The method disambiguates the part-of-speech tags of text tokens by obtaining a set of probabilistically annotated tags for each text token, determining a locally predicted tag for each text token based on the local context of the text token, determining an alternative tag for each text token based on the expanded context of the text token, and choosing between the locally predicted tag and the alternative tag when the locally predicted tag and the alternative tag are different.
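The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the tag labels (DET, NOUN, VERB), and the three toy components below are assumptions made for illustration only. The local tagger here simply picks the most probable candidate, which is a simplification of a tagger that uses the tags of adjacent tokens.

```python
from typing import Callable, Dict, List

# A set of probabilistically annotated tags for one token: tag -> probability.
TagProbs = Dict[str, float]


def disambiguate(
    tokens: List[str],
    candidates: List[TagProbs],
    local_tagger: Callable[[List[TagProbs], int], str],
    expanded_tagger: Callable[[List[str], List[TagProbs], int], str],
    discriminator: Callable[[str, str, str], str],
) -> List[str]:
    """Return one tag per token; the discriminator is consulted only on disagreement."""
    result = []
    for i, token in enumerate(tokens):
        local = local_tagger(candidates, i)           # locally predicted tag
        alt = expanded_tagger(tokens, candidates, i)  # alternative tag
        result.append(local if local == alt else discriminator(token, local, alt))
    return result


# Hypothetical stand-ins for the three components.
def most_probable(candidates: List[TagProbs], i: int) -> str:
    # Simplified local tagger: take the highest-probability candidate.
    return max(candidates[i], key=candidates[i].get)


def after_determiner(tokens: List[str], candidates: List[TagProbs], i: int) -> str:
    # Toy expanded-context rule: prefer a noun reading right after "the".
    if i > 0 and tokens[i - 1].lower() == "the" and "NOUN" in candidates[i]:
        return "NOUN"
    return most_probable(candidates, i)


def trust_expanded(token: str, local: str, alt: str) -> str:
    # Toy discriminator: always side with the expanded-context tagger.
    return alt


tokens = ["the", "run"]
candidates = [{"DET": 1.0}, {"VERB": 0.6, "NOUN": 0.4}]
tags = disambiguate(tokens, candidates, most_probable, after_determiner, trust_expanded)
print(tags)
```

In this toy input the two taggers agree on "the" and disagree on "run", so the discriminator is called once.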
11 Claims
1. A method for providing unambiguous part-of-speech tags to text tokens in an input text, comprising the steps of:
A) obtaining a set of probabilistically annotated tags for a text token;
B) determining a locally predicted tag for the text token based on a local context of at least one tag of another text token adjacent to the text token and determining an alternative tag for the text token based on an expanded context of the text token consisting of tags, features and boundaries; and
C) choosing, utilizing a discriminator, between the locally predicted tag and the alternative tag when the locally predicted tag and the alternative tag are different.
Dependent claims: 2, 3, 4, 5
A) a stochastic algorithm for part-of-speech disambiguation based on local context;
B) a system of rules for part-of-speech disambiguation based on local context;
C) a neural network trained to disambiguate parts-of-speech based on local context;
D) a decision tree for part-of-speech disambiguation based on local context;
E) a genetic algorithm for part-of-speech disambiguation based on local context; and
F) a combination of at least two of A-E.
4. The method of claim 1 wherein the alternative tag is determined by using one of:
A) a system of rules for disambiguating parts-of-speech based on expanded context;
B) a neural network trained to disambiguate parts-of-speech based on expanded context;
C) a decision tree for part-of-speech disambiguation based on expanded context;
D) a genetic algorithm for part-of-speech disambiguation based on expanded context; and
E) a combination of at least two of A-D.
5. The method of claim 1 wherein the choice between the locally determined tag and the alternative tag is determined by using one of:
A) a system of rules to discriminate between tags based on observed characteristics of a local-context tagger and an expanded-context tagger;
B) a neural network trained to discriminate between tags based on observed characteristics of the local-context tagger and the expanded-context tagger;
C) a decision tree trained to discriminate between tags based on observed characteristics of the local-context tagger and the expanded-context tagger;
D) a genetic algorithm trained to discriminate between tags based on observed characteristics of the local-context tagger and the expanded-context tagger; and
E) a combination of at least two of A-D.
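As an illustration of option A in the list above, a rule-based discriminator might be sketched as a lookup keyed on the pair of disagreeing tags. The rule table, its entries, and the function name are hypothetical; in practice such rules would presumably be derived from the observed behavior of the two taggers on annotated text.

```python
from typing import Dict, Tuple

# Hypothetical rule table: for a (local tag, alternative tag) disagreement,
# which tagger is assumed to have been right more often in past observations.
RULES: Dict[Tuple[str, str], str] = {
    ("VERB", "NOUN"): "expanded",
    ("ADJ", "VERB"): "local",
}


def discriminate(local_tag: str, alt_tag: str, default: str = "local") -> str:
    """Choose between the two tags; unknown disagreements fall back to `default`."""
    if local_tag == alt_tag:
        return local_tag
    winner = RULES.get((local_tag, alt_tag), default)
    return alt_tag if winner == "expanded" else local_tag
```

A table like this keeps the decision inspectable; a trained neural network or decision tree (options B and C) would replace the lookup with a learned function of the same inputs.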
6. An article of manufacture/computer program/computer/speech synthesizer for disambiguating the parts-of-speech of text tokens, having a computer usable medium with a computer readable program code thereon wherein the computer readable program code implements the steps of:
A) determining a locally predicted tag for a text token based on a local context of at least one tag of another text token adjacent to the text token;
B) determining an alternative tag for the text token based on an expanded context of the text token consisting of tags, features and boundaries; and
C) choosing, using a discriminator routine, between the locally predicted tag and the alternative tag for the text token, when the locally predicted tag and the alternative tag are different.
Dependent claims: 7, 8
A) using a local-context routine, coupled to receive a sequence of probabilistically tagged text tokens, for determining the local context of the text token;
B) using a tag-context knowledge database, coupled to receive a sequence of tags in context, for determining tag-context probabilities; and
C) using a tag-context disambiguator, coupled to the local-context routine and the tag-context knowledge database, for determining the locally predicted tag for the text token based on the local context of the text token.
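One way to picture the three elements above is a bigram model: the knowledge database holds the probability of a tag given the preceding tag, and the disambiguator combines that with each candidate's own probability. This is a sketch under that assumption, not the claimed design. The table contents, the `floor` value for unseen tag pairs, and the function name are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical tag-context knowledge: P(tag | previous tag).
TAG_CONTEXT: Dict[Tuple[str, str], float] = {
    ("DET", "NOUN"): 0.7,
    ("DET", "VERB"): 0.01,
}


def locally_predicted_tag(
    prev_tag: str, tag_probs: Dict[str, float], floor: float = 1e-4
) -> str:
    """Pick the candidate maximizing P(tag | token) * P(tag | previous tag)."""

    def score(tag: str) -> float:
        # Unseen (previous tag, tag) pairs get a small floor probability.
        return tag_probs[tag] * TAG_CONTEXT.get((prev_tag, tag), floor)

    return max(tag_probs, key=score)


tag = locally_predicted_tag("DET", {"VERB": 0.6, "NOUN": 0.4})
print(tag)
```

Here the verb reading has the higher candidate probability, but the tag-context probability after a determiner outweighs it.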
8. The article of manufacture/computer program/computer/speech synthesizer of claim 6 wherein choosing between a locally predicted tag and an alternative tag for the text token includes:
A) using a tag-stream controller routine, coupled to receive a plurality of tag streams, for producing a sequence of unambiguously tagged text tokens; and
B) using a tag-discrimination knowledge database, coupled to the tag-stream controller routine, having a plurality of systems/computer readable codes/circuits for discriminating between tags provided by the plurality of tag streams.
9. An article of manufacture/application specific integrated circuit/microprocessor for disambiguating the parts-of-speech of text tokens, comprising:
A) a local-context tagger, coupled to receive a sequence of probabilistically tagged text tokens, for determining a locally predicted tag for a text token based on a local context of at least one tag of another text token adjacent to the text token;
B) an expanded-context tagger, coupled to receive the sequence of probabilistically tagged text tokens, for determining an alternative tag for the text token based on an expanded context of the text token consisting of tags, features and boundaries; and
C) a discriminator, coupled to the local-context tagger and the expanded-context tagger, for choosing between the locally predicted tag and the alternative tag for the text token, when the locally predicted tag and the alternative tag are different.
Dependent claims: 10, 11
A) a local-context processor, coupled to receive a sequence of probabilistically tagged text tokens, for determining the local context of the text token;
B) a tag-context knowledge database, coupled to receive a sequence of tags in context, for determining tag-context probabilities; and
C) a tag-context disambiguator, coupled to the local-context processor and the tag-context knowledge database, for determining the locally predicted tag for the text token based on the local context of the text token.
11. The article of manufacture/application specific integrated circuit/microprocessor of claim 9 wherein the discriminator comprises:
A) a tag-stream controller, coupled to receive a plurality of tag streams, for producing a sequence of unambiguously tagged text tokens; and
B) a tag-discrimination knowledge database, coupled to the tag-stream controller, having a plurality of systems/computer readable codes/circuits for discriminating between tags provided by the plurality of tag streams.
Specification