Language identification system and method for a peripheral unit
Abstract
A data processing system is responsive to a plurality of input languages, each language adhering to a prescribed syntax. The presence of defined portions ("For" keys) in the incoming data indicates a vote for the presence of a language, and the presence of other defined portions ("Against" keys) indicates a vote Against the presence of the language. The system performs the following method for identifying the input language: analyzing, for each expected language, the syntax of an incoming block of data to identify For and Against keys in the block of data; providing For and Against tallies for each expected language in response to the analysis, each tally being a summation of key entries, each key entry comprising an identified key count multiplied by a skew, the skew value indicating the importance of the key in the syntax and in the context of the block of data, the For tally summing entries of For keys and the Against tally summing entries of Against keys; comparing the For and Against tallies to determine whether or not they are so close as to signal uncertainty and, based upon a further syntactical characteristic of the block of data, resolving the uncertainty and indicating a value based on one of the tallies; indicating a value derived from the larger of the tallies in the event of no uncertainty between the tallies; and deciding, based upon the indicated value for each expected language, the identity of a received language.
21 Claims
1. In a data processing system responsive to a plurality of input languages, each language adhering to a prescribed syntax, the presence of defined data portions ("For" keys) in incoming data indicating a vote for the presence of a language and the presence of other defined data portions ("Against" keys) indicating a vote Against the presence of the language, a method for identifying an input language comprising the steps of:
a) analyzing, for each expected language, a syntax of an incoming block of data to identify For and Against keys in said block of data;
b) providing For and Against tallies for each expected language in response to said analysis, each said tally being a summation of key entries, each key entry comprising an identified key count multiplied by a skew, the value of said skew indicating an importance of said key in said syntax and in the context of said block of data, said For tally summing entries of For keys and said Against tally summing entries of Against keys;
c) comparing said For and Against tallies to determine whether or not they are so close as to signal uncertainty and, based upon a further syntactical characteristic of said block of data, resolving said uncertainty and indicating a value based on one of said tallies, said indication dependent upon whether said further syntactical characteristic indicates For or Against the language;
d) indicating a value derived from the larger of the tallies in the event of no uncertainty between the tallies; and
e) deciding, based upon said indicated value for each said expected language, the identity of a received language.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
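Steps a) through d) of claim 1, as applied to a single expected language, can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `Key` record, the regular-expression matching, the `margin` threshold used to detect uncertainty, the `tiebreak` callable standing in for the "further syntactical characteristic", and the sign convention (positive value for a For result, negative for an Against result). The skew is also held constant per key, whereas the claim allows it to depend on the context of the block.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass
class Key:
    """Hypothetical key definition for one expected language."""
    pattern: str   # regex that identifies the key in the data block
    skew: float    # weight expressing the key's importance
    is_for: bool   # True for a For key, False for an Against key


def tally(block: str, keys: Iterable[Key]) -> Tuple[float, float]:
    """Steps a) and b): count each key and sum count * skew per side."""
    for_tally = 0.0
    against_tally = 0.0
    for key in keys:
        count = len(re.findall(key.pattern, block))
        entry = count * key.skew
        if key.is_for:
            for_tally += entry
        else:
            against_tally += entry
    return for_tally, against_tally


def vote(block: str, keys: Iterable[Key],
         tiebreak: Callable[[str], bool], margin: float = 1.0) -> float:
    """Steps c) and d): return the indicated value for this language.

    Assumed convention: a For result is returned as +For tally and an
    Against result as -Against tally.
    """
    for_tally, against_tally = tally(block, keys)
    if abs(for_tally - against_tally) <= margin:
        # Tallies are too close: let the further characteristic decide.
        return for_tally if tiebreak(block) else -against_tally
    # No uncertainty: the value is derived from the larger tally.
    return for_tally if for_tally > against_tally else -against_tally


# Hypothetical keys for a PostScript-like language.
keys = [
    Key(r"%!", 5.0, True),
    Key(r"showpage", 2.0, True),
    Key(r"\x1b", 3.0, False),
]
block = "%!PS-Adobe\n/Helvetica findfont\nshowpage\n"
print(vote(block, keys, tiebreak=lambda b: b.startswith("%!")))
```

Returning a signed value is only one way to make the result "based on one of said tallies"; a real implementation could equally return a pair of a tally and a For/Against flag.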
13. A data processing system including software, said system responsive to a plurality of input languages, each language adhering to a prescribed syntax wherein defined data portions ("For" keys) in incoming data indicate a vote for the presence of a language and the presence of other defined data portions ("Against" keys) indicate a vote Against the presence of the language, said system including a voter means for each language and a decider means, each said voter means comprising:
syntax means for analyzing, for an expected language, a syntax of an incoming block of data to identify For and Against keys in the block of data;
tally means for providing For and Against tallies for said language in response to said analysis, each said tally being a summation of key entries, each key entry comprising an identified key count multiplied by a skew, the value of said skew indicating an importance of said key in said syntax and in the context of said block of data, said For tally summing entries of For keys and said Against tally summing entries of Against keys;
first means for comparing said For and Against tallies to determine whether or not they are so close as to signal uncertainty, and, based upon a further syntactical characteristic of said block of data, resolving said uncertainty and indicating a first value based on one of said tallies; and
means for indicating said first value or a second value to said decider means, said second value derived from the larger of the tallies in the event of no uncertainty between the tallies.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21
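The arrangement in claim 13, one voter per language reporting to a single decider, can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `decide` function, the rule that the highest positive value wins, the `default` fallback, and the two toy voters are all hypothetical. The claims say only that the decider acts on the values the voters indicate.

```python
from typing import Callable, Dict, Optional

# A voter maps a data block to its indicated value for one language.
# Assumed convention: positive means For, negative means Against.
Voter = Callable[[str], float]


def decide(block: str, voters: Dict[str, Voter],
           default: Optional[str] = None) -> Optional[str]:
    """Collect one indicated value per language and pick a winner.

    Assumed decision rule: the language with the highest positive
    value wins; if no voter reports a positive value, fall back to
    `default`.
    """
    values = {lang: voter(block) for lang, voter in voters.items()}
    if not values:
        return default
    best = max(values, key=values.get)
    return best if values[best] > 0 else default


# Toy stand-in voters; real voters would compute weighted tallies.
voters: Dict[str, Voter] = {
    "postscript": lambda b: 7.0 if b.startswith("%!") else -3.0,
    "pcl": lambda b: 4.0 if "\x1b" in b else -2.0,
}

print(decide("%!PS-Adobe", voters))
print(decide("\x1bE", voters))
print(decide("plain text", voters, default="unknown"))
```

Keeping each voter behind a plain callable means a new language can be supported by adding an entry to the dictionary, without changing the decider.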
Specification