Methods and systems for assessing the quality of automatically generated text
First Claim
1. A computer-implemented method of assessing the quality of computer-generated text, the method comprising:
receiving a plurality of characters generated from an image of a document;
determining, for the plurality of characters generated from the image of the document, language-conditional character probabilities based on a set of language models and an ordering of the characters, a language-conditional character probability for a target character in the plurality of characters describing a degree to which the target character and an ordered set of characters preceding the target character concord with a given language model in the set of language models;
identifying, for the target character, neighbor characters proximate to a location of the target character in the image of the document, wherein the neighbor characters have associated language-conditional character probabilities and are within a defined distance from the location of the target character in the image of the document;
combining the language-conditional character probabilities associated with the neighbor characters and the language-conditional character probabilities associated with the target character to generate a local language-conditional likelihood for the target character; and
storing the local language-conditional likelihood for the target character.
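The "determining" step of the claim can be illustrated with a toy example. The sketch below is not the patented implementation: it assumes each language model is a character bigram model with add-one smoothing, so the "ordered set of characters preceding the target character" is reduced to a single preceding character. The names `CharBigramModel` and `language_conditional_probs`, and the two tiny training strings, are hypothetical.

```python
from collections import Counter, defaultdict


class CharBigramModel:
    """Toy character bigram model (an illustrative stand-in for a language model)."""

    def __init__(self, training_text):
        self.alphabet = set(training_text)
        self.pair_counts = defaultdict(Counter)
        for prev, cur in zip(training_text, training_text[1:]):
            self.pair_counts[prev][cur] += 1

    def prob(self, preceding, target):
        """P(target | last preceding character), with add-one smoothing."""
        prev = preceding[-1] if preceding else ""
        counts = self.pair_counts.get(prev, Counter())
        # One extra slot is reserved for characters never seen in training.
        vocab = len(self.alphabet) + 1
        return (counts[target] + 1) / (sum(counts.values()) + vocab)


def language_conditional_probs(chars, models):
    """For each character, return {language: P(char | preceding chars, language)}."""
    return [
        {lang: model.prob(chars[:i], c) for lang, model in models.items()}
        for i, c in enumerate(chars)
    ]


# Assumed toy training data; real models would be trained on large corpora.
models = {
    "en": CharBigramModel("the quick brown fox jumps over the lazy dog the end"),
    "de": CharBigramModel("der schnelle braune fuchs springt über den faulen hund"),
}
probs = language_conditional_probs("the", models)
```

With these toy models, the "h" following "t" scores higher under the English model than under the German one, which is the kind of per-language signal the later combining step would aggregate.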
Abstract
A set of ordered characters is received in association with information specifying the locations of the characters within the image of the document. Language-conditional character probabilities for each character are determined based on a set of language models and the ordering of the characters. Neighbor characters associated with a target character are identified based on the locations of the characters. Language-conditional character probabilities associated with the neighbor characters and language-conditional character probabilities associated with the target character are combined to generate a local language-conditional likelihood associated with the target character, the local language-conditional likelihood representing a concordance of the target character to a language model.
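The neighbor identification and combination described in the abstract might be sketched as follows. The abstract does not say how the probabilities are combined or how distance is measured, so this sketch assumes Euclidean distance between character locations and a mean of log-probabilities as the combination rule. The function names and the sample values are hypothetical.

```python
import math


def find_neighbors(index, locations, max_distance):
    """Indices of other characters within max_distance of the target's location."""
    tx, ty = locations[index]
    return [
        j for j, (x, y) in enumerate(locations)
        if j != index and math.hypot(x - tx, y - ty) <= max_distance
    ]


def local_likelihood(index, char_probs, locations, max_distance):
    """Mean log-probability per language over the target and its neighbors.

    Averaging log-probabilities is an assumed combination rule.
    """
    members = [index] + find_neighbors(index, locations, max_distance)
    return {
        lang: sum(math.log(char_probs[j][lang]) for j in members) / len(members)
        for lang in char_probs[index]
    }


# Assumed sample data: (x, y) positions in the page image and per-language
# character probabilities for four characters.
locations = [(0, 0), (10, 0), (20, 0), (200, 50)]
char_probs = [
    {"en": 0.5, "de": 0.1},
    {"en": 0.4, "de": 0.2},
    {"en": 0.6, "de": 0.1},
    {"en": 0.01, "de": 0.9},
]
result = local_likelihood(1, char_probs, locations, max_distance=15)
```

Here the fourth character lies far from the target, so it does not influence the target's local likelihood even though it strongly favors a different language.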
19 Claims
1. A computer-implemented method of assessing the quality of computer-generated text, the method comprising:
receiving a plurality of characters generated from an image of a document;
determining, for the plurality of characters generated from the image of the document, language-conditional character probabilities based on a set of language models and an ordering of the characters, a language-conditional character probability for a target character in the plurality of characters describing a degree to which the target character and an ordered set of characters preceding the target character concord with a given language model in the set of language models;
identifying, for the target character, neighbor characters proximate to a location of the target character in the image of the document, wherein the neighbor characters have associated language-conditional character probabilities and are within a defined distance from the location of the target character in the image of the document;
combining the language-conditional character probabilities associated with the neighbor characters and the language-conditional character probabilities associated with the target character to generate a local language-conditional likelihood for the target character; and
storing the local language-conditional likelihood for the target character.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer-implemented method of assessing the quality of computer-generated text, the method comprising:
receiving a target character and a set of ordered characters preceding the target character;
determining at least a first language-conditional character probability for the target character based on at least a first language model and the ordering of the characters in the set;
identifying neighbor characters within a defined distance from a location of the target character in a digital text from which the target character and the set of ordered characters preceding the target character were generated;
determining at least a first language-conditional character probability for each identified neighbor character based on at least the first language model and an ordering of characters preceding a neighbor character; and
combining the language-conditional character probabilities associated with the neighbor characters and the language-conditional character probabilities associated with the target character to generate a local language-conditional likelihood for the target character, wherein the local language-conditional likelihood represents a concordance of the target character to at least the first language model; and
storing the local language-conditional likelihood for the target character.
11. A computer system for assessing the quality of computer-generated text, comprising:
a processor for executing computer program instructions;
a computer-readable storage medium storing executable computer program instructions, the computer program instructions comprising:
a language-conditional character probability module executable to:
receive a plurality of characters generated from an image of a document; and
determine, for the plurality of characters generated from the image of the document, language-conditional character probabilities based on a set of language models and an ordering of the characters, a language-conditional character probability for a target character in the plurality of characters describing a degree to which the target character and an ordered set of characters preceding the target character concord with a given language model in the set of language models; and
a local language-conditional likelihood module executable to:
identify, for the target character, neighbor characters proximate to a location of the target character in the image of the document, wherein the neighbor characters have associated language-conditional character probabilities and are within a defined distance from the location of the target character in the image of the document;
combine the language-conditional character probabilities associated with the neighbor characters and the language-conditional character probabilities associated with the target character to generate a local language-conditional likelihood for the target character; and
store the local language-conditional likelihood for the target character.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
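The two-module arrangement of claim 11 could be organized along the following lines. This is a structural sketch only, under several assumptions: language models are plain callables returning a probability, storage is an in-memory dictionary, distance is Euclidean, and the combination rule is a mean of log-probabilities. The class names, the two toy models, and the sample locations are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class CharProbabilityModule:
    """Hypothetical counterpart of the language-conditional character probability module."""
    models: dict  # language -> callable(preceding: str, target: str) -> probability

    def determine(self, chars):
        return [
            {lang: model(chars[:i], c) for lang, model in self.models.items()}
            for i, c in enumerate(chars)
        ]


@dataclass
class LocalLikelihoodModule:
    """Hypothetical counterpart of the local language-conditional likelihood module."""
    max_distance: float
    store: dict = field(default_factory=dict)  # stand-in for the storing step

    def run(self, target, probs, locations):
        tx, ty = locations[target]
        # The target is at distance 0 from itself, so it is always a member.
        members = [
            j for j, (x, y) in enumerate(locations)
            if math.hypot(x - tx, y - ty) <= self.max_distance
        ]
        # Assumed combination rule: mean log-probability over the members.
        self.store[target] = {
            lang: sum(math.log(probs[j][lang]) for j in members) / len(members)
            for lang in probs[target]
        }
        return self.store[target]


# Toy models, assumed for illustration only.
def uniform(preceding, target):
    return 1 / 27


def vowel_heavy(preceding, target):
    return 0.15 if target in "aeiou" else 0.02


prob_module = CharProbabilityModule({"uniform": uniform, "vowel_heavy": vowel_heavy})
probs = prob_module.determine("aei")
likelihood_module = LocalLikelihoodModule(max_distance=12.0)
result = likelihood_module.run(1, probs, [(0, 0), (10, 0), (20, 0)])
```

Keeping the probability computation separate from the neighborhood aggregation mirrors the claim's split into two modules and lets either part be swapped out independently.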
Specification