Generating context-based spell corrections of entity names
First Claim
1. A system comprising:
- one or more computers including one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations comprising:
receiving texts from each of a plurality of text sources, wherein each text source provides a text;
deriving a plurality of name-context pairs from the texts, wherein each name-context pair comprises an entity name included in the text from a text source and a context term included in the text from the text source, wherein each entity name is one or more terms used to refer to a respective entity and each context term is a term that appears in text associated with the entity name;
calculating a context consistency measure for each distinct name-context pair, wherein the context consistency measure for a particular name-context pair is an estimate of a probability that, if the entity name of the particular name-context pair appears in text, the context term of the particular name-context pair will also appear in the text; and
storing context-entity name data, wherein the context-entity name data is searchable data that represents one or more of the distinct name-context pairs and the context consistency measure for each of the one or more name-context pairs.
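The deriving, calculating, and storing steps of the claim can be illustrated with a minimal sketch. It assumes whitespace tokenization, a hypothetical list of known entity names (`known_names`), and that every other token in a text counts as a context term; none of these choices come from the claim. The measure is estimated as the fraction of texts containing the name that also contain the context term, and a plain dictionary stands in for the searchable store.

```python
from collections import Counter


def names_in(tokens, known_names):
    """Return the known names whose tokens occur contiguously in `tokens`."""
    found = set()
    for name in known_names:
        parts = name.lower().split()
        n = len(parts)
        if any(tokens[i:i + n] == parts for i in range(len(tokens) - n + 1)):
            found.add(name.lower())
    return found


def context_consistency(texts, known_names):
    """Estimate P(context term appears | entity name appears) per pair."""
    name_counts = Counter()  # number of texts containing each name
    pair_counts = Counter()  # number of texts containing name and term
    for text in texts:
        tokens = text.lower().split()
        for name in names_in(tokens, known_names):
            name_counts[name] += 1
            # Assumption: every token outside the name is a context term.
            for term in set(tokens) - set(name.split()):
                pair_counts[(name, term)] += 1
    # Dictionary keyed by (name, context term); a stand-in for the
    # searchable context-entity name data.
    return {pair: count / name_counts[pair[0]]
            for pair, count in pair_counts.items()}


# One text per (hypothetical) text source.
texts = [
    "acme hotel seattle downtown",
    "acme hotel seattle airport",
    "acme hotel portland",
]
measures = context_consistency(texts, ["Acme Hotel"])
print(measures[("acme hotel", "seattle")])
```

Here "seattle" appears in two of the three texts that mention the name, so its measure is higher than that of "portland", which appears in only one.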
2 Assignments
0 Petitions
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for correcting entity names. One method includes receiving texts and deriving a plurality of name-context pairs from the texts. The method further includes calculating a context consistency measure for each name-context pair and storing context-entity name data representing the name-context pairs. Another method includes identifying an entity name and one or more context terms from a query and generating candidate names for the entity name. The method further includes determining a score for each of the candidate names, selecting a number of top scoring candidate names, and using the selected candidate names to respond to the query.
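The second method in the abstract (candidate generation, scoring, and top-candidate selection) might be sketched as follows. The abstract does not say how candidates are generated or scored, so everything here is an assumption: candidates come from string similarity via Python's `difflib`, and the score multiplies that similarity by one plus the summed context consistency measures of the query's context terms. The names and measure values in `measures` are made up for illustration.

```python
import difflib


def correct_name(query_name, context_terms, measures, top_k=2):
    """Rank stored entity names as corrections for `query_name`."""
    query_name = query_name.lower()
    known = sorted({name for name, _ in measures})
    # Assumed candidate generator: names that are close in spelling.
    candidates = difflib.get_close_matches(query_name, known, n=10, cutoff=0.6)
    scored = []
    for cand in candidates:
        similarity = difflib.SequenceMatcher(None, query_name, cand).ratio()
        # Sum the stored measures for the context terms seen in the query.
        context = sum(measures.get((cand, term.lower()), 0.0)
                      for term in context_terms)
        # Assumed scoring rule: spelling similarity boosted by context.
        scored.append((similarity * (1.0 + context), cand))
    scored.sort(reverse=True)
    return [cand for _, cand in scored[:top_k]]


# Hypothetical stored data: (entity name, context term) -> measure.
measures = {
    ("marta grill", "tacos"): 0.8,
    ("marty grill", "burgers"): 0.7,
}
print(correct_name("martx grill", ["tacos"], measures))
```

Both stored names are equally close in spelling to the misspelled query name, so in this sketch the context term alone decides which one ranks first.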
42 Citations
57 Claims
1. A system comprising:
- one or more computers including one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations comprising:
receiving texts from each of a plurality of text sources, wherein each text source provides a text;
deriving a plurality of name-context pairs from the texts, wherein each name-context pair comprises an entity name included in the text from a text source and a context term included in the text from the text source, wherein each entity name is one or more terms used to refer to a respective entity and each context term is a term that appears in text associated with the entity name;
calculating a context consistency measure for each distinct name-context pair, wherein the context consistency measure for a particular name-context pair is an estimate of a probability that, if the entity name of the particular name-context pair appears in text, the context term of the particular name-context pair will also appear in the text; and
storing context-entity name data, wherein the context-entity name data is searchable data that represents one or more of the distinct name-context pairs and the context consistency measure for each of the one or more name-context pairs.
Dependent claims: 2 through 19.
20. A computer-implemented method, comprising:
receiving texts from each of a plurality of text sources, wherein each text source provides a text;
deriving a plurality of name-context pairs from the texts, wherein each name-context pair comprises an entity name included in the text from a text source and a context term included in the text from the text source, wherein each entity name is one or more terms used to refer to a respective entity and each context term is a term that appears in text associated with the entity name;
calculating a context consistency measure for each distinct name-context pair, wherein the context consistency measure for a particular name-context pair is an estimate of a probability that, if the entity name of the particular name-context pair appears in text, the context term of the particular name-context pair will also appear in the text; and
storing context-entity name data, wherein the context-entity name data is searchable data that represents one or more of the distinct name-context pairs and the context consistency measure for each of the one or more name-context pairs.
Dependent claims: 21 through 38.
39. A computer storage medium storing instructions that, when executed by data processing apparatus, cause the one or more computers to perform operations comprising:
receiving texts from each of a plurality of text sources, wherein each text source provides a text;
deriving a plurality of name-context pairs from the texts, wherein each name-context pair comprises an entity name included in the text from a text source and a context term included in the text from the text source, wherein each entity name is one or more terms used to refer to a respective entity and each context term is a term that appears in text associated with the entity name;
calculating a context consistency measure for each distinct name-context pair, wherein the context consistency measure for a particular name-context pair is an estimate of a probability that, if the entity name of the particular name-context pair appears in text, the context term of the particular name-context pair will also appear in the text; and
storing context-entity name data, wherein the context-entity name data is searchable data that represents one or more of the distinct name-context pairs and the context consistency measure for each of the one or more name-context pairs.
Dependent claims: 40 through 57.
Specification