Determining language of text fragments
First Claim
1. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving content data, the content data comprising text;
responsive to determining that the content data is insufficient to accurately determine a natural language in which the text is written by determining that a number of characters of the text is less than a threshold number;
identifying an author of the text, the author being a first user of a social networking service;
retrieving social graph data corresponding to one or more social graphs associated with the first user, the social graph data being stored in a computer-readable storage device, the social graph data including a first set of language statistics and a second set of language statistics based on posts respectively authored by a second user and a third user that connect with the first user within the social networking service;
determining aggregate statistics from the first and second sets of language statistics based on a strength of relationship between the first and second users and a strength of relationship between the first and third users; and
determining the natural language that the text is written in as one of a plurality of potential natural languages based on the aggregate statistics included in the social graph data corresponding to the one or more social graphs associated with the author.
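The aggregation and selection steps of claim 1 can be illustrated with a minimal sketch. The claim does not say how relationship strength is combined with the language statistics, so the strength-weighted average below is an assumption. The function names `aggregate_language_stats` and `pick_language`, the statistics format (language code mapped to a share of posts), and all sample values are hypothetical.

```python
from collections import defaultdict


def aggregate_language_stats(connections):
    """Combine per-connection language statistics into aggregate statistics.

    `connections` is a list of (strength, stats) pairs, where `strength` is a
    non-negative relationship strength and `stats` maps a language code to the
    share of that connection's posts written in it. Weighted averaging is an
    assumption; the claim only says the aggregate is based on strength.
    """
    totals = defaultdict(float)
    total_strength = 0.0
    for strength, stats in connections:
        total_strength += strength
        for language, share in stats.items():
            totals[language] += strength * share
    if total_strength == 0:
        return {}
    return {language: value / total_strength for language, value in totals.items()}


def pick_language(aggregate, candidates):
    """Return the candidate language with the highest aggregate score, or None."""
    scored = {lang: aggregate.get(lang, 0.0) for lang in candidates}
    if not scored or max(scored.values()) == 0.0:
        return None
    return max(scored, key=scored.get)


# Hypothetical data: the second user (strong tie) posts mostly in Spanish,
# the third user (weak tie) posts mostly in English.
second_user = (0.8, {"es": 0.9, "en": 0.1})
third_user = (0.2, {"en": 0.7, "fr": 0.3})

aggregate = aggregate_language_stats([second_user, third_user])
language = pick_language(aggregate, ["en", "es", "fr"])
```

With these sample values the strongly connected second user dominates the aggregate, so Spanish is selected even though the weaker connection posts mostly in English.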
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving content data, the content data comprising text, identifying an author of the text, the author being a user of a social networking service, retrieving data corresponding to one or more social graphs associated with the user, the data being stored in a computer-readable storage device, and determining the language of the text based on the data corresponding to the one or more social graphs associated with the author.
18 Claims
17. A non-transitory computer storage medium encoded with a computer program, the computer program comprising instructions that when executed by data processing apparatus cause the data processing apparatus to perform operations comprising:
receiving content data, the content data comprising text;
responsive to determining that the content data is insufficient to accurately determine a natural language in which the text is written by determining that a number of characters of the text is less than a threshold number;
identifying an author of the text, the author being a first user of a social networking service;
retrieving social graph data corresponding to one or more social graphs associated with the first user, the social graph data being stored in a computer-readable storage device, the social graph data including a first set of language statistics and a second set of language statistics based on posts respectively authored by a second user and a third user that connect with the first user within the social networking service;
determining aggregate statistics from the first and second sets of language statistics based on a strength of relationship between the first and second users and a strength of relationship between the first and third users; and
determining the natural language that the text is written in as one of a plurality of potential natural languages based on the aggregate statistics included in the social graph data corresponding to the one or more social graphs associated with the author.
18. A computer-implemented method comprising:
receiving content data, the content data comprising text;
responsive to determining that the content data is insufficient to accurately determine a natural language in which the text is written by determining that a number of characters of the text is less than a threshold number;
identifying an author of the text, the author being a first user of a social networking service;
retrieving social graph data corresponding to one or more social graphs associated with the first user, the social graph data being stored in a computer-readable storage device, the social graph data including a first set of language statistics and a second set of language statistics based on posts respectively authored by a second user and a third user that connect with the first user within the social networking service;
determining aggregate statistics from the first and second sets of language statistics based on a strength of relationship between the first and second users and a strength of relationship between the first and third users; and
determining the natural language that the text is written in as one of a plurality of potential natural languages based on the aggregate statistics included in the social graph data corresponding to the one or more social graphs associated with the author.
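The gating step of the method, in which the social graph is consulted only when the text has fewer characters than a threshold, might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the threshold value, the in-memory dictionaries standing in for stored social graph data, the strength-weighted sum, and the `content_detector` callback used for longer text are all hypothetical. The claim itself does not say what happens when the text is long enough.

```python
THRESHOLD_CHARS = 20  # hypothetical value; the claim only says "a threshold number"

# Hypothetical stand-in for stored social graph data:
# author -> list of (connected user, relationship strength)
SOCIAL_GRAPH = {
    "alice": [("bob", 0.9), ("carol", 0.3)],
}
# connected user -> language statistics derived from that user's posts
LANGUAGE_STATS = {
    "bob": {"de": 0.8, "en": 0.2},
    "carol": {"en": 1.0},
}


def language_from_social_graph(author, candidates):
    """Score each candidate language by a strength-weighted sum (an assumption)."""
    scores = {lang: 0.0 for lang in candidates}
    for connection, strength in SOCIAL_GRAPH.get(author, []):
        for lang, share in LANGUAGE_STATS.get(connection, {}).items():
            if lang in scores:
                scores[lang] += strength * share
    best = max(scores, key=scores.get, default=None)
    return best if best is not None and scores[best] > 0 else None


def determine_language(text, author, candidates, content_detector):
    """Use social graph statistics for short text, else a content-based detector."""
    if len(text) < THRESHOLD_CHARS:
        return language_from_social_graph(author, candidates)
    # Assumed fallback: the claim does not cover text at or above the threshold.
    return content_detector(text)


# Usage with a placeholder detector that always answers "en".
short_result = determine_language("Hallo", "alice", ["en", "de"], lambda t: "en")
long_result = determine_language("x" * 50, "alice", ["en", "de"], lambda t: "en")
```

Here the five-character post is resolved from the author's connections, while the fifty-character text is handed to the placeholder detector.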
Specification