Support for international search terms—translate as you crawl
First Claim
1. A search engine system that delivers search results via an Internet, the search engine system comprising:
at least one processor configured to implement:
a web crawler module, the web crawler module configured to:
crawl the Internet to gather a first web page having first text content in a first language;
identify the first language during crawling operations, at least in part, by processing a domain name;
a language processing service module, the language processing service module configured to, during the crawling operations, translate the first text content into both a second language in a form of second text content and a third language in a form of third text content using both a thesaurus database and a conjugate terms database;
at least one database structure, the database structure, during the crawling operations, storing indexed representations of each of the first text content, the second text content, and the third text content; and
a search processing service configured, in response to receiving search input in the second language, to identify within the at least one database structure at least a portion of the indexed representation of the second text content.
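The crawl-time steps recited above (identifying the language from a domain name, then translating with a thesaurus database and a conjugate terms database) can be illustrated with a minimal sketch. It is not the patented implementation. The table contents, the function names, the word-for-word translation strategy, and the fallback to English are all assumptions made for illustration; the claim only requires that the domain name be used "at least in part".

```python
from urllib.parse import urlparse

# Hypothetical mapping from country-code top-level domain to a language code.
TLD_LANGUAGE = {"de": "de", "fr": "fr", "es": "es", "uk": "en"}

# Hypothetical conjugate terms database: source term -> equivalents per language.
CONJUGATE_TERMS = {
    "haus": {"en": "house", "fr": "maison"},
    "rot": {"en": "red", "fr": "rouge"},
}

# Hypothetical thesaurus database: synonym -> term known to the conjugate table.
THESAURUS = {"heim": "haus"}


def language_from_domain(url, default="en"):
    """Guess the page language from the last label of the host name."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    return TLD_LANGUAGE.get(tld, default)


def translate(text, target):
    """Translate word by word; unknown words are passed through unchanged."""
    out = []
    for word in text.lower().split():
        canonical = THESAURUS.get(word, word)  # normalize synonyms first
        out.append(CONJUGATE_TERMS.get(canonical, {}).get(target, word))
    return " ".join(out)
```

In this sketch the crawler would call language_from_domain once per fetched URL and then call translate once per destination language (the second and third languages of the claim) before indexing.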
Abstract
A search engine server delivers search results to a web browser of a client device communicatively coupled to the search engine server via the Internet. The system identifies new web pages in a source language during crawling, translates them into a plurality of destination languages, creates reverse indexes in the respective languages, and stores both the reverse indexes and cached web pages in a database. Upon the entry of search strings by a user using a web browser, the search engine server responds by delivering links of web pages in the user-desired language (the language of the search string or a language chosen by the user) as well as web pages translated from a plurality of destination languages, ranked based upon popularity or other means. The search engine server contains a plurality of translators that translate new web pages, links that are obtained during crawling, into a plurality of destination languages.
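The storage and lookup described in the abstract (one reverse index per language, with results ranked by popularity) might be sketched as follows. The class name, the whitespace tokenizer, the all-terms-must-match rule, and the integer popularity score are assumptions for illustration only; the abstract does not specify them.

```python
from collections import defaultdict


class MultilingualIndex:
    """Toy per-language reverse index built while crawling (illustrative only)."""

    def __init__(self):
        # language -> term -> set of URLs whose text contains that term
        self.reverse = defaultdict(lambda: defaultdict(set))
        self.popularity = {}

    def add_page(self, url, texts_by_language, popularity=0):
        """Index the original text and every translation of one page."""
        self.popularity[url] = popularity
        for language, text in texts_by_language.items():
            for term in text.lower().split():
                self.reverse[language][term].add(url)

    def search(self, query, language):
        """Return URLs matching every query term, most popular first."""
        terms = query.lower().split()
        if not terms:
            return []
        index = self.reverse.get(language, {})
        hits = set.intersection(*(set(index.get(t, ())) for t in terms))
        return sorted(hits, key=lambda u: (-self.popularity[u], u))
```

Because the translations are indexed at crawl time, a query in the second language is answered from that language's index alone, with no translation step at query time.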
20 Claims
1. A search engine system that delivers search results via an Internet, the search engine system comprising:
at least one processor configured to implement:
a web crawler module, the web crawler module configured to:
crawl the Internet to gather a first web page having first text content in a first language;
identify the first language during crawling operations, at least in part, by processing a domain name;
a language processing service module, the language processing service module configured to, during the crawling operations, translate the first text content into both a second language in a form of second text content and a third language in a form of third text content using both a thesaurus database and a conjugate terms database;
at least one database structure, the database structure, during the crawling operations, storing indexed representations of each of the first text content, the second text content, and the third text content; and
a search processing service configured, in response to receiving search input in the second language, to identify within the at least one database structure at least a portion of the indexed representation of the second text content.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method performed by a search engine system that delivers search results via an Internet, the method comprising:
gathering by a web crawler via the Internet a first web page having first text content in a first language;
identifying the first language during crawling operations, at least in part, using a conjugate international language terms database that comprises terms in the first language and their conjugates in at least the second language;
translating the first text content into both a second language in a form of second text content and a third language in a form of third text content during the crawling operations using both a thesaurus database and a conjugate terms database;
storing, during the crawling operations, in at least one database structure indexed representations of each of the first text content, the second text content, and the third text content; and
in response to receiving search input in the second language, identifying within the at least one database structure at least a portion of the indexed representation of the second text content.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
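Claim 11 identifies the page language using a conjugate international language terms database instead of a domain name. One possible reading is sketched below: each database entry lists one concept in several languages, and the language whose terms appear most often in the page text is chosen. The entries, the function name, the scoring rule, and the alphabetical tie-break are assumptions for illustration, not details taken from the claim.

```python
# Hypothetical conjugate terms database: each entry is one concept,
# keyed by language code.
CONJUGATES = [
    {"en": "house", "de": "haus", "fr": "maison"},
    {"en": "red", "de": "rot", "fr": "rouge"},
    {"en": "and", "de": "und", "fr": "et"},
]


def identify_language(text, conjugates=CONJUGATES):
    """Return the language with the most matching terms, or None if no match."""
    words = set(text.lower().split())
    scores = {}
    for entry in conjugates:
        for language, term in entry.items():
            if term in words:
                scores[language] = scores.get(language, 0) + 1
    if not scores:
        return None
    # Sorting first makes ties resolve alphabetically, so the result is stable.
    return max(sorted(scores), key=scores.get)
```

A real database would hold far more terms per language; with only a handful of entries, short pages may match nothing, which is why the sketch returns None instead of guessing.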
Specification