THEMATIC WEB CORPUS
Abstract
The invention notably relates to a computer-implemented method, performed by a server storing an index of a search engine, for sending, to a client, the URLs of pages of a Web corpus that relates to a theme. The method comprises receiving, from the client, a structured query that corresponds to the theme, the structured query consisting of a disjunction of at least one keyword; determining in the index the group that consists of the URLs of all pages that match the query; and sending to the client the URLs of the group as a stream.
Such a method improves the building of a thematic Web corpus.
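The server-side steps summarized above (receive a disjunction of keywords, read each keyword on the index, combine the retrieved URL sets, send the result as a stream) can be illustrated with a minimal sketch. It is not the patented implementation. The toy `INDEX`, the function names and the use of a Python generator to stand in for "a stream" are all assumptions made for illustration. The set operation shown is a plain union, which is one natural reading of "a scheme of set operations that corresponds to the disjunction".

```python
from typing import Dict, Iterable, Iterator, List, Set

# Hypothetical toy inverted index: keyword -> set of URLs of pages that contain it.
INDEX: Dict[str, Set[str]] = {
    "aircraft": {"http://example.org/a", "http://example.org/b"},
    "airplane": {"http://example.org/b", "http://example.org/c"},
    "boat": {"http://example.org/d"},
}


def read_keywords(index: Dict[str, Set[str]], keywords: Iterable[str]) -> List[Set[str]]:
    """Read each keyword of the disjunction on the index: one URL set per keyword."""
    return [index.get(keyword, set()) for keyword in keywords]


def evaluate_disjunction(url_sets: Iterable[Set[str]]) -> Set[str]:
    """Apply the set operation matching a disjunction (assumed here: union)."""
    group: Set[str] = set()
    for url_set in url_sets:
        group |= url_set
    return group


def stream_urls(index: Dict[str, Set[str]], keywords: List[str]) -> Iterator[str]:
    """Yield the URLs of the matching group one at a time."""
    group = evaluate_disjunction(read_keywords(index, keywords))
    # Sorting is only there to make the illustration deterministic.
    for url in sorted(group):
        yield url


if __name__ == "__main__":
    for url in stream_urls(INDEX, ["aircraft", "airplane"]):
        print(url)
```

In this sketch no ranking or truncation is applied: every URL whose page matches at least one keyword is yielded, which mirrors the abstract's "URLs of all pages that match the query".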
17 Claims
1. Computer-implemented method for building a Web corpus that relates to a theme, the method comprising, by a server storing an index of a search engine, sending, to a client, the URLs of pages of a Web corpus that relates to the theme, including:
- receiving, from the client, a structured query that corresponds to the theme, the structured query consisting of a disjunction of at least one keyword;
- determining in the index the group that consists of the URLs of all pages that match the query, wherein the determining consists in:
  - reading the keywords of the disjunction of the query on the index, thereby retrieving at least one set of URLs from the index, then
  - performing on the retrieved at least one set of URLs a scheme of set operations that corresponds to the disjunction of the query, thereby leading to the group of URLs; and
- sending to the client the URLs of the group as a stream.
Dependent claims: 2, 3, 4, 5, 6.
7. Computer-implemented method, performed by a client, for building a Web corpus that relates to a theme, wherein the method comprises:
- sending, to a server, a structured query that corresponds to the theme, the structured query consisting of a disjunction of at least one keyword; then
- receiving from the server the URLs of pages of the Web corpus as a stream.
Dependent claims: 8, 9.
10. A non-transitory computer-readable medium having recorded thereon a computer program comprising instructions for performing a computer-implemented method for building a Web corpus that relates to a theme, the method comprising, by a server storing an index of a search engine, sending, to a client, the URLs of pages of a Web corpus that relates to the theme, including:
- receiving, from the client, a structured query that corresponds to the theme, the structured query consisting of a disjunction of at least one keyword;
- determining in the index the group that consists of the URLs of all pages that match the query, wherein the determining consists in:
  - reading the keywords of the disjunction of the query on the index, thereby retrieving at least one set of URLs from the index, then
  - performing on the retrieved at least one set of URLs a scheme of set operations that corresponds to the disjunction of the query, thereby leading to the group of URLs; and
- sending to the client the URLs of the group as a stream.
11. A non-transitory computer-readable medium having recorded thereon a computer program comprising instructions for performing a computer-implemented method for building a Web corpus that relates to a theme, the method comprising, by a client:
- sending, to a server, a structured query that corresponds to the theme, the structured query consisting of a disjunction of at least one keyword; then
- receiving from the server the URLs of pages of the Web corpus as a stream.
12. A server system comprising a processor coupled to a memory having recorded thereon a search engine index and a computer program for performing a computer-implemented method for building a Web corpus that relates to a theme, the method comprising sending, to a client, the URLs of pages of a Web corpus that relates to the theme, including:
- receiving, from the client, a structured query that corresponds to the theme, the structured query consisting of a disjunction of at least one keyword;
- determining in the index the group that consists of the URLs of all pages that match the query, wherein the determining consists in:
  - reading the keywords of the disjunction of the query on the index, thereby retrieving at least one set of URLs from the index, then
  - performing on the retrieved at least one set of URLs a scheme of set operations that corresponds to the disjunction of the query, thereby leading to the group of URLs; and
- sending to the client the URLs of the group as a stream.
Dependent claims: 13, 14.
15. A client system comprising a processor coupled to a memory having recorded thereon a computer program comprising instructions for performing a computer-implemented method for building a Web corpus that relates to a theme, the method comprising:
- sending, to a server, a structured query that corresponds to the theme, the structured query consisting of a disjunction of at least one keyword; then
- receiving from the server the URLs of pages of the Web corpus as a stream.
Dependent claims: 16, 17.
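The client-side claims (7, 11 and 15) describe two steps: send a structured query made of a disjunction of keywords, then receive the URLs as a stream. The following is a minimal sketch of that exchange, not a definitive implementation. The JSON shape `{"or": [...]}`, the `send` callable that stands in for the network transport, and the `fake_server` used to exercise it are all hypothetical. The claims do not specify a wire format or a protocol.

```python
import json
from typing import Callable, Iterable, Iterator, List


def make_query(keywords: List[str]) -> str:
    """Encode a disjunction of at least one keyword (assumed JSON shape)."""
    if not keywords:
        raise ValueError("a disjunction needs at least one keyword")
    return json.dumps({"or": keywords})


def receive_corpus(
    send: Callable[[str], Iterable[str]], keywords: List[str]
) -> Iterator[str]:
    """Send the query, then yield each URL as it arrives from the server."""
    query = make_query(keywords)
    for line in send(query):
        url = line.strip()
        if url:  # skip empty lines in the stream
            yield url


def fake_server(query: str) -> Iterator[str]:
    """Hypothetical stand-in for a server: emits one made-up URL per keyword."""
    keywords = json.loads(query)["or"]
    for position, keyword in enumerate(keywords):
        yield f"http://example.org/{keyword}/{position}\n"


if __name__ == "__main__":
    for url in receive_corpus(fake_server, ["aircraft", "airplane"]):
        print(url)
```

Because `receive_corpus` is a generator, the client in this sketch can start processing URLs before the whole group has been delivered, which is the practical point of receiving the result as a stream.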
Specification