Data extraction system, terminal, server, programs, and media for extracting data via a morphological analysis
First Claim
1. A data extraction system for extracting and accumulating prescribed data from web pages on the web, the data extraction system comprising:
a plurality of terminals; and
a server connected to the plurality of terminals, wherein the server comprises;
a receiver for receiving the prescribed data, the prescribed data being extracted by at least one of the plurality of terminals and being a phrase having at least one part of speech of a morpheme;
a part-of-speech accumulator for accumulating the at least one part of speech of the morpheme;
a data accumulator for accumulating the prescribed data extracted by the at least one of the plurality of terminals and received by the receiver with extracted data; and
a verifier for verifying whether the prescribed data extracted by the at least one of the plurality of terminals and received by the receiver is already accumulated with the extracted data by the data accumulator, the data accumulator accumulating the prescribed data with the extracted data when the prescribed data is determined by the verifier to not be already accumulated with the extracted data, and
wherein each terminal of the plurality of terminals comprises;
a searcher for searching for one of the web pages on the web;
a morphological analyzer for performing a morphological analysis on text data in the one of the web pages searched for by the searcher, the morphological analyzer receiving the at least one part of speech of the morpheme accumulated by the part-of-speech accumulator from the server in advance;
an extractor for extracting, as the prescribed data and from the text data in the one of the web pages on which the morphological analyzer performed the morphological analysis, the phrase that has the at least one part of speech of the morpheme that is received from the server in advance;
a sender for sending the prescribed data extracted by the extractor to the server; and
an interface for receiving, from the server, the prescribed data only when the prescribed data is determined by the verifier to not be already accumulated with the extracted data by the data accumulator and after the accumulator accumulates the prescribed data with the extracted data and not when the prescribed data is determined by the verifier to be already accumulated with the extracted data; and
a display for displaying the prescribed data on a display screen via the interface when the prescribed data is received from the server.
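The server-side elements recited above (part-of-speech accumulator, data accumulator, verifier) can be illustrated with a minimal sketch. The class name `Server`, its method names, and the `"NOUN"` tag are hypothetical and chosen only for illustration; the claim does not prescribe any data structure or tag set. The sketch assumes that "already accumulated" means an exact string match.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Server:
    # Part-of-speech accumulator: tags that terminals receive in advance.
    # The "NOUN" default is an assumption made for this sketch.
    target_pos: set[str] = field(default_factory=lambda: {"NOUN"})
    # Data accumulator: phrases stored so far (the "extracted data").
    extracted_data: list[str] = field(default_factory=list)

    def parts_of_speech(self) -> frozenset[str]:
        """Return the accumulated parts of speech for distribution to terminals."""
        return frozenset(self.target_pos)

    def is_accumulated(self, phrase: str) -> bool:
        """Verifier: is the received phrase already in the extracted data?"""
        return phrase in self.extracted_data

    def receive(self, phrase: str) -> Optional[str]:
        """Accumulate the phrase only if new, then return it for display.

        Returns None when the phrase was already accumulated, so the
        terminal has nothing to display in that case.
        """
        if self.is_accumulated(phrase):
            return None
        self.extracted_data.append(phrase)
        return phrase
```

In this sketch the phrase is returned to the terminal only after it has been appended, which mirrors the ordering in the claim: verification, then accumulation, then transmission.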
Abstract
This invention provides a terminal that searches for web pages on the web and extracts prescribed data from them, and a server that verifies and accumulates the extracted data. The processing required for data extraction is thereby distributed between the terminal and the server, which lessens the burden placed on each apparatus. Further, new data not previously found can be discovered and extracted from web pages that have been updated or newly created.
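The division of labor described in the abstract can be sketched as several terminals working in parallel while the server performs only a cheap membership check. This is an illustrative sketch, not the patented implementation: the names `Server`, `run_terminal`, and `run_all` are hypothetical, threads stand in for separate terminal machines, and each terminal's extraction result is supplied as a ready-made list of phrases.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Server:
    """Verifies and accumulates phrases; safe to call from many terminals."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._accumulated: set[str] = set()

    def submit(self, phrase: str) -> bool:
        """Return True only if the phrase was new and has now been stored."""
        with self._lock:  # verification and accumulation happen atomically
            if phrase in self._accumulated:
                return False
            self._accumulated.add(phrase)
            return True


def run_terminal(server: Server, extracted: list[str]) -> list[str]:
    """One terminal: send each extracted phrase, keep those the server accepts."""
    return [phrase for phrase in extracted if server.submit(phrase)]


def run_all(per_terminal: list[list[str]]) -> list[list[str]]:
    """Run every terminal concurrently against a single shared server."""
    server = Server()
    with ThreadPoolExecutor(max_workers=len(per_terminal)) as pool:
        return list(pool.map(lambda phrases: run_terminal(server, phrases), per_terminal))
```

Which terminal ends up displaying a phrase found by several of them depends on timing, but each distinct phrase is accepted exactly once, so the heavy work (searching and analysis) stays on the terminals while the server only deduplicates.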
39 Citations
17 Claims
14. A terminal apparatus connected to a server and used by a data extraction system for extracting prescribed data from web pages on the web, the terminal apparatus controlled by a processor and comprising:
a searcher, controlled by the processor, for searching for one of the web pages on the web;
a morphological analyzer for performing a morphological analysis on text data in the one of the web pages searched for by the searcher, the morphological analyzer receiving at least one part of speech of a morpheme in advance;
an extractor, controlled by the processor, for extracting as the prescribed data and from the text data in the one of the web pages on which the morphological analysis is performed, a phrase that has the at least one part of speech of the morpheme that is received in advance;
a data sender, controlled by the processor, for sending the prescribed data extracted by the extractor to the server;
a data receiver, controlled by the processor, for receiving, from the server, upon a verification of whether the prescribed data sent by the data sender is already accumulated with extracted data by a data accumulator of the server, the data accumulator accumulating the prescribed data with the extracted data when the prescribed data is determined to not be already accumulated with the extracted data, the prescribed data only when the prescribed data is determined to not be already accumulated with the extracted data by the data accumulator and after the data accumulator accumulates the prescribed data with the extracted data and not when the prescribed data is determined to be already accumulated with the extracted data; and
a display, controlled by the processor, for displaying the prescribed data on a display screen when the prescribed data is received by the data receiver.
15. A non-transitory computer-readable medium embodying a program for a terminal apparatus connected to a server and used by a data extraction system for extracting prescribed data from web pages on the web, the program comprising:
a search process for searching for one of the web pages on the web;
a morphological analysis process for performing a morphological analysis on text data in the one of the web pages searched for by the search process, the morphological analysis process receiving at least one part of speech of a morpheme in advance;
an extraction process for extracting, as the prescribed data and from the text data in the one of the web pages on which the morphological analysis is performed, a phrase that has the at least one part of speech of the morpheme that is received in advance;
a data sending process for sending the prescribed data extracted by the extraction process to the server;
a data reception process for receiving, from the server, upon a verification of whether the prescribed data sent by the data sending process is already accumulated with extracted data by a data accumulation process of the server, the data accumulation process accumulating the prescribed data with the extracted data when the prescribed data is determined to not be already accumulated with the extracted data, the prescribed data only when the prescribed data is determined to not be already accumulated with the extracted data by the data accumulation process and after the data accumulation process accumulates the prescribed data with the extracted data and not when the prescribed data is determined to be already accumulated with the extracted data; and
a display process for displaying the prescribed data on a display screen when the prescribed data is received by the data reception process.
16. A server apparatus used by a data extraction system for extracting and accumulating prescribed data from web pages on the web, the server apparatus connected to a plurality of terminals that search for one of the web pages on the web and extract the prescribed data from the one of the web pages, the server apparatus controlled by a processor and comprising:
a data receiver, controlled by the processor, for receiving the prescribed data, the prescribed data being extracted by at least one of the plurality of terminals and being a phrase having at least one part of speech of a morpheme;
a part-of-speech accumulator for accumulating the at least one part of speech of the morpheme;
a data accumulator, controlled by the processor, for accumulating the prescribed data received by the data receiver with extracted data;
a verifier, controlled by the processor, for verifying whether the prescribed data received by the data receiver is already accumulated with the extracted data by the data accumulator, the data accumulator accumulating the prescribed data with the extracted data when the prescribed data is determined by the verifier to not be already accumulated with the extracted data; and
a data transmitter, controlled by the processor, for sending the prescribed data to at least one of the plurality of terminals only when the prescribed data is determined by the verifier to not be accumulated with the extracted data by the data accumulator and after the data accumulator accumulates the prescribed data with the extracted data and not when the prescribed data is determined by the verifier to be already accumulated with the extracted data, so that the at least one of the plurality of terminals displays the prescribed data.
17. A non-transitory computer-readable medium embodying a program for a server apparatus used by a data extraction system for extracting and accumulating prescribed data from web pages on the web, the server apparatus connected to a plurality of terminals that search for one of the web pages on the web and extract the prescribed data from the one of the web pages, the program comprising:
a data reception process for receiving the prescribed data, the prescribed data being extracted by at least one of the plurality of terminals and being a phrase having at least one part of speech of a morpheme;
a part-of-speech accumulation process for accumulating the at least one part of speech of the morpheme;
a data accumulation process for accumulating the prescribed data received by the data reception process with extracted data;
a verification process for verifying whether the prescribed data received by the data reception process is already accumulated with the extracted data by the data accumulation process, the data accumulation process accumulating the prescribed data with the extracted data when the prescribed data is determined by the verification process to not be already accumulated with the extracted data; and
a data sending process for sending the prescribed data to at least one of the plurality of terminals only when the prescribed data is determined by the verification process to not be already accumulated with the extracted data by the data accumulation process and after the data accumulation process accumulates the prescribed data with the extracted data and not when the prescribed data is determined by the verifier to be already accumulated with the extracted data, so that the at least one of the plurality of terminals outputs the prescribed data.
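The terminal-side extraction recited in claims 14 and 15 can be illustrated as follows. This is a rough sketch under stated assumptions: `TOY_LEXICON` is a hypothetical hand-written table standing in for a real morphological analyzer, the tag names are arbitrary, and treating a "phrase" as a maximal run of adjacent morphemes with a target part of speech is one possible reading of the claim language, not the only one.

```python
import re

# Hypothetical toy lexicon; a real terminal would use a morphological analyzer.
TOY_LEXICON = {
    "the": "DET", "new": "ADJ", "tea": "NOUN", "shop": "NOUN",
    "opened": "VERB", "in": "ADP", "kyoto": "PROPN",
}


def analyze(text: str) -> list[tuple[str, str]]:
    """Split text into (surface form, part of speech) pairs.

    Words missing from the toy lexicon are tagged "X" (unknown).
    """
    tokens = re.findall(r"[A-Za-z]+", text)
    return [(tok, TOY_LEXICON.get(tok.lower(), "X")) for tok in tokens]


def extract_phrases(morphemes: list[tuple[str, str]], target_pos: set[str]) -> list[str]:
    """Collect maximal runs of adjacent morphemes whose tag is in target_pos."""
    phrases: list[str] = []
    run: list[str] = []
    for surface, pos in morphemes:
        if pos in target_pos:
            run.append(surface)
        elif run:
            phrases.append(" ".join(run))
            run = []
    if run:  # flush a run that reaches the end of the text
        phrases.append(" ".join(run))
    return phrases


# target_pos plays the role of the parts of speech received from the server in advance.
page_text = "The new tea shop opened in Kyoto."
candidates = extract_phrases(analyze(page_text), {"NOUN", "PROPN"})
```

Here `candidates` holds the phrases the terminal would send to the server for verification; only those the server reports as newly accumulated would then be displayed.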
Specification