Method and device for recording and searching for a document on a communication network
Abstract
A system for recording information relating to a document, which is accessible via a computer communication network, operates by extracting key words associated with the document, associating a binary code with each extracted key word to form associations, storing the associations in a dictionary, and storing an electronic address (URL) of the document and the binary codes in association with each other in an information storage unit of a user.
52 Claims
-
1. A method of recording information relating to a document accessed by a user (16) of a computer communication network (4), comprising the following steps:
-
condensing (E8) the accessed document to obtain key words associated with the accessed document;
associating (E9) a binary code with each obtained key word to form associations;
storing (E46) the associations in a dictionary (10); and
storing (E16) an electronic address (URL) of the accessed document and the binary codes in association with each other in information history storage means (2) of the user (16).
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 31, 32, 35, 36, 39, 40, 43, 44, 45, 46, 47
checking (E43) as to whether or not an obtained key word exists in the dictionary (10); and
if the obtained key word does not exist, creating (E45) a new binary code and associating the new binary code with the obtained key word; or
if the obtained key word exists, reading (E44) a binary code associated with the obtained key word in the dictionary (10).
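The check, create, and read substeps (E43, E45, E44) can be sketched as below. This is only an illustration: the class name `CodeDictionary` and the use of sequential integers as binary codes are assumptions, not details taken from the claims.

```python
class CodeDictionary:
    """Hypothetical stand-in for the dictionary (10): key word -> code."""

    def __init__(self):
        self.codes = {}  # key word -> integer code

    def code_for(self, keyword):
        # E43: check whether the key word already exists in the dictionary.
        if keyword in self.codes:
            # E44: read the code already associated with the key word.
            return self.codes[keyword]
        # E45: create a new code (assumption: the next unused integer)
        # and associate it with the key word.
        code = len(self.codes)
        self.codes[keyword] = code
        return code
```

Calling `code_for` twice with the same key word returns the same code, so a key word is never given two codes.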
-
-
3. A method according to one of claims 1 or 2, wherein the binary codes of the dictionary (10) are fixed-length codes.
-
4. A method according to one of claims 1 or 2, wherein the binary codes are variable-length codes.
-
5. A method according to claim 2, wherein the binary codes have a length of M bits, determined according to a current maximum number 2^M of associations stored in the dictionary, such that a size of the dictionary is changeable, and wherein in the substep of creating (E45) a new binary code, if a number of associations stored in the dictionary (10) increases to a number greater than 2^M, the binary codes of the dictionary (10) are reconstructed on binary codes of length M+1 bits.
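A minimal sketch of the M-bit growth rule follows. The class name `GrowingDictionary`, the starting width, and the rendering of codes as bit strings are assumptions made for illustration only.

```python
class GrowingDictionary:
    """Hypothetical dictionary whose codes are M bits wide."""

    def __init__(self, m=1):
        self.m = m        # current code length M, in bits
        self.index = {}   # key word -> integer rank

    def add(self, keyword):
        if keyword not in self.index:
            self.index[keyword] = len(self.index)
            # More than 2**M associations: move to codes of M+1 bits.
            if len(self.index) > 2 ** self.m:
                self.m += 1
        return self.code(keyword)

    def code(self, keyword):
        # Codes are rendered at the current width, so a width change
        # re-expresses every stored association on M+1 bits.
        return format(self.index[keyword], f"0{self.m}b")
```

With `m=1`, the first two key words receive the codes `0` and `1`; adding a third raises M to 2, after which the first key word reads back as `00`.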
6. A method according to one of claims 1 or 2, wherein the associations of key words and binary codes stored in the dictionary (10) are compressed by an entropic coding method.
-
7. A method according to one of claims 1 or 2, wherein the information history storage means (2) is part of a history file (2) associated with a browser (1) of the user (16).
-
8. A method according to one of claims 1 or 2, further comprising the step of storing (E17), in the information history storage means (2), an authentication signature (CRC2) in association with the electronic address of the accessed document, if the electronic address of the accessed document is not already stored in the information history storage means.
-
9. A method according to claim 8, further comprising the following steps:
-
checking (E3) for an existence or not of the electronic address of the accessed document in the information history storage means (2) of the user (16);
calculating (E2) an authentication signature (CRC1) associated with the accessed document;
comparing (E5) the calculated authentication signature (CRC1) and the authentication signature (CRC2) stored in the information history storage means (2); and
reiterating the steps of condensing (E8) the accessed document to obtain key words, associating (E9) a binary code, storing (E46) the associations, storing (E12) updated binary codes, and storing (E13) the calculated authentication signature (CRC1) in the information history storage means (2) of the user (16) as the updated authentication signature, when the calculated and stored authentication signatures are different.
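The signature comparison of claims 8 and 9 can be illustrated as follows. CRC-32 from Python's `zlib` module is an assumption here: the claims name the signatures CRC1 and CRC2 but do not fix an algorithm. The function name and the `history` mapping are likewise hypothetical.

```python
import zlib

def needs_recording(history, url, content):
    """Return True when the document must be condensed (again).

    `history` is a hypothetical mapping: electronic address -> stored CRC2.
    """
    crc1 = zlib.crc32(content)   # E2: signature of the accessed document
    crc2 = history.get(url)      # E3: is the address already stored?
    if crc2 is not None and crc2 == crc1:   # E5: compare signatures
        return False             # unchanged document, nothing to redo
    history[url] = crc1          # E17 (new address) or E13 (updated signature)
    return True
```

A `True` result is the point at which the condensing, associating, and storing steps would be run or repeated.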
-
-
10. A method according to one of claims 1 or 2, wherein said step of extracting (8) key words comprises the following substeps:
-
determining (E31) a format of the accessed document (8);
eliminating (E32), in the accessed document (8), one or more commands from a list of commands (33) to be eliminated for a given format;
determining (E34) a language of the accessed document;
eliminating (E35), in the accessed document, a series of common words from a list of common words (36) to be eliminated for a given language;
eliminating (E37), in the accessed document, a series of terminations from a list of terminations (38) to be eliminated for a given language;
making uniform (E39) a format of writing the words of the accessed document; and
eliminating (E40) duplicates of words in the accessed document.
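The condensing substeps can be sketched as below, under strong simplifying assumptions: the document is taken to be HTML and its language English, so format and language detection (E31, E34) are not implemented, and the short word and termination lists are hypothetical stand-ins for the lists (36) and (38).

```python
import re

COMMON_WORDS = {"the", "a", "of", "and", "is", "are", "on", "in"}  # stand-in for list (36)
TERMINATIONS = ("ing", "ed", "s")                                  # stand-in for list (38)

def condense(document):
    # E32: eliminate format commands (here, HTML tags).
    text = re.sub(r"<[^>]+>", " ", document)
    # E39: make the writing uniform (lower case); done early in this
    # sketch so that the list lookups below match.
    words = re.findall(r"[a-z]+", text.lower())
    keywords = []
    for word in words:
        if word in COMMON_WORDS:          # E35: eliminate common words
            continue
        for ending in TERMINATIONS:       # E37: eliminate terminations
            if word.endswith(ending) and len(word) > len(ending) + 2:
                word = word[: -len(ending)]
                break
        if word not in keywords:          # E40: eliminate duplicates
            keywords.append(word)
    return keywords
```

For example, `condense("<p>The Servers are recording documents</p>")` yields `['server', 'record', 'document']`.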
-
-
11. A method according to one of claims 1 or 2, further comprising the step of indexing the electronic address of the accessed document by means of the binary code or codes in the information history storage means (2′) of the user (16).
-
12. A search method of searching for a document on a computer communication network (4) from information recorded by a recording method according to claim 1, wherein said search method comprises the following steps:
-
supplying (E50), by a user, a search criterion that includes at least one key word;
reading (E52), from the dictionary (10), a respective binary code associated with each key word supplied by the user;
extracting (E53), from the information history storage means (2), an electronic address (URL) of an accessed document associated with each read binary code; and
downloading (E55, E56, E57) the accessed document or documents.
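The reading and extracting steps of the search method can be illustrated as follows. The data layout is an assumption: `dictionary` maps key words to codes and `history` maps each electronic address to the set of codes stored with it. The download step (E55 to E57) is left out, so the function returns addresses only.

```python
def search(keywords, dictionary, history):
    # E52: read the code associated with each supplied key word
    # (key words absent from the dictionary are skipped).
    codes = {dictionary[kw] for kw in keywords if kw in dictionary}
    # E53: extract every address stored with at least one read code.
    return sorted(url for url, stored in history.items() if stored & codes)
```

Because the lookup compares compact codes rather than words, the history only needs to hold codes next to each address.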
-
-
13. A search method according to claim 12, wherein the search criterion includes more than one key word, wherein the step of reading (E52) includes reading, from the dictionary (10), binary codes associated with the key words, and further comprising the step of filtering (E54) the extracted electronic address or addresses, wherein the step of filtering includes the following substeps:
comparing a number of the read binary code or codes associated with the extracted electronic address or addresses with a threshold value (T); and
eliminating an electronic address or addresses associated with a number of the read binary code or codes lower than the threshold value T.
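The threshold filtering (E54) can be sketched as below, assuming the same hypothetical layout in which `history` maps each address to its set of stored codes and `query_codes` holds the codes read for the search criterion.

```python
def filter_addresses(query_codes, history, threshold):
    kept = []
    for url, stored in history.items():
        # Number of read codes associated with this address.
        matches = len(stored & query_codes)
        # Keep only extracted addresses (at least one match) whose
        # count is not lower than the threshold T.
        if matches and matches >= threshold:
            kept.append(url)
    return kept
```

Raising the threshold toward the number of key words in the criterion moves the search from "any key word" toward "all key words".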
-
-
14. A search method according to one of claims 12 or 13, wherein the recording method comprises the steps of:
-
storing (E13, E17) an authentication signature (CRC2) associated with each document, and updating (E58, E59, E60) the information history storage means (2), wherein the step of updating comprises the following substeps:
eliminating (E63), in the information history storage means, a document or documents that no longer exist at an associated electronic address or addresses;
calculating (E2) the authentication signature (CRC1) of the downloaded document or documents;
comparing (E5) the calculated authentication signature (CRC1) and the authentication signature (CRC2) stored in the information history storage means (2); and
reiterating the steps of condensing (E8) the accessed document to obtain keywords, associating (E9) a binary code, storing (E46) the associations, storing (E12) updated binary codes, and storing (E13) the calculated authentication signature (CRC1) of the recording method as an updated authentication signature, when the calculated and stored authentication signatures are different.
-
-
15. A search method according to one of claims 12 or 13, wherein the search criterion comprises a regular expression of the key word or key words.
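One possible reading of a regular-expression search criterion is to match the expression against the key words held in the dictionary and collect their codes. The sketch below assumes that reading; the function name and the use of a full match are illustrative choices, not taken from the claims.

```python
import re

def codes_matching(pattern, dictionary):
    """Return the codes of all dictionary key words matching `pattern`."""
    regex = re.compile(pattern)
    return {code for kw, code in dictionary.items() if regex.fullmatch(kw)}
```

The resulting set of codes can then be used to extract addresses in the same way as codes read for literal key words.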
-
31. A computer adapted to implement a recording method according to one of claims 1 or 2.
-
32. A computer adapted to implement a search method according to one of claims 12 or 13.
-
35. A computer server adapted to implement a recording method according to one of claims 1 or 2.
-
36. A computer server adapted to implement a search method according to one of claims 12 or 13.
-
39. A computer communication network adapted to implement a recording method according to one of claims 1 or 2.
-
40. A computer communication network adapted to implement a search method according to one of claims 12 or 13.
-
43. A computer communication network comprising a plurality of computer servers according to claim 35.
-
44. A computer communication network in accordance with claim 43, wherein the communication network is a wide area network.
-
45. An information storage medium, partially or totally removable, readable by a computer and storing a computer program for implementing a recording method according to one of claims 1 or 2.
-
46. An information storage medium, partially or totally removable, readable by a computer and storing a computer program for implementing a search method according to one of claims 12 or 13.
-
47. A computer communication network comprising a plurality of computer servers according to claim 36.
-
16. A device for recording information relating to a document accessed by a user (16) of a computer communication network (4), said device comprising:
-
means (7) for extracting key words associated with the accessed document, which is accessed via the computer communication network;
means (11) for associating a binary code with each extracted key word to form associations;
a dictionary (10) for storing the associations; and
information history storage means (2) for storing an electronic address (URL) of the accessed document and the binary codes in association with each other.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 33, 34, 37, 38, 41, 42, 48, 49
wherein the binary codes have a length of M bits, determined according to a current maximum number 2^M of associations stored in the dictionary, such that a size of the dictionary is changeable, and wherein said means of associating (11) a binary code comprises means for reconstructing (15) the binary codes of the dictionary on binary codes of length M+1 bits, when a number of associations stored in the dictionary increases to a number greater than 2^M.
21. A device according to one of claims 16 or 17, wherein the associations of key words and binary codes stored in the dictionary (10) are compressed by an entropic coding method.
-
22. A device according to one of claims 16 or 17, wherein the information history storage means (2) is part of a history file (2) associated with a browser (1) of the user (16).
-
23. A device according to one of claims 16 or 17, wherein the information history storage means (2) stores an authentication signature (CRC2) associated with the accessed document.
-
24. A device according to claim 23, further comprising:
-
means (36) for checking for an existence or not of the electronic address of the accessed document in the information history storage means (2) of the user (16);
means (37) for calculating an authentication signature (CRC1) associated with the accessed document;
means (38) for comparing the calculated authentication signature (CRC1) and the authentication signature (CRC2) stored in the information history storage means (2).
-
-
25. A device according to one of claims 16 or 17, wherein the information history storage means (2) comprises means for indexing (34) the electronic address of the accessed document by means of the binary code or codes.
-
26. A device according to one of claims 16 or 17, wherein the device is incorporated in a microprocessor (500), which includes a read only memory (501) storing a program for recording information and a random access memory (502) with registers adapted to record variables modified during running of the program.
-
27. A search device for searching, by a user, for a document on a computer communication network using information recorded by a recording device according to claim 16, wherein the search device comprises:
-
means (17) for supplying a search criterion, provided by a user, that includes at least one key word;
means (14) for reading, from the dictionary (10), a respective binary code associated with each key word provided by the user;
means (3) for extracting, from the information history storage means (2), an electronic address of the accessed document or documents associated with each read binary code; and
means (3) for downloading the accessed document or documents.
-
-
28. A search device according to claim 27, wherein the search criterion includes more than one key word, wherein the means for reading (14) reads, from the dictionary (10), binary codes associated with the key words, and further comprising means for filtering (18) the extracted electronic address or addresses, wherein the means for filtering includes:
means for comparing a number of the read binary code or codes associated with the extracted electronic address or addresses with a threshold value; and
means for eliminating an electronic address or addresses associated with a number of the read binary code or codes less than the threshold value.
-
-
29. A search device incorporated in a recording device according to claim 23, the search device adapted to search for a document on a computer communication network using information recorded by the recording device, the search device comprising:
-
means (17) for supplying a search criterion, provided by a user, that includes more than one key word;
means (11) for reading, from the dictionary (10), a respective binary code associated with each key word;
means (3) for extracting, from the information history storage means (2), an electronic address of the accessed document or documents associated with each read binary code;
means (3) for downloading the accessed document or documents;
means for filtering (18) the extracted electronic address or addresses, wherein the means for filtering includes:
means for comparing a number of the read binary code or codes associated with the extracted electronic address or addresses with a threshold value, and means for eliminating an electronic address or addresses associated with a number of the read binary code or codes less than the threshold value; and
means for eliminating (39), in the information history storage means, an accessed document or documents that no longer exist at an associated electronic address or addresses.
-
-
30. A search device according to one of claims 27 or 28, wherein the search device is incorporated in a microprocessor (500), which includes a read only memory (501) storing a program for searching for documents and a random access memory (502) with registers adapted to record variables modified during running of the program.
-
33. A computer comprising a recording device according to one of claims 16 or 17.
-
34. A computer comprising a search device according to one of claims 27 or 28.
-
37. A computer server comprising a recording device according to one of claims 16 or 17.
-
38. A computer server comprising a search device according to one of claims 27 or 28.
-
41. A computer communication network comprising a recording device according to one of claims 16 or 17.
-
42. A computer communication network comprising a search device according to one of claims 27 or 28.
-
48. A computer communication network comprising a plurality of computer servers according to claim 37.
-
49. A computer communication network comprising a plurality of computer servers according to claim 38.
-
-
50. A method of recording information relating to a document accessed by a user (16) of a computer communication network (4), comprising the following steps:
-
condensing (E8) the accessed document to obtain key words associated with the accessed document;
associating (E9) a binary code with each extracted key word to form associations;
updating binary codes associated with the accessed document when the contents of the accessed document have been modified;
storing (E46) the associations in a dictionary (10); and
storing (E12) an electronic address (URL) of the accessed document and updated binary codes in association with each other in information history storage means (2) of the user (16).
Dependent claims: 51
-
-
52. A device for recording information relating to a document accessed by a user (16) of a computer communication network (4), said device comprising:
-
means (7) for extracting key words associated with the accessed document, which is accessed via the computer communication network;
means (11) for associating a binary code with each extracted key word to form associations;
a dictionary (10) for storing the associations;
means for updating binary codes associated with the accessed document when the contents of the accessed document have been modified; and
information history storage means (2) for storing an electronic address (URL) of the accessed document and updated binary codes in association with each other.
-
Specification