Method of searching text to find relevant content

US 8,019,754 B2
Filed: 03/30/2007
Issued: 09/13/2011
Est. Priority Date: 04/03/2006
Status: Expired due to Fees

Abstract

A method of locating relevant documents wherein documents are given a fingerprint comprising weights associated with particular topic categories of a classification system, each weight representing a degree to which the document relates to the particular topic category, a first piece of text is identified and given a fingerprint comprising a list of other weights associated with similar topic categories, the other weights representing a degree to which the first piece of text relates to the particular topic category. All or a portion of the universe of documents is searched by comparing the fingerprint for the first piece of text with the fingerprint for each document. You select those documents whose fingerprints have a predetermined degree of mathematical overlap with the fingerprint of the first piece of text. A user fingerprint of the user's recently accessed texts can be used in place of the first piece of text.
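
As a rough illustration of the comparison step described in the abstract, the sketch below stores each fingerprint as a mapping from a topic category code to a weight and ranks documents by their overlap with a query fingerprint. The text above does not define "mathematical overlap"; the dot product over shared categories, the names `overlap` and `rank_documents`, and the sample Dewey-style codes are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Fingerprint = Dict[str, float]  # topic category code -> weight


def overlap(a: Fingerprint, b: Fingerprint) -> float:
    """Assumed overlap measure: sum of weight products over shared categories."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller fingerprint
    return sum(w * b[c] for c, w in a.items() if c in b)


def rank_documents(query: Fingerprint,
                   docs: Dict[str, Fingerprint]) -> List[Tuple[str, float]]:
    """Score every document against the query and sort by descending overlap."""
    scored = [(doc_id, overlap(query, fp)) for doc_id, fp in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints keyed by Dewey-style codes.
docs = {
    "doc_a": {"510": 0.8, "004": 0.2},
    "doc_b": {"780": 0.9, "510": 0.1},
    "doc_c": {"004": 0.7, "510": 0.3},
}
query = {"510": 0.6, "004": 0.4}
ranking = rank_documents(query, docs)  # doc_a first, then doc_c, then doc_b
```

Because the comparison only uses category codes and weights, nothing in this sketch depends on the language the underlying text was written in.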

73 Claims

1. A method of locating relevant documents within a universe of documents, the documents of said universe having been classified so that each document in the universe has a fingerprint, said fingerprint comprising a list of weights associated with particular topic categories in a classification system, each of the weights representing a degree to which the document relates to the particular topic category that the weight is associated with, the weights obtained automatically from a computer program, the method having a scalable time complexity of O(N^x) where 0<=x<=1.0 for a universe of text on the world wide web and comprising;

  a computer processor creating a fingerprint for a piece of text, the fingerprint comprising a list of weights associated with particular topic categories in the classification system, each of the weights in the fingerprint for said first piece of text representing a degree to which the first piece of text relates to the particular topic category that the weight in the fingerprint for said first piece of text is associated with, the weights in the fingerprint for said first piece of text obtained automatically from a computer program,

  a computer processor searching all or a portion of the universe of documents by comparing the fingerprint for the first piece of text with the fingerprint for each document in the all or a portion of the universe of documents, and

  ranking the all or a portion of the universe of documents based on a degree to which a document has a mathematical overlap with the fingerprint of the first piece of text, the method configured to locate the relevant documents within the universe of documents whether the universe of documents includes text written in one language or in more than one language.
  - 2. The method of claim 1, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 3. The method of claim 2, wherein (i) a fingerprint of a document, (ii) a fingerprint of the first piece of text or (iii) a fingerprint of a document and a fingerprint of the first piece of text only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 4. The method of claim 1, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.
  - 5. The method of claim 1, wherein a level of precision is set by setting a number of digits in a classification code of the classification system.
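
Claims 2 to 5 mention scaling the weights, keeping only a selected number of the highest weights, and setting precision through the number of digits in a classification code. A minimal sketch of those three operations follows. The function names, the choice to sum the weights of categories that merge when codes are shortened, and scaling so that weights sum to 1 are assumptions, not details taken from the claims.

```python
from collections import defaultdict
from typing import Dict

Fingerprint = Dict[str, float]  # topic category code -> weight


def set_precision(fp: Fingerprint, digits: int) -> Fingerprint:
    """Keep the first `digits` characters of each code, summing merged weights."""
    merged: Dict[str, float] = defaultdict(float)
    for code, weight in fp.items():
        merged[code[:digits]] += weight
    return dict(merged)


def keep_top(fp: Fingerprint, n: int) -> Fingerprint:
    """Keep only the n categories with the highest weights."""
    top = sorted(fp.items(), key=lambda item: item[1], reverse=True)[:n]
    return dict(top)


def scale(fp: Fingerprint) -> Fingerprint:
    """Scale weights so they sum to 1 (one possible reading of 'scaled')."""
    total = sum(fp.values())
    if not total:
        return dict(fp)
    return {code: weight / total for code, weight in fp.items()}


# Hypothetical three-digit Dewey-style codes.
fp = {"512": 4.0, "516": 3.0, "781": 2.0, "004": 1.0}
coarse = set_precision(fp, 2)   # "512" and "516" merge into "51"
trimmed = keep_top(fp, 2)       # only the two heaviest categories remain
scaled = scale(trimmed)         # weights now sum to 1
```

Shorter codes give coarser topics and smaller fingerprints, while a larger `n` keeps more of a document's minor topics at the cost of more comparisons.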

6. A method of locating relevant documents within a universe of documents, the documents of said universe having been classified so that each document in the universe has a fingerprint, said fingerprint comprising a list of weights associated with particular topic categories in a classification system, each of the weights representing a degree to which the document relates to the particular topic category that the weight is associated with, the weights obtained automatically from a computer program, the method having a scalable time complexity of O(N^x) where 0<=x<=1.0 for a universe of text on the world wide web and comprising;

  a computer processor creating a fingerprint for a first piece of text, the fingerprint comprising a list of weights associated with particular topic categories in the classification system, each of the weights in the fingerprint for said first piece of text representing a degree to which the first piece of text relates to the particular topic category that the weight in the fingerprint for said first piece of text is associated with, the weights in the fingerprint for said first piece of text obtained automatically from a computer program,

  a computer processor searching all or a portion of the universe of documents by comparing the fingerprint for the first piece of text with the fingerprint for each document in that all or a portion of the universe of documents, and

  selecting those documents whose fingerprints have a predetermined degree of mathematical overlap with the fingerprint of the first piece of text, the method configured to locate the relevant documents within the universe of documents whether the universe of documents includes text written in one language or in more than one language.
  - 7. The method of claim 6, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 8. The method of claim 7, wherein (i) a fingerprint of a document, (ii) a fingerprint of the first piece of text or (iii) a fingerprint of a document and a fingerprint of the first piece of text only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 9. The method of claim 6, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.
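
Claim 6 selects documents whose overlap reaches a predetermined degree, and claim 7 allows the overlaps to be scaled. One way to picture this is a cosine-style overlap, which lies between 0 and 1 for non-negative weights, combined with a fixed threshold. The cosine formula, the names `scaled_overlap` and `select_documents`, and the threshold value are assumptions for illustration; the claims do not prescribe them.

```python
import math
from typing import Dict, List

Fingerprint = Dict[str, float]  # topic category code -> weight


def scaled_overlap(a: Fingerprint, b: Fingerprint) -> float:
    """Assumed scaled overlap: dot product divided by the two vector lengths."""
    dot = sum(w * b.get(c, 0.0) for c, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def select_documents(query: Fingerprint,
                     docs: Dict[str, Fingerprint],
                     threshold: float) -> List[str]:
    """Return ids of documents whose scaled overlap meets the threshold."""
    return [doc_id for doc_id, fp in docs.items()
            if scaled_overlap(query, fp) >= threshold]


# Hypothetical fingerprints; the threshold of 0.5 is arbitrary.
docs = {
    "x": {"510": 3.0, "780": 4.0},  # scaled overlap with the query is 0.6
    "y": {"510": 2.0},              # scaled overlap with the query is 1.0
    "z": {"780": 1.0},              # no shared category, overlap 0.0
}
query = {"510": 1.0}
selected = select_documents(query, docs, threshold=0.5)
```

Scaling makes the threshold independent of how long a document is, since a long document with large raw weights no longer scores higher merely because of its size.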

10. A method of locating relevant documents within a universe of documents, the method having a scalable time complexity of O(N^x) where 0<=x<=1.0 for a universe of text on the world wide web and comprising;

  a computer processor creating a fingerprint for each document in the universe of documents, said fingerprint comprising a list of weights associated with particular topic categories in a classification system, each of the weights representing a degree to which the document relates to the particular topic category that the weight is associated with, the weights obtained automatically from a computer program,

  a computer processor creating a fingerprint for a first piece of text, the fingerprint comprising a list of weights associated with particular topic categories in the classification system, each of the weights in the fingerprint for said first piece of text representing a degree to which the first piece of text relates to the particular topic category that the weight in the fingerprint for said first piece of text is associated with, the weights in the fingerprint for said first piece of text obtained automatically from a computer program,

  a computer processor searching all or a portion of the universe of documents by comparing the fingerprint for the first piece of text with the fingerprint for each document in that all or a portion of the universe of documents, and

  ranking the all or a portion of the universe of documents based on a degree to which a document has a mathematical overlap with the fingerprint of the first piece of text, the method configured to locate the relevant documents within the universe of documents whether the universe of documents includes text written in one language or in more than one language.
  - 11. The method of claim 10, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 12. The method of claim 11, wherein (i) a fingerprint of a document, (ii) a fingerprint of the first piece of text or (iii) a fingerprint of a document and a fingerprint of the first piece of text only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 13. The method of claim 10, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.
  - 72. A method according to claim 2 or 7 or 11 or 15 or 19 or 23 or 27 or 32 or 37 or 41 or 46 or 51 or 56 or 60 or 64 or 68, wherein a numerical magnitude of topic categories is adjusted.
  - 73. A method according to claim 2 or 7 or 11 or 15 or 19 or 23 or 27 or 32 or 37 or 41 or 46 or 51 or 56 or 60 or 64 or 68, wherein some weights of the fingerprint are disregarded because they are not relevant.

14. A method of locating relevant documents within a universe of documents, the method having a scalable time complexity of O(N^x) where 0<=x<=1.0 for a universe of text on the world wide web and comprising;

  creating a fingerprint for each document in the universe of documents, said fingerprint comprising a list of weights associated with particular topic categories in a classification system, each of the weights representing a degree to which the document relates to the particular topic category that the weight is associated with, the weights obtained automatically from a computer program,

  a computer processor creating a fingerprint for a first piece of text, the fingerprint comprising a list of weights associated with particular topic categories in the classification system, each of the weights in the fingerprint for said first piece of text representing a degree to which the first piece of text relates to the particular topic category that the weight in the fingerprint for said first piece of text is associated with, the weights in the fingerprint for said first piece of text obtained automatically from a computer program,

  a computer processor searching all or a portion of the universe of documents by comparing the fingerprint for the first piece of text with the fingerprint for each document in that all or a portion of the universe of documents, and

  selecting those documents whose fingerprints have a predetermined degree of mathematical overlap with the fingerprint of the first piece of text, the method configured to locate the relevant documents within the universe of documents whether the universe of documents includes text written in one language or in more than one language.
  - 15. The method of claim 14, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 16. The method of claim 15, wherein (i) a fingerprint of a document, (ii) a fingerprint of the first piece of text or (iii) a fingerprint of a document and a fingerprint of the first piece of text only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 17. The method of claim 14, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.

18. A method of re-ranking a list of documents obtained from a search wherein a ranking of a document in the list is determined by a relevance of the document to a search text, the method having a scalable time complexity of O(N^x) where 0<=x<=1.0 for a universe of text on the world wide web and comprising;

  a computer processor classifying the list of documents so that each document in the list has a fingerprint, said fingerprint comprising a list of weights associated with particular topic categories in a classification system, each of the weights representing a degree to which the document relates to the particular topic category that the weight is associated with, the weights obtained automatically from a computer program,

  a computer processor creating a fingerprint for the first piece of text, the fingerprint comprising a list of weights associated with particular topic categories in the classification system, each of the weights in the fingerprint for said first piece of text representing a degree to which the first piece of text relates to the particular topic category that the weight in the fingerprint for said first piece of text is associated with, the weights in the fingerprint for said first piece of text obtained automatically from a computer program,

  a computer processor searching the list of documents by comparing the fingerprint for the first piece of text with the fingerprint for each document in the list of documents,

  re-ranking the list of documents based on a degree to which a document in the list has a mathematical overlap with the fingerprint of the first piece of text, the method configured to re-rank the list of documents based on relevance to the search text whether the list of documents includes text written in one language or in more than one language.
  - 19. The method of claim 18, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 20. The method of claim 19, wherein (i) a fingerprint of a document, (ii) a fingerprint of the first piece of text or (iii) a fingerprint of a document and a fingerprint of the first piece of text only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 21. The method of claim 18, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.

22. A method of re-ranking a list of documents obtained from a search wherein a ranking of a document in the list is determined by a relevance of the document to a search text, and wherein each document in the list has a fingerprint, said fingerprint comprising a list of weights associated with particular topic categories in a classification system, each of the weights representing a degree to which the document relates to the particular topic category that the weight is associated with, the weights obtained automatically from a computer program, the method having a scalable time complexity of O(N^x) where 0<=x<=1.0 for a universe of text on the world wide web and comprising;

  a computer processor creating a fingerprint for a first piece of text, the fingerprint comprising a list of weights associated with particular topic categories in the classification system, each of the weights in the fingerprint for said first piece of text representing a degree to which the first piece of text relates to the particular topic category that the weight in the fingerprint for the first piece of text is associated with, the weights in the fingerprint for said first piece of text obtained automatically from a computer program,

  a computer processor searching the list of documents by comparing the fingerprint for the first piece of text with the fingerprint for each document in the list of documents,

  re-ranking the list of documents based on a degree to which a fingerprint of a document in the list has a mathematical overlap with the fingerprint of the first piece of text, the method configured to re-rank the list of documents based on relevance to the search text whether the list of documents includes text written in one language or in more than one language.
  - 23. The method of claim 22, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 24. The method of claim 23, wherein (i) a fingerprint of a document, (ii) a fingerprint of the first piece of text or (iii) a fingerprint of a document and a fingerprint of the first piece of text only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 25. The method of claim 22, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.

26. A method of re-ranking a list of documents obtained from a search wherein a ranking of a document in the list of documents is determined by a relevance of the document to a search text, wherein the list of documents has been classified and appears in an inverted list, said inverted list comprising for each topic category of a classification system a weight associated with a particular document of the list of documents, the weight representing a degree to which the particular document relates to said each topic category, the weights obtained automatically from a computer program, the method having a scalable time complexity of O(N^x) where 0<=x<=1.0 for a universe of text on the world wide web and comprising;

  creating a fingerprint for a first piece of text, the fingerprint comprising a list of weights associated with particular topic categories in the classification system, each of the weights in the fingerprint for said first piece of text representing a degree to which the first piece of text relates to the particular topic category that the weight in the fingerprint for said first piece of text is associated with, the weights in the fingerprint for said first piece of text obtained automatically from a computer program,

  a computer processor searching the list of documents by comparing the fingerprint for the first piece of text with the fingerprint for each document in the list of documents, and

  a computer processor re-ranking the list of documents based on a degree to which a fingerprint of a document in the list has a mathematical overlap with the fingerprint of the first piece of text, the method configured to re-rank the list of documents based on relevance to the search text whether the list of documents includes text written in one language or in more than one language.
  - 27. The method of claim 26, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 28. The method of claim 27, wherein an upper bound to the mathematical overlap is calculated dynamically, said upper bound used to reduce a magnitude of documents for which the mathematical overlap is calculated.
  - 29. The method of claim 27, wherein (i) a fingerprint of a document, (ii) a fingerprint of the first piece of text or (iii) a fingerprint of a document and a fingerprint of the first piece of text only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 30. The method of claim 26, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.
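
Claims 26 to 28 combine an inverted list (for each topic category, the documents and their weights) with a dynamically calculated upper bound that reduces how many documents need an overlap computed. The sketch below shows one way such a bound could work. It is an assumption modelled on common top-k pruning over inverted indexes, not the procedure disclosed in the patent; it assumes non-negative weights and a dot-product overlap, and `top_k_overlap` is a hypothetical name.

```python
import heapq
from typing import Dict, List, Tuple

Fingerprint = Dict[str, float]
InvertedList = Dict[str, List[Tuple[str, float]]]  # category -> [(doc_id, weight)]


def top_k_overlap(query: Fingerprint, index: InvertedList,
                  k: int) -> List[Tuple[str, float]]:
    """Top-k documents by dot-product overlap, assuming non-negative weights."""
    # Largest contribution each query category could make to any document.
    max_contrib = {
        c: qw * max((w for _, w in index.get(c, [])), default=0.0)
        for c, qw in query.items()
    }
    # Process the most influential categories first.
    order = sorted(query, key=lambda c: max_contrib[c], reverse=True)
    scores: Dict[str, float] = {}
    for i, c in enumerate(order):
        # Upper bound on the overlap of any document not seen so far.
        bound = sum(max_contrib[r] for r in order[i:])
        # k-th best partial overlap so far; partial overlaps can only grow.
        kth = heapq.nlargest(k, scores.values())[-1] if len(scores) >= k else 0.0
        admit_new = bound >= kth
        for doc_id, w in index.get(c, []):
            # Unseen documents are skipped once they cannot reach the top k.
            if admit_new or doc_id in scores:
                scores[doc_id] = scores.get(doc_id, 0.0) + query[c] * w
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Hypothetical inverted list and query fingerprint.
index = {
    "510": [("d1", 0.9), ("d2", 0.8), ("d3", 0.1)],
    "004": [("d2", 0.2), ("d4", 0.3)],
}
query = {"510": 0.9, "004": 0.1}
result = top_k_overlap(query, index, k=2)  # d4 is never scored
```

In this example the bound for unseen documents drops to 0.03 after the first category, which is below the second-best partial overlap of 0.72, so `d4` is skipped without changing the top two results.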

31. A method of re-ranking a list of documents obtained from a search wherein a ranking of a document in the list is determined by a relevance of the document to a search text, the method performed by a computer processor, having a scalable time complexity of O(N^x) where 0<=x<=1.0, and comprising;

  classifying the list of documents so that each document in the list has a fingerprint, said fingerprint comprising a list of weights associated with particular topic categories in a classification system, each of the weights representing a degree to which the document relates to the particular topic category that the weight is associated with, the weights obtained automatically from a computer program,

  providing a user fingerprint, the user fingerprint comprising a list of cumulative weights associated with particular topic categories in the classification system, each of the cumulative weights representing a degree to which text or texts in a link recently accessed by a user relates to the particular topic category that the cumulative weight is associated with, the cumulative weights obtained from weights that in turn were obtained automatically from a computer program,

  searching the list of documents by comparing the user fingerprint with the fingerprint for each document in the list of documents, and

  re-ranking the list of documents based on a degree to which a fingerprint of the document in the list has a mathematical overlap with the user fingerprint, the method configured to re-rank the list of documents based on relevance to the search text whether the list of documents includes text written in one language or in more than one language.
  - 32. The method of claim 31, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 33. The method of claim 32, wherein a fingerprint of a document only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 34. The method of claim 31, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.
  - 35. The method of claim 31, wherein a level of precision is set by setting a number of digits in a classification code of the classification system.
  - 71. A method according to claim 31 or 36 or 40 or 55 or 59 or 63 or 67, wherein a second user fingerprint is maintained, the second user fingerprint constructed identically to the user fingerprint except that for the second user fingerprint the text or texts recently accessed by the user is more recently accessed by the user than the text or texts recently accessed by the user for the user fingerprint, calculating a mathematical overlap between the user fingerprint and the second user fingerprint, the second user fingerprint superseding the user fingerprint when said mathematical overlap exceeds a set value.
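
Claims 31 and 71 describe a user fingerprint built from cumulative weights over recently accessed texts, and a second, more recent user fingerprint that supersedes the first when their overlap exceeds a set value. A minimal sketch under stated assumptions follows: cumulative weights are taken to be plain sums, the overlap is a cosine-style measure, and the names `cumulative_fingerprint` and `maybe_supersede` are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable

Fingerprint = Dict[str, float]  # topic category code -> weight


def cumulative_fingerprint(text_fps: Iterable[Fingerprint]) -> Fingerprint:
    """Assumed accumulation rule: add up the weights of each accessed text."""
    total: Dict[str, float] = defaultdict(float)
    for fp in text_fps:
        for code, weight in fp.items():
            total[code] += weight
    return dict(total)


def scaled_overlap(a: Fingerprint, b: Fingerprint) -> float:
    """Assumed overlap: dot product divided by the two vector lengths."""
    dot = sum(w * b.get(c, 0.0) for c, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def maybe_supersede(current: Fingerprint, recent: Fingerprint,
                    set_value: float) -> Fingerprint:
    """Follow claim 71 as worded: the recent fingerprint replaces the current
    one when the overlap between the two exceeds the set value."""
    return recent if scaled_overlap(current, recent) > set_value else current


# Hypothetical fingerprints of texts the user accessed earlier and more recently.
user_fp = cumulative_fingerprint([{"510": 1.0}, {"510": 0.5, "004": 0.5}])
recent_fp = cumulative_fingerprint([{"510": 1.0, "004": 1.0}])
active = maybe_supersede(user_fp, recent_fp, set_value=0.8)  # overlap is about 0.894
```

The resulting `active` fingerprint can then be compared against document fingerprints in place of a fingerprint derived from a first piece of text.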

36. A method of re-ranking a list of documents obtained from a search wherein a ranking of a document in the list is determined by a relevance of the document to a search text, and wherein each document in the list has a fingerprint, said fingerprint comprising a list of weights associated with particular topic categories in a classification system, each of the weights representing a degree to which the document relates to the particular topic category that the weight is associated with, the weights obtained automatically from a computer program, the method performed by a computer processor, having a scalable time complexity of O(N^x) where 0<=x<=1.0, and comprising;

  providing a user fingerprint, the user fingerprint comprising a list of cumulative weights associated with particular topic categories in the classification system, each of the cumulative weights representing a degree to which text or texts in a link recently accessed by a user relate to the particular topic category that the cumulative weight is associated with, the cumulative weights obtained from weights that in turn were obtained automatically from a computer program,

  searching the list of documents by comparing the user fingerprint with the fingerprint for each document in the list of documents, and

  re-ranking the list of documents based on a degree to which a fingerprint of a document in the list has a mathematical overlap with the user fingerprint, the method configured to re-rank the list of documents based on relevance to the search text whether the list of documents includes text written in one language or in more than one language.
  - 37. The method of claim 36, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 38. The method of claim 37, wherein a fingerprint of a document only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 39. The method of claim 36, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.

40. A method of re-ranking a list of documents obtained from a search wherein a ranking of a document in the list is determined by a relevance of the document to a search text, wherein the list of documents has been classified and appears in an inverted list, said inverted list comprising for each topic category of a classification system a weight associated with a particular document of the list of documents, the weight representing a degree to which the particular document relates to said each topic category, the weights obtained automatically from a computer program, the method performed by a computer processor, having a scalable time complexity of O(N^x) where 0<=x<=1.0, and comprising;

  providing a user fingerprint, the user fingerprint comprising a list of cumulative weights associated with particular topic categories in the classification system, each of the cumulative weights representing a degree to which text or texts in a link recently accessed by a user relates to the particular topic category that the cumulative weight is associated with, the cumulative weights obtained from weights that in turn were obtained automatically from a computer program,

  searching the inverted list by comparing the user fingerprint with the fingerprint for each document in the inverted list, and

  re-ranking the list of documents by making use of a degree to which a fingerprint of a document in the inverted list has a mathematical overlap with the user fingerprint, the method configured to re-rank the list of documents based on relevance to the search text whether the list of documents includes text written in one language or in more than one language.
  - 41. The method of claim 40, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 42. The method of claim 41, wherein an upper bound to the mathematical overlap is calculated dynamically, said upper bound used to reduce a magnitude of documents for which the mathematical overlap is calculated.
  - 43. The method of claim 41, wherein a fingerprint of a document only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 44. The method of claim 40, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.

45. A method of locating relevant documents within a universe of documents, the documents of said universe having been classified and appears in an inverted list, said inverted list comprising for each category of a classification system a weight associated with a particular document of the list of documents, each of the weights representing a degree to which the particular document relates to said each topic category, the weights obtained automatically from a computer program, the method having a scalable time complexity of O(N^x) where 0<=x<=1.0 for a universe of text on the world wide web and comprising;

  creating a fingerprint for a first piece of text, the fingerprint comprising a list of weights associated with particular topic categories in the classification system, each of the weights in the fingerprint for said first piece of text representing a degree to which the first piece of text relates to the particular topic category that the weight in the fingerprint for said piece of text is associated with, the weights in the fingerprint for said first piece of text obtained automatically from a computer program,

  a computer processor searching all or a portion of the universe of documents by comparing the fingerprint for the first piece of text with the fingerprint for each document in that all or a portion of the universe of documents, and

  ranking the all or a portion of the universe of documents by making use of a degree to which a fingerprint of a document in the inverted list has a mathematical overlap with the fingerprint of the first piece of text, the method configured to locate the relevant documents within the universe of documents whether the universe of documents includes text written in one language or in more than one language.
  - 46. The method of claim 45, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 47. The method of claim 46, wherein an upper bound to the mathematical overlap is calculated dynamically, said upper bound used to reduce a magnitude of documents for which the mathematical overlap is calculated.
  - 48. The method of claim 46, wherein (i) a fingerprint of a document, (ii) a fingerprint of the first piece of text or (iii) a fingerprint of a document and a fingerprint of the first piece of text only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 49. The method of claim 45, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.

50. A method of locating relevant documents within a universe of documents, the documents of said universe having been classified and appears in an inverted list, said inverted list comprising for each topic category of a classification system a weight associated with a particular document of the list of documents, each of the weights representing a degree to which the particular document relates to said each topic category, the weights obtained automatically from a computer program, the method having a scalable time complexity of O(N^x) where 0<=x<=1.0 for a universe of text on the world wide web and comprising;

  a computer processor creating a fingerprint for a first piece of text, the fingerprint comprising a list of weights associated with particular topic categories in the classification system, each of the weights in the fingerprint for said first piece of text representing a degree to which the first piece of text relates to the particular topic category that the weight in the fingerprint for said first piece of text is associated with, the weights in the fingerprint for said first piece of text obtained automatically from a computer program,

  a computer processor searching all or a portion of the universe of documents by comparing the fingerprint for the first piece of text with the fingerprint for each document in that all or a portion of the universe of documents, and

  selecting those documents whose fingerprints have a predetermined degree of mathematical overlap with the fingerprint of the first piece of text, the method configured to locate the relevant documents within the universe of documents whether the universe of documents includes text written in one language or in more than one language.
- Dependent Claims (51, 52, 53, 54)
  - 51. The method of claim 50, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 52. The method of claim 51, wherein an upper bound to the mathematical overlap is calculated dynamically, said upper bound used to reduce a magnitude of documents for which the mathematical overlap is calculated.
  - 53. The method of claim 51, wherein (i) a fingerprint of a document, (ii) a fingerprint of the first piece of text or (iii) a fingerprint of a document and a fingerprint of the first piece of text only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 54. The method of claim 50, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.
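
The inverted-list search recited in claim 50 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the claims do not define "mathematical overlap", so the sketch assumes it is the sum of products of weights over shared topic categories, and the names `build_inverted_list` and `select_by_overlap`, the document ids, and the threshold value are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Fingerprint = Dict[str, float]  # assumed layout: category code -> weight
InvertedList = Dict[str, List[Tuple[str, float]]]


def build_inverted_list(docs: Dict[str, Fingerprint]) -> InvertedList:
    """Map each topic category to the (document id, weight) pairs that carry it."""
    inverted: InvertedList = defaultdict(list)
    for doc_id, fp in docs.items():
        for category, weight in fp.items():
            inverted[category].append((doc_id, weight))
    return dict(inverted)


def select_by_overlap(query: Fingerprint, inverted: InvertedList,
                      threshold: float) -> Dict[str, float]:
    """Return documents whose overlap with the query meets the threshold.

    Assumption: overlap = sum of weight products over shared categories.
    Only postings for categories present in the query are visited.
    """
    overlap: Dict[str, float] = defaultdict(float)
    for category, q_weight in query.items():
        for doc_id, d_weight in inverted.get(category, []):
            overlap[doc_id] += q_weight * d_weight
    return {doc_id: s for doc_id, s in overlap.items() if s >= threshold}


docs = {
    "a": {"004": 0.8, "510": 0.2},
    "b": {"780": 1.0},
    "c": {"004": 0.5, "025": 0.5},
}
query = {"004": 0.5, "025": 0.5}
inverted = build_inverted_list(docs)
print(select_by_overlap(query, inverted, threshold=0.45))  # {'c': 0.5}
```

Document "b" shares no category with the query, so its entry is never read; this is the property that lets the work grow more slowly than the size of the universe when fingerprints are sparse.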

55. A method of locating relevant documents within a universe of documents, the documents of said universe having been classified so that each document in the universe has a fingerprint, said fingerprint comprising a list of weights associated with particular topic categories in a classification system, each of the weights representing a degree to which the document relates to the particular topic category that the weight is associated with, the weights obtained automatically from a computer program, the method having a scalable time complexity of O(N^x) where 0<=X<=1.0 for a universe of text on the world wide web and performed by a computer processor and comprising;

providing a user fingerprint, the user fingerprint comprising a list of cumulative weights associated with particular topic categories in the classification system, each of the cumulative weights representing a degree to which text of texts in a link recently accessed by a user related to the particular topic category that the cumulative weight is associated with, the cumulative weights obtained from weights that in turn were obtained automatically from a computer program,

searching all or a portion of the universe of documents by comparing the user fingerprint with the fingerprint for each document in that all or a portion of the universe of documents, and

ranking the all or a portion of the universe of documents based on a degree to which a document has a mathematical overlap with the user fingerprint, the method configured to locate the relevant documents within the universe of documents whether the universe of documents includes text written in one language or in more than one language.
- Dependent Claims (56, 57, 58)
  - 56. The method of claim 55, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 57. The method of claim 56, wherein a fingerprint of a document only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 58. The method of claim 55, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.
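
The user fingerprint and ranking steps of claim 55 can be sketched as below. The sketch assumes that "cumulative weights" are obtained by summing, per topic category, the weights of the texts a user recently accessed, and that overlap is a sum of weight products over shared categories; neither choice is stated in the claims. All function names and sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Fingerprint = Dict[str, float]  # assumed layout: category code -> weight


def build_user_fingerprint(recent_texts: List[Fingerprint]) -> Fingerprint:
    """Sum per-category weights over the fingerprints of recently accessed texts."""
    cumulative: Dict[str, float] = defaultdict(float)
    for fp in recent_texts:
        for category, weight in fp.items():
            cumulative[category] += weight
    return dict(cumulative)


def overlap(a: Fingerprint, b: Fingerprint) -> float:
    """Assumed overlap measure: sum of weight products over shared categories."""
    return sum((w * b[c] for c, w in a.items() if c in b), 0.0)


def rank_documents(user_fp: Fingerprint,
                   docs: Dict[str, Fingerprint]) -> List[Tuple[str, float]]:
    """Order documents by descending overlap with the user fingerprint."""
    scored = [(doc_id, overlap(user_fp, fp)) for doc_id, fp in docs.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


recent = [{"004": 0.5, "510": 0.5}, {"004": 1.0}]
user_fp = build_user_fingerprint(recent)  # {'004': 1.5, '510': 0.5}
docs = {"a": {"004": 1.0}, "b": {"510": 1.0}, "c": {"780": 1.0}}
print(rank_documents(user_fp, docs))  # [('a', 1.5), ('b', 0.5), ('c', 0.0)]
```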

59. A method of locating relevant documents within a universe of documents, the documents of said universe having been classified so that each document in the universe has a fingerprint, said fingerprint comprising a list of weights associated with particular topic categories in a classification system, each of the weights representing a degree to which the document relates to the particular topic category that the weight is associated with, the weights obtained automatically from a computer program, the method having a scalable time complexity of O(N^x) where 0<=X<=1.0 for a universe of text on the world wide web and performed by a computer processor comprising;

providing a user fingerprint, the user fingerprint comprising a list of cumulative weights associated with particular topic categories in the classification system, each of the cumulative weights representing a degree to which text or texts in a link recently accessed by a user relates to the particular topic category that the cumulative weight is associated with, the cumulative weights obtained from weights that in turn were obtained automatically from a computer program,

searching all or a portion of the universe of documents by comparing the user fingerprint with the fingerprint for each document in that all or a portion of the universe of documents, and

selecting those documents whose fingerprints have a predetermined degree of mathematical overlap with the user fingerprint, the method configured to locate the relevant documents within the universe of documents whether the universe of documents includes text written in one language or in more than one language.
- Dependent Claims (60, 61, 62)
  - 60. The method of claim 59, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 61. The method of claim 60, wherein a fingerprint of a document only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 62. The method of claim 59, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.
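
One possible reading of the scaling in claim 60, combined with the threshold selection of claim 59, is sketched below. The claims do not say how weights are scaled; this sketch assumes each fingerprint is scaled to unit Euclidean length, so that the overlap of two non-negative fingerprints falls between 0 and 1 and a single predetermined threshold can be applied to every document. The function names, sample data, and threshold are hypothetical.

```python
import math
from typing import Dict, List

Fingerprint = Dict[str, float]  # assumed layout: category code -> weight


def scale_to_unit_length(fp: Fingerprint) -> Fingerprint:
    """Divide every weight by the fingerprint's Euclidean norm (assumed scaling)."""
    norm = math.sqrt(sum(w * w for w in fp.values()))
    if norm == 0.0:
        return dict(fp)  # an empty or all-zero fingerprint is left unchanged
    return {c: w / norm for c, w in fp.items()}


def scaled_overlap(a: Fingerprint, b: Fingerprint) -> float:
    """Overlap of the two scaled fingerprints over their shared categories."""
    sa, sb = scale_to_unit_length(a), scale_to_unit_length(b)
    return sum((w * sb[c] for c, w in sa.items() if c in sb), 0.0)


def select_documents(user_fp: Fingerprint, docs: Dict[str, Fingerprint],
                     threshold: float) -> List[str]:
    """Return ids of documents whose scaled overlap meets the threshold."""
    return sorted(d for d, fp in docs.items()
                  if scaled_overlap(user_fp, fp) >= threshold)


user_fp = {"004": 3.0, "510": 4.0}
docs = {
    "x": {"004": 6.0, "510": 8.0},  # same proportions as the user fingerprint
    "y": {"780": 2.0},              # no shared category
    "z": {"004": 1.0},              # partial match
}
print(select_documents(user_fp, docs, threshold=0.9))
```

With this scaling, a document whose weights are in the same proportions as the user fingerprint scores close to 1 regardless of how large its raw weights are, which is what makes a fixed threshold meaningful across documents.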

63. A method performed by a computer processor of locating relevant documents within a universe of documents, comprising:

creating a fingerprint for each document in the universe of documents, said fingerprint comprising a list of weights associated with particular topic categories in a classification system, each of the weights representing a degree to which the document relates to the particular topic category that the weight is associated with, the weights obtained automatically from a computer program, the method having a scalable time complexity of O(N^x) where 0<=X<=1.0 for a universe of text on the world wide web and performed by a computer processor comprising;

providing a user fingerprint, the user fingerprint comprising a list of cumulative weights associated with particular topic categories in the classification system, each of the cumulative weights representing a degree to which text or texts in a link recently accessed by a user relates to the particular topic category that the cumulative weight is associated with, the cumulative weights obtained from weights that in turn were obtained automatically from a computer program,

searching all or a portion of the universe of documents by comparing the user fingerprint with the fingerprint for each document in that all or a portion of the universe of documents, and

ranking the all or a portion of the universe of documents based on a degree to which a document has a mathematical overlap with the user fingerprint, the method configured to locate the relevant documents within the universe of documents whether the universe of documents includes text written in one language or in more than one language.
- Dependent Claims (64, 65, 66)
  - 64. The method of claim 63, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 65. The method of claim 64, wherein a fingerprint of a document only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 66. The method of claim 63, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.

67. A method of locating relevant documents within a universe of documents, the method performed by a computer processor and having a scalable time complexity of O(N^x) where 0<=X<=1.0 for a universe of text on the world wide web, the method comprising;

creating a fingerprint for each document in the universe of documents, said fingerprint comprising a list of weights associated with particular topic categories in a classification system, each of the weights representing a degree to which the document relates to the particular topic category that the weight is associated with, the weights obtained automatically from a computer program,

providing a user fingerprint, the user fingerprint comprising a list of cumulative weights associated with particular topic categories in the classification system, each of the cumulative weights representing a degree to which text or texts in a link recently accessed by a user relates to the particular topic category that the cumulative weight is associated with, the cumulative weights obtained from weights that in turn were obtained automatically from a computer program,

searching all or a portion of the universe of documents by comparing the user fingerprint with the fingerprint for each document in that all or a portion of the universe of documents, and

selecting those documents whose fingerprints have a predetermined degree of mathematical overlap with the user fingerprint, the method configured to locate the relevant documents within the universe of documents whether the universe of documents includes text written in one language or in more than one language.
- Dependent Claims (68, 69, 70)
  - 68. The method of claim 67, wherein either the weights are scaled or else the mathematical overlaps are scaled.
  - 69. The method of claim 68, wherein a fingerprint of a document only includes those topic categories whose associated weights are among a selected number of highest associated weights for that document.
  - 70. The method of claim 67, wherein a selected number of weights is between 1 and 75 and the classification system is the Dewey Decimal System.


Current Assignee: Needlebot, Inc.
Original Assignee: Needlebot, Inc.
Inventors: Verlin, Jerome; Akyuz, Can Deniz; Donnelly, Stuart; Collins, John Barrett
Primary Examiner(s): STACE, BRENT S
Application Number: US11/731,751
Publication Number: US 20070239707A1
Time in Patent Office: 1,628 Days
Field of Search: 707/1-3, 707/5-7, 707/100-102, 707/728, 707/731, 707/723, 707/722, 707/705
US Class Current: 707/728
CPC Class Codes: G06F 16/35 Clustering; Classification
