User-context-based search engine
First Claim
1. A method for creating a macro-context for a plurality of terms, the method comprising:
identifying a plurality of domains, each domain thereof corresponding to a subject matter unique thereto;
creating a plurality of domain lists, each domain list pertaining exclusively to a domain of the plurality of domains and comprising domain terms corresponding substantially exclusively to the subject matter of the domain to which the domain list pertains;
identifying a corpus of information divided into a plurality of terms and a plurality of topical entries, each term of the plurality of terms corresponding to a topical entry of the plurality of topical entries;
counting, by a computer apparatus within each topical entry of the plurality of topical entries, occurrences of domain terms from each domain list of the plurality of domain lists; and
calculating, by the computer apparatus, a macro-context for each term of the plurality of terms, the macro-context comprising a vector coupling each domain of the plurality of domains with a weight derived from the counting of occurrences of the domain terms corresponding thereto.
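The counting and calculating steps of claim 1 can be illustrated with a minimal Python sketch. The domain lists, the corpus, and the names DOMAIN_LISTS, CORPUS, tokenize, and macro_context are hypothetical and chosen only for illustration. The claim says only that each weight is "derived from" the counts; dividing by the total count is one assumed derivation, not the patent's stated formula.

```python
import re
from collections import Counter

# Hypothetical domain lists: each term is assumed to belong to one domain only.
DOMAIN_LISTS = {
    "medicine": {"surgery", "vaccine", "patient"},
    "astronomy": {"orbit", "planet", "telescope"},
}

# Hypothetical corpus: each term (headword) corresponds to one topical entry.
CORPUS = {
    "Mars": "Mars is a planet whose orbit lies beyond Earth. "
            "A telescope shows its polar caps.",
    "Spaceflight": "A patient in orbit may need surgery far from any planet.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def macro_context(entry_text, domain_lists):
    """Count domain-term occurrences in one entry and return a
    domain -> weight vector. Normalizing by the total count is an
    assumption; the claim only requires weights derived from the counts."""
    counts = Counter(tokenize(entry_text))
    raw = {
        domain: sum(counts[term] for term in terms)
        for domain, terms in domain_lists.items()
    }
    total = sum(raw.values())
    return {d: (c / total if total else 0.0) for d, c in raw.items()}


# One macro-context per term of the corpus.
MACRO = {term: macro_context(text, DOMAIN_LISTS) for term, text in CORPUS.items()}

for term, vector in MACRO.items():
    print(term, vector)
```

In this toy corpus the entry for "Mars" contains only astronomy terms, while the entry for "Spaceflight" splits evenly between the two domains.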
Abstract
A method and apparatus for determining contexts of information analyzed. Contexts may be determined for words, expressions, and other combinations of words in bodies of knowledge such as encyclopedias. Analysis of use provides a division of the universe of communication or information into domains, and selects words or expressions unique to those domains of subject matter as an aid in classifying information. A vocabulary list is created with a macro-context (context vector) for each, dependent upon the number of occurrences of unique terms from a domain, over each of the domains. This system may be used to find information or classify information by subsequent inputs of text, in calculation of macro-contexts, with ultimate determination of lists of micro-contexts including terms closely aligned with the subject matter.
85 Citations
14 Claims
2. A method for classifying text, the method comprising:
identifying a plurality of terms, each comprising at least one of a word, a name, and a phrase;
the identifying, wherein each term of the plurality of terms is coupled to a macro-context characterizing the context of the term;
the identifying, wherein the macro-context comprises a vector mapping a plurality of subject matters, each unique, to a corresponding plurality of weights, each weight reflecting a contribution of a corresponding subject matter of the plurality of subject matters to the term;
selecting an input text to be classified;
locating, by a computer apparatus, a set of contained terms, each reflecting occurrence of one of the terms of the plurality of terms within the input text;
calculating, by the computer apparatus, a composite macro-context characterizing the context of the input text;
the calculating, wherein the composite macro-context comprises a vector mapping the plurality of subject matters to corresponding weights reflecting contributions of corresponding subject matters of the plurality of subject matters to the input text;
the calculating, comprising adding together the macro-contexts of the contained terms of the set to define the composite macro-context; and
classifying, by the computer apparatus, the input text by linking the composite macro-context thereto.
Dependent claims: 3, 4, 5, 6, 7.
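The adding step of claim 2 can be sketched in Python as follows. The vocabulary TERM_MACRO, its weights, and the helper names contained_terms, composite_macro_context, and classify are hypothetical. Matching terms by whole-word search and counting every occurrence are assumptions about how "contained terms" are located.

```python
import re

# Hypothetical vocabulary: term -> macro-context (weights are made up).
TERM_MACRO = {
    "planet": {"astronomy": 1.0, "medicine": 0.0},
    "orbit": {"astronomy": 0.75, "medicine": 0.25},
    "vaccine": {"astronomy": 0.0, "medicine": 1.0},
}


def contained_terms(text, vocabulary):
    """Return one list item per occurrence of a vocabulary term in the text.
    Whole-word, case-insensitive matching is an assumption."""
    lowered = text.lower()
    found = []
    for term in vocabulary:
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        found.extend([term] * len(re.findall(pattern, lowered)))
    return found


def composite_macro_context(text, term_macro):
    """Add together the macro-contexts of all contained terms."""
    domains = sorted({d for vec in term_macro.values() for d in vec})
    composite = dict.fromkeys(domains, 0.0)
    for term in contained_terms(text, term_macro):
        for domain, weight in term_macro[term].items():
            composite[domain] += weight
    return composite


def classify(text, term_macro):
    """Link the composite macro-context to the input text."""
    return {"text": text, "macro_context": composite_macro_context(text, term_macro)}


result = classify(
    "The planet follows an orbit; a second planet crosses that orbit.",
    TERM_MACRO,
)
print(result["macro_context"])
```

Here "planet" and "orbit" each occur twice, so the composite vector is the sum of four term vectors and is dominated by the astronomy weight.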
8. A method for searching, the method comprising:
identifying a repository of information comprising prose subdivided into a plurality of sections;
determining, by a computer apparatus, a macro-context for each section of the plurality of sections, the macro-context characterizing the context of the section corresponding thereto;
the determining, wherein each macro-context comprises a vector mapping a plurality of subject matters, each unique, to a corresponding plurality of weights, each weight reflecting a contribution of a corresponding subject matter of the plurality of subject matters to the section corresponding to the macro-context;
selecting, by the computer apparatus, a micro-context for each section of the plurality of sections, the micro-context characterizing the context of the section corresponding thereto;
the selecting, comprising locating a set of terms contained within the section corresponding to the micro-context, each term of the set having a macro-context comprising a vector mapping the plurality of subject matters to corresponding weights reflecting contributions of corresponding subject matters, of the plurality of subject matters, to the term;
the selecting, wherein the micro-context comprises selected terms from the set, the selected terms each having a macro-context within a selected mathematical proximity to the macro-context of the section corresponding thereto;
generating a database by indexing each section of the plurality of sections according to the macro-context and micro-context corresponding thereto;
receiving into the computer apparatus a query from a user;
determining a macro-context and a micro-context for the query;
determining a threshold criterion for a search corresponding to the query;
locating, by the computer apparatus, one or more sections in the database by searching in the database;
the locating, wherein each section of the one or more sections has a macro-context and a micro-context meeting the threshold criterion; and
presenting, by the computer apparatus, the one or more sections to a user.
Dependent claims: 9, 10, 11, 12, 13, 14.
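The indexing and searching flow of claim 8 can be sketched in Python under several stated assumptions. The claim does not name a proximity measure or a threshold criterion. This sketch assumes cosine similarity for "mathematical proximity" and assumes a threshold of minimum macro-context similarity plus at least one shared micro-context term. The vocabulary, weights, sections, and all function names are hypothetical, and only single-word terms are matched.

```python
import math
import re

# Hypothetical vocabulary: term -> macro-context (weights are made up).
TERM_MACRO = {
    "planet": {"astronomy": 1.0, "medicine": 0.0},
    "orbit": {"astronomy": 0.75, "medicine": 0.25},
    "vaccine": {"astronomy": 0.0, "medicine": 1.0},
    "patient": {"astronomy": 0.0, "medicine": 1.0},
}
DOMAINS = ("astronomy", "medicine")


def terms_in(text):
    """Vocabulary terms that appear in the text (single words only)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(t for t in TERM_MACRO if t in words)


def cosine(a, b):
    """Cosine similarity between two domain -> weight vectors."""
    dot = sum(a[d] * b[d] for d in DOMAINS)
    norm = math.sqrt(sum(a[d] ** 2 for d in DOMAINS)) * math.sqrt(
        sum(b[d] ** 2 for d in DOMAINS)
    )
    return dot / norm if norm else 0.0


def contexts(text, proximity=0.9):
    """Return (macro-context, micro-context) for a section or a query.
    The micro-context keeps terms whose own macro-context is within
    the assumed cosine proximity of the text's macro-context."""
    terms = terms_in(text)
    macro = {d: sum(TERM_MACRO[t][d] for t in terms) for d in DOMAINS}
    micro = {t for t in terms if cosine(TERM_MACRO[t], macro) >= proximity}
    return macro, micro


def build_index(sections):
    """Index each section by its macro-context and micro-context."""
    return [(section, *contexts(section)) for section in sections]


def search(index, query, min_similarity=0.8, min_shared_terms=1):
    """Return sections whose macro- and micro-context meet the assumed threshold."""
    q_macro, q_micro = contexts(query)
    return [
        section
        for section, macro, micro in index
        if cosine(q_macro, macro) >= min_similarity
        and len(q_micro & micro) >= min_shared_terms
    ]


S1 = "A planet keeps a stable orbit around its star."
S2 = "The patient received a vaccine."
S3 = "A fracture of the orbit may require the patient to see a surgeon."
INDEX = build_index([S1, S2, S3])

print(search(INDEX, "orbit of a planet"))
print(search(INDEX, "vaccine for a patient"))
```

In this toy data the third section mixes both domains, so none of its terms lies within the assumed proximity of its macro-context and its micro-context is empty. It is therefore excluded from the medical query even though its macro-context alone would pass the similarity test.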
Specification