Concept-based searching of unstructured objects

  • US 7,890,514 B1
  • Filed: 12/05/2005
  • Issued: 02/15/2011
  • Est. Priority Date: 05/07/2001
  • Status: Expired due to Term
First Claim

1. A computer-readable storage medium, comprising code representing instructions to cause a processor to:

    identify a plurality of concepts present in an unstructured object present in a corpus of unstructured objects;

    define a Gaussian distribution representing a number of occurrences of each concept in the plurality of concepts present in the unstructured object;

    calculate a weighted value for a first concept from the plurality of concepts, the weighted value being based at least in part on at least one of:

    a number of occurrences of the first concept in the unstructured object;

    a ratio of a number of categories in which the first concept occurs to a total number of all categories;

    a ratio of a frequency of occurrence of the first concept in the unstructured object to a frequency of occurrence of the first concept in the corpus;

    or a ratio of a number of occurrences of the first concept in the unstructured object to a total number of all concepts, including the plurality of concepts, that occur in the unstructured object;

    determine that the weighted value is greater than a first threshold value and less than a second threshold value, the first threshold value being five or fewer standard deviations below a mean weighted value of the Gaussian distribution, the second threshold value being five or fewer standard deviations above the mean weighted value of the Gaussian distribution; and

    identify the first concept as a key concept associated with the unstructured object, the key concept representing a meaning of the unstructured object.
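
The steps recited in the claim can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the patented implementation: concepts are assumed to be already extracted as a flat list of strings, only the last weighting option (occurrences of a concept divided by all concept occurrences in the object) is used, and the Gaussian is approximated by the mean and population standard deviation of the weights. The names `concept_weights`, `key_concepts`, `k_low`, and `k_high` are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def concept_weights(doc_concepts):
    """Weight each concept by its share of all concept occurrences in the object."""
    counts = Counter(doc_concepts)
    total = sum(counts.values())
    return {concept: n / total for concept, n in counts.items()}


def key_concepts(doc_concepts, k_low=1.0, k_high=1.0):
    """Return concepts whose weight lies strictly between the two thresholds.

    k_low and k_high are the numbers of standard deviations below and above
    the mean weight; the claim limits both to five or fewer.
    """
    if not (0 <= k_low <= 5 and 0 <= k_high <= 5):
        raise ValueError("thresholds must be within five standard deviations")
    weights = concept_weights(doc_concepts)
    values = list(weights.values())
    mu = mean(values)        # mean weighted value
    sigma = pstdev(values)   # assumption: population standard deviation
    low, high = mu - k_low * sigma, mu + k_high * sigma
    return sorted(c for c, w in weights.items() if low < w < high)


# Hypothetical object: "the" dominates and falls above the upper threshold.
doc = ["engine"] * 4 + ["fuel"] * 3 + ["the"] * 12 + ["piston"]
print(key_concepts(doc))
```

With one standard deviation on each side, the very frequent concept "the" is excluded and the remaining concepts are kept as key concepts. Widening `k_high` toward five would admit it, which shows how the two thresholds bound the band of weights treated as meaningful.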

  • 8 Assignments