System and method for identifying passages in electronic documents

  • US 9,645,988 B1
  • Filed: 08/25/2016
  • Issued: 05/09/2017
  • Est. Priority Date: 08/25/2016
  • Status: Active Grant
First Claim

1. A method for searching an electronic document for passages relating to a concept being searched for, where the concept is expressed as a word or plurality of words, the method comprising:

  • deconstructing by a computer processor training electronic texts stored on a computer readable medium into a stream of features;

    storing the stream of features in a data store;

    wherein the features include the text of complete sentences, tokens used by the text in each sentence, the sequence of sentences, layout of text and typography of text;

    executing by a computer processor a conditional random field algorithm to label sentences in the electronic document as either being relevant to the concept being searched for (“State A”) or as background information (“State B”) based on the stream of features;

    executing by the computer processor a search algorithm which returns those sentences labelled as State A;

    wherein the conditional random field algorithm generates a probability of a sentence being relevant to State A;

    wherein the probability includes a tolerance for words or portions of words which cannot be resolved into computer-readable text;

    wherein, given a document containing multiple sentences S = {s1, s2, . . . , sm} and the corresponding concept label for each sentence Concept = {concept1, concept2, . . . , conceptm}, the conditional random field function defining the probability of the Concept applied to S, Pr(Concept|S), is expressed as;
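
The steps recited in the claim (building a stream of features, labelling each sentence as State A or State B with a conditional random field, and returning the State A sentences) can be illustrated with a minimal sketch. It does not reproduce the claimed expression for Pr(Concept|S); it assumes the textbook linear-chain CRF form and computes per-sentence marginals with the forward-backward algorithm. The function names, the feature set, the hand-set weights, and the 0.5 threshold are all hypothetical choices made for illustration. A real system would learn the weights from the training texts.

```python
import math
import re

STATES = ("A", "B")  # A = relevant to the concept, B = background


def extract_features(sentences, concept_words):
    """Turn each sentence into a feature dict (a simple 'stream of features')."""
    stream = []
    for position, text in enumerate(sentences):
        tokens = re.findall(r"[a-z]+", text.lower())
        stream.append({
            "text": text,
            "tokens": tokens,
            "position": position,  # keeps the sentence sequence
            "concept_hits": sum(t in concept_words for t in tokens),
            "is_heading": text.isupper(),  # crude stand-in for typography
        })
    return stream


def emission_score(features, state):
    """Hypothetical hand-set weights; a trained CRF would learn these."""
    score = 3.0 * features["concept_hits"] + 0.5 * features["is_heading"] - 1.0
    return score if state == "A" else -score


def transition_score(prev_state, state):
    """Favor staying in the same state so passages come out contiguous."""
    return 0.5 if prev_state == state else -0.5


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def marginal_prob_a(stream):
    """Forward-backward: P(label_i = A | all sentences) for each sentence."""
    if not stream:
        return []
    n = len(stream)
    emit = [{s: emission_score(f, s) for s in STATES} for f in stream]
    fwd = [dict(emit[0])]
    for i in range(1, n):
        fwd.append({
            s: emit[i][s]
            + logsumexp([fwd[i - 1][p] + transition_score(p, s) for p in STATES])
            for s in STATES
        })
    bwd = [{s: 0.0 for s in STATES} for _ in range(n)]
    for i in range(n - 2, -1, -1):
        bwd[i] = {
            s: logsumexp([
                transition_score(s, nxt) + emit[i + 1][nxt] + bwd[i + 1][nxt]
                for nxt in STATES
            ])
            for s in STATES
        }
    log_z = logsumexp([fwd[-1][s] for s in STATES])
    return [math.exp(fwd[i]["A"] + bwd[i]["A"] - log_z) for i in range(n)]


def search(sentences, concept_words, threshold=0.5):
    """Return the sentences whose State A probability exceeds the threshold."""
    stream = extract_features(sentences, concept_words)
    probs = marginal_prob_a(stream)
    return [f["text"] for f, p in zip(stream, probs) if p > threshold]
```

For example, calling `search(doc, {"rent"})` on a short lease-like document returns only the sentences that mention rent, because the transition scores are too weak to pull the neighbouring background sentences into State A.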
